Sound source separation for musical pedagogy

Keywords: Machine learning, Sound source separation, Sheet music generation, Web application

Abstract

Harmonics aims to support musical pedagogy by offering a concrete product with which people learning to play an instrument can practice. Using TensorFlow, we trained a model to identify and isolate the individual tracks of a song, and built tools that separate the audio sources and generate sheet music through a musical transcription algorithm (specifically for piano, bass, drums, and voice). Beginners can view, edit, and download the resulting sheet music (in PDF and MIDI formats) and work through it at their own pace. Three source separation methods were considered under the following constraints: the input is a single song file; the song is moderately complex (between three and six instruments); and the samples suitable for model training (songs containing the relevant instruments together with an isolated track for each instrument) are extremely scarce.

How to Cite
Lancheros-Molano, R. D., Triana-Perez, J. S., Castañeda-Chaparro, J. F., Gutiérrez-Naranjo, F. A., & Rueda-Olarte, A. del P. (2021). Sound source separation for musical pedagogy. Revista Colombiana De Computación, 22(1), 22–33. https://doi.org/10.29375/25392115.4151

Published
2021-06-01
Section
Article of scientific and technological research

Some similar items: