Meetup #7: Machine Learning and Music


Details
Music is part of our everyday lives. Beyond its importance in society, it is a vibrant research field and a playground for technological trends across a variety of tasks (e.g., tempo estimation, music synchronization, or retrieval). In this meetup, we introduce a selection of these tasks and approaches that apply recent machine learning concepts, such as deep learning, to the interdisciplinary domain of music.
Talk 1: Musical Tempo Estimation with Convolutional Neural Networks
Hendrik Schreiber, International Audio Laboratories Erlangen (Fraunhofer IIS/Universität Erlangen-Nürnberg)
Global musical tempo estimation is a well-established task in music information retrieval (MIR) research. While traditional systems typically first identify onsets or beats and then derive a tempo, our proposed system estimates the tempo directly from a conventional Mel-spectrogram, using a single convolutional neural network (CNN). The CNN approach performs as well as or better than other state-of-the-art algorithms. In particular, exact tempo estimation without tempo-octave confusion is significantly improved. Furthermore, the same approach can be used for local tempo estimation. The talk focuses on how to design such a network by drawing inspiration from a traditional signal-processing-based approach and translating it into a fully convolutional neural network (FCN).
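The idea of classifying tempo directly from a Mel-spectrogram with a convolutional network can be sketched, in heavily reduced form, as follows. This is a minimal illustrative sketch only: the shapes, kernel sizes, tempo-bin count, and random weights are assumptions for the demo, not the architecture presented in the talk.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def estimate_tempo_class(mel_spec, conv_kernels, class_weights):
    """Toy fully convolutional sketch: slide each kernel along the time
    axis, apply ReLU, global-average-pool, then map to tempo bins."""
    # mel_spec: (n_mels, n_frames); conv_kernels: (n_filters, n_mels, k)
    n_filters, _, k = conv_kernels.shape
    n_frames = mel_spec.shape[1]
    feats = np.zeros(n_filters)
    for f in range(n_filters):
        # "valid" correlation along time, spanning the full mel axis
        acts = [np.sum(mel_spec[:, t:t + k] * conv_kernels[f])
                for t in range(n_frames - k + 1)]
        feats[f] = np.maximum(acts, 0).mean()  # ReLU + global average pool
    return softmax(class_weights @ feats)      # distribution over tempo bins

# Hypothetical usage with random weights (a real model would be trained)
rng = np.random.default_rng(0)
mel = rng.random((40, 128))              # 40 mel bands, 128 time frames
kernels = rng.standard_normal((8, 40, 5))
weights = rng.standard_normal((10, 8))   # 10 illustrative tempo bins
probs = estimate_tempo_class(mel, kernels, weights)
```

Because the network is fully convolutional along time, the same weights could in principle be applied windowed over a longer recording, which is what makes local tempo estimation with the same approach plausible.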
Hendrik Schreiber is a Ph.D. candidate at the International Audio Laboratories Erlangen, working with Meinard Müller. His research focuses on MIR applications such as audio fingerprinting, segmentation, key detection, and tempo estimation using digital signal processing and machine learning techniques. Being self-employed, he is also the author of "beaTunes", software that lets music enthusiasts put his algorithms to work in real life.
Talk 2: Cross-modal Retrieval in Music with Embedding Space Learning
Dr.-Ing. Stefan Balke, pmOne Analytics
Connecting large libraries of digitized audio recordings to their corresponding sheet music images has long been a motivation for researchers to develop new cross-modal retrieval systems. In recent years, retrieval systems based on embedding space learning with deep neural networks have come a step closer to fulfilling this vision. In this talk, we give a brief introduction to the task of cross-modal retrieval in music and the applied learning techniques. Furthermore, we present a recently published extension based on an additional softmax attention mechanism, which helps to cope with tempo deviations in the music signals.
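The core retrieval idea can be sketched very simply: modality-specific encoders map audio excerpts and sheet music snippets into a shared embedding space, where matching pairs should end up close together and retrieval reduces to a nearest-neighbor search. The linear "encoders", dimensions, and random data below are illustrative assumptions, not the system from the talk.

```python
import numpy as np

def embed(x, W):
    """Stand-in for a learned modality-specific encoder: a linear
    projection into the shared space, followed by L2 normalization."""
    z = W @ x
    return z / np.linalg.norm(z)

def retrieve(audio_query, W_audio, sheet_snippets, W_sheet):
    """Rank sheet-music snippets by cosine similarity to the audio
    query in the shared space; return the best-matching index."""
    q = embed(audio_query, W_audio)
    sims = [float(q @ embed(s, W_sheet)) for s in sheet_snippets]
    return int(np.argmax(sims)), sims

# Hypothetical usage with random projections (real encoders are trained
# so that matching audio/sheet pairs land close together).
rng = np.random.default_rng(1)
W = rng.standard_normal((16, 32))   # shared projection, demo only
audio = rng.random(32)
snippets = [rng.random(32) for _ in range(5)]
snippets[2] = audio.copy()          # plant the true match at index 2
best, sims = retrieve(audio, W, snippets, W)
```

In this toy setup the planted match is retrieved because identical inputs under the same projection have cosine similarity 1; the attention extension mentioned above addresses the harder real-world case where tempo deviations stretch the audio relative to the score.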
Stefan Balke studied electrical engineering at Leibniz Universität Hannover, graduating in 2013. In early 2018, he completed his PhD in the Semantic Audio Signal Processing Group at the International Audio Laboratories Erlangen. Afterwards, he joined the Institute of Computational Perception at Johannes Kepler University Linz as a postdoc. His research interests include music information retrieval, machine learning, and multimedia retrieval. Since 2019, he has been working as a data scientist at pmOne in Paderborn.
Talk 3: Bacher than Bach? On Musicologically Informed AI-based Bach Chorale Harmonization
Alexander Leemhuis
Writing chorales in the style of Bach has been a music theory exercise for generations of music students. As such, it is not surprising that automatic Bach chorale harmonization has been a topic in music technology for decades. We present a deep learning solution based on musicological insights into human choral composition practices. Evaluations with expert listeners show that the generated chorales closely resemble Bach's harmonization style.
Alexander Leemhuis started his Tonmeister studies at the Detmold University of Music in 2015. Since 2016, he has worked as a student assistant at the “Center of Music and Film Informatics”, where he finished his Bachelor’s thesis on AI-based Bach chorale harmonization in 2019. Currently, he studies music and mathematics with the aim of becoming a teacher.