Teaching Computers to Read Music - Jan Hajič
Optical Music Recognition (OMR) is a field of research that attempts to computationally read music notation. Its users range from librarians and musicologists to active musicians and composers. OMR is a difficult problem that defies the analogy to its much more mature cousin, OCR, for several reasons. The main one is the featural nature of music notation itself, which is in principle distinct from all systems used to graphically capture natural languages. Furthermore, OMR is expected to produce not only a logical description of the music notation document itself, but also to infer the musical semantics encoded by the notation.
Machine learning — and specifically deep learning techniques developed in computer vision — is a natural fit for dealing with many of these complexities, especially with respect to the input. In this talk, I will present significant recent contributions to OMR — both with respect to underlying work that makes it possible to formulate OMR as a machine learning task, and to the machine learning aspects themselves.
Introduction to Speech Processing for Voice Assistants - Ondřej Plátek
What speech processing tasks are needed for spoken voice assistants? In the talk, we will review tasks such as incremental ASR, voice-activity detection, end-pointing, speaker recognition, diarization, beam-forming, language modeling, and inverse text normalization.
The talk should give you an introduction to the field of speech processing by presenting a zoo of tasks. The machine-learning tasks will be motivated and introduced through their role in voice assistants.
We will briefly cover the latest architectures for some of the tasks, but we will focus on what the state-of-the-art results mean for our high human expectations of speech understanding.
- 17:45 - 18:00 - Your arrival
- 18:00 - 18:40 - Teaching Computers to Read Music
- 18:40 - 18:50 - Short break
- 18:50 - 19:30 - Introduction to Speech Processing for Voice Assistants
- 19:30 - 22:00 - Networking in Bitcoin Coffee
Jan Hajič jr. is a doctoral student at the Institute of Formal and Applied Linguistics, part of the Faculty of Mathematics and Physics of Charles University. He has spent the last four years trying to teach computers how to read music, in a dissertation topic called Optical Music Recognition, because he was annoyed by having to transcribe music manually during his previous composition studies at the Janáček Academy of Music and Performing Arts. As part of this quest, he has become an active member of the International Society for Music Information Retrieval (ISMIR): presenting at the last three ISMIR conferences, organizing the WoRMS workshop on systems that read music, and collaborating on this topic with researchers from Austria, Spain, and Canada. Besides work on OMR, he has also led the Charles University music generation team in collaboration with Neuron Soundware, with results presented at Microsoft’s DOTS 2018 event, and he supervises several student theses on the topic of music processing at the faculty. He also actively plays the harpsichord, which he now studies at the Academy of Early Music in Brno with Monika Knoblochová, and other keyboard instruments, regularly performing with students and graduates of the Prague Conservatory and more.
Ondřej Plátek is a developer and researcher fascinated by machine learning approaches to natural language processing.
Over the last six years, he has helped improve several speech recognition and dialogue systems.
During his Master's and Ph.D. studies, he contributed to the open-source toolkits Kaldi and Alex SDS.
He interned twice with Apple's Siri team and spent the last two years at the Czech hardware startup Angee.
In summer 2018, he decided to focus on his newly founded company, Oplatai, which provides consulting on speech and audio processing.