DataTalks #25: Productizing DS w/Kubeflow & Kale + Translating Musical Genres

Name: DataTalks #25: Productizing DS w/Kubeflow & Kale + Translating Musical Genres
Start: 2020-06-02T18:00:00+03:00
End: 2020-06-02T20:00:00+03:00

Hosted by Shay Palachy A.

DataHack - Data Science, Machine Learning & Statistics

Details

DataTalks #25: Productizing DS w/ Kubeflow + Translating Music Genres

Our 25th DataTalks meetup is held in cooperation with Team8 and will take place online. We'll talk about migrating data science research to production using Kubernetes, Kubeflow Pipelines & Kale, and about translating music genres and voices.

𝗭𝗼𝗼𝗺 𝗹𝗶𝗻𝗸: https://zoom.us/j/92086149175?pwd=ZzZ1dmlmaERLYlphUC9jVGFjSlU1QT09

𝗔𝗴𝗲𝗻𝗱𝗮:
🔵 18:00 - 18:05 - Introduction - Tom Sela, Director of Research at Team8
🔶 18:05 - 18:50 - Migrating Data Science Research to Production Using Kubernetes, Kubeflow Pipelines and Kale - Amit Ripshtos, Senior Software Engineer at noogata
🔴 18:50 - 19:35 - Translation of Music Genres and Voices - Adam Polyak, AI Research Engineer at Facebook

---------------------

𝗠𝗶𝗴𝗿𝗮𝘁𝗶𝗻𝗴 𝗗𝗮𝘁𝗮 𝗦𝗰𝗶𝗲𝗻𝗰𝗲 𝗥𝗲𝘀𝗲𝗮𝗿𝗰𝗵 𝘁𝗼 𝗣𝗿𝗼𝗱𝘂𝗰𝘁𝗶𝗼𝗻 𝗨𝘀𝗶𝗻𝗴 𝗞𝘂𝗯𝗲𝗿𝗻𝗲𝘁𝗲𝘀, 𝗞𝘂𝗯𝗲𝗳𝗹𝗼𝘄 𝗣𝗶𝗽𝗲𝗹𝗶𝗻𝗲𝘀 𝗮𝗻𝗱 𝗞𝗮𝗹𝗲 - 𝗔𝗺𝗶𝘁 𝗥𝗶𝗽𝘀𝗵𝘁𝗼𝘀, 𝗦𝗲𝗻𝗶𝗼𝗿 𝗦𝗼𝗳𝘁𝘄𝗮𝗿𝗲 𝗘𝗻𝗴𝗶𝗻𝗲𝗲𝗿 𝗮𝘁 𝗻𝗼𝗼𝗴𝗮𝘁𝗮

At noogata, a crucial part of delivering an end-to-end data science / AI-based solution is turning research into a finished product. We quickly found that researchers' results are not suitable for production use out of the box. To 'productize' a solution, a data-processing pipeline needs to be established: research code is taken out of Jupyter notebooks and arranged into a processing pipeline that we can easily repeat and reproduce.

The main technologies we work with are Kubernetes (container orchestration infrastructure) and Kubeflow Pipelines (a workflow engine that runs natively on Kubernetes). We chose them because our data pipelines (various ETLs, model training) need to scale up and down quite a bit, and we want things to work on any infrastructure.

In this talk, I will describe the process we built for 'productionizing' research projects using Kubernetes, Kubeflow Pipelines and tools like Kale. I will explain how we use Kubeflow Pipelines in the research phase and how they work for production as well, and I will dive deep into their benefits and disadvantages.
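
To make the idea of a repeatable pipeline concrete, here is a minimal sketch in plain Python. It is not the Kubeflow Pipelines or Kale API and not noogata's code. The `Pipeline` class, the step names (`load`, `clean`, `train`) and the `min_value` parameter are all hypothetical, chosen only to show notebook-style code split into named steps with explicit dependencies, so the whole flow can be re-run with the same parameters.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Pipeline:
    """Toy pipeline: named steps plus the steps each one depends on."""

    def __init__(self):
        self._steps = {}  # step name -> function
        self._deps = {}   # step name -> names of upstream steps

    def step(self, name, after=()):
        """Decorator that registers a function as a pipeline step."""
        def register(func):
            self._steps[name] = func
            self._deps[name] = tuple(after)
            return func
        return register

    def run(self, params):
        """Run every step in dependency order and return all outputs."""
        results = {}
        for name in TopologicalSorter(self._deps).static_order():
            # Each step sees only the outputs of the steps it declared.
            inputs = {dep: results[dep] for dep in self._deps[name]}
            results[name] = self._steps[name](params, inputs)
        return results


pipeline = Pipeline()


@pipeline.step("load")
def load(params, inputs):
    # Stand-in for reading a dataset (hypothetical data).
    return [1.0, 2.0, 3.0, 4.0]


@pipeline.step("clean", after=["load"])
def clean(params, inputs):
    # Stand-in for an ETL step: drop values below a threshold.
    return [x for x in inputs["load"] if x >= params["min_value"]]


@pipeline.step("train", after=["clean"])
def train(params, inputs):
    # Stand-in for model training: the "model" is just the mean.
    data = inputs["clean"]
    return {"mean": sum(data) / len(data)}


results = pipeline.run({"min_value": 2.0})
print(results["train"])
```

Because every step receives its parameters and upstream outputs explicitly, calling `pipeline.run` again with the same parameters repeats the same computation. In a real setup each step would typically run in its own container, which is the part a workflow engine on Kubernetes takes care of.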

---------------------

𝗧𝗿𝗮𝗻𝘀𝗹𝗮𝘁𝗶𝗼𝗻 𝗼𝗳 𝗠𝘂𝘀𝗶𝗰 𝗚𝗲𝗻𝗿𝗲𝘀 𝗮𝗻𝗱 𝗩𝗼𝗶𝗰𝗲𝘀 - 𝗔𝗱𝗮𝗺 𝗣𝗼𝗹𝘆𝗮𝗸, 𝗔𝗜 𝗥𝗲𝘀𝗲𝗮𝗿𝗰𝗵 𝗘𝗻𝗴𝗶𝗻𝗲𝗲𝗿 𝗮𝘁 𝗙𝗮𝗰𝗲𝗯𝗼𝗼𝗸

In this talk, we will present two methods:
i) A method for translating music across musical instruments and styles.
ii) A wav-to-wav method for converting between speakers’ voices, without relying on text.

The first is based on unsupervised training of a multi-domain WaveNet autoencoder that is trained end-to-end on waveforms. Employing a diverse training dataset and a large network capacity also allows us to translate from musical domains that were not seen during training.

The second method is based on an encoder-decoder architecture, where the encoder is pre-trained for the task of Automatic Speech Recognition (ASR), and a multi-speaker waveform decoder is trained to reconstruct the original signal. The modularity of our approach, which separates the target voice generation from the Text To Speech (TTS) module, enables the customization of existing TTS services.
