Past Meetup

Machine Learning Advances in Speech Synthesis

24 people went

Estuari

333 Broadway · Troy, NY

How to find us

Estuari coworking space at One Hudson Ventures. It is on the 3rd floor of the same building as the Tech Valley Center of Gravity. Enter on the Broadway side, at the door near Nibble Inc, and take the elevator up.

Details

Welcome back from summer! Join us for the first meetup of the fall to discuss recent advances in speech synthesis (the artificial generation of human speech) using machine learning. I'll lead an interactive exploration of the most popular speech and audio synthesis technologies and approaches, with demos of them in action and details of how they work and how they compare to each other and to previous work in the field.

Some of the projects we'll review are:

- Apple's Siri deep learning voice generation (https://machinelearning.apple.com/2017/08/06/siri-voices.html)
- Google Tacotron (https://google.github.io/tacotron/)
- DeepMind WaveNet (https://deepmind.com/blog/wavenet-generative-model-raw-audio/)
- Baidu Deep Voice (http://research.baidu.com/deep-voice-production-quality-text-speech-system-constructed-entirely-deep-neural-networks/)
- Baidu Deep Voice 2 (http://research.baidu.com/deep-voice-2-multi-speaker-neural-text-speech/)

We'll also dabble in how to use TensorFlow and PyTorch to explore these approaches.

As time allows, we will also share experiences with commercial offerings such as the Amazon Polly text-to-speech service (https://aws.amazon.com/polly/) and the IBM Watson Text to Speech service (https://www.ibm.com/watson/services/text-to-speech/).

We will also spend a few moments discussing plans for future meetups and gathering input from everyone on the focus areas they most want covered.

We thank Matthew Lean and OneHudson Ventures for letting us use the Estuari coworking space.