Audio Query-by-Example via Unsupervised Embeddings
Details
This month, John Vinyard will cover a query-by-example audio search engine. Semantic embeddings of short audio segments are learned by pushing segments that occur close together in time toward each other in the embedding space. Embeddings are also constrained to remain nearby when various deformations are applied (time stretch, pitch shift, and additive noise), mirroring data augmentation techniques used in image recognition. This approach to learning unsupervised embeddings is inspired by, and similar to, techniques described in the paper Unsupervised Learning of Semantic Audio Representations. Finally, random projections are used to hash the dense embeddings into bit strings, allowing for fast, approximate k-nearest-neighbor searches using bitwise Hamming distance.
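To make the training objective concrete, here is a minimal sketch (not John's actual implementation) of a triplet-style hinge loss, where the positive example stands in for a temporally adjacent or deformed version of the anchor segment and the negative is an unrelated segment. All names and the margin value are illustrative assumptions:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Hinge loss that pulls the positive (a temporally-near or deformed
    segment) toward the anchor and pushes the negative at least `margin`
    farther away. All inputs are embedding vectors."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

rng = np.random.default_rng(0)
anchor = rng.normal(size=128)
# Hypothetical "deformed" positive: a slightly perturbed copy, standing in
# for a pitch-shifted or noise-augmented version of the same segment.
positive = anchor + 0.05 * rng.normal(size=128)
negative = rng.normal(size=128)  # unrelated segment

print(triplet_loss(anchor, positive, negative))
```

Training a network to minimize this loss over many such triplets is what pushes semantically similar segments together in the embedding space.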
You can check out an implementation at: https://cochlea.xyz/
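The random-projection hashing step can be sketched in plain NumPy: each embedding is projected onto a set of random hyperplanes, the sign of each projection becomes one bit, and lookup ranks stored items by bitwise Hamming distance. Function names and parameters here are illustrative assumptions, not taken from the cochlea.xyz codebase:

```python
import numpy as np

def hash_embeddings(embeddings, n_bits=32, seed=0):
    """Project dense embeddings (n, d) onto n_bits random hyperplanes;
    each sign bit becomes one bit of the hash (locality-sensitive hashing)."""
    rng = np.random.default_rng(seed)
    planes = rng.normal(size=(embeddings.shape[1], n_bits))
    return (embeddings @ planes > 0).astype(np.uint8)  # (n, n_bits) bit matrix

def hamming_knn(query_bits, index_bits, k=3):
    """Return indices of the k stored hashes closest to the query
    in bitwise Hamming distance."""
    dists = np.count_nonzero(index_bits != query_bits, axis=1)
    return np.argsort(dists)[:k]

rng = np.random.default_rng(1)
db = rng.normal(size=(100, 64))  # pretend these are learned embeddings
index = hash_embeddings(db)
# A slightly perturbed copy of item 42 should typically hash to a nearby
# bit string, so item 42 should rank at or near the top.
query = db[42] + 0.01 * rng.normal(size=64)
print(hamming_knn(hash_embeddings(query[None, :]), index))
```

Because nearby embeddings tend to fall on the same side of most random hyperplanes, their bit strings agree in most positions, which is what makes Hamming distance a fast stand-in for exact nearest-neighbor search.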
Agenda:
6:30 - 7:00: Sign In/Grab some food/Announcements
7:00 - 8:00: John's Talk
8:00: Head on over to CU29
After the talk, we'll head over to CU29 (across the street from the Omni) for networking and deep learning discussions.
Metered parking is free downtown after 6pm on Tuesdays.
About our host:
Capital Factory (https://capitalfactory.com/) is the entrepreneurial center of gravity in Austin, Texas. Located in the middle of downtown, Capital Factory has more than 60,000 square feet full of startups and entrepreneurs. Take classes to learn the skills that startups need, attend meetups to find a co-founder, rent a desk for your startup or apply for funding and mentorship in the incubator and accelerator.
