What is a language model… what is it not? What is speech… what can it be?

Hosted By
Robert (Munro) M.

Details

Lily Clifford, the CEO and founder of Rime (rime.ai), will join us at Bay Area NLP and bring a voice that is rare amid today's flood of media about large language models: the voice of expertise.

Lily has studied in the world's leading labs for both machine learning and sociolinguistic variation, while also grounding that expertise in the realities of shipping real-world products.

Summary

Through architectural innovation and sheer size, language models have become increasingly capable and increasingly productized. They are bigger than ever, including in the public imagination. Yet the term 'language model' is as misleading as it is compelling. ELMo, BERT, and GPT-like architectures invariably operate over a derivative of language that is itself technological: text. For many years, text has operated culturally (and therefore technologically) at the level of metonymy, often standing in as a substitute for language itself. What are the alternatives?

In this talk, I compare hypothetical 'large text models' and 'large speech models' and the particular technological and commercial opportunities that each presents. Speech, although arguably also a fundamentally technological externalization of language, contains a density of information about who is speaking and what is being said that makes text look hollow in comparison. The difficulty of modeling this density of meanings is certainly one reason why generative text tooling has outpaced generative speech tooling in the last decade.

I will give a high-level overview of what we’re building at Rime: text-to-speech technologies that make the compelling diversity of sociolinguistic variation within and across languages available to users.

Bio:

Lily is a co-founder and CEO of Rime, focused on building highly flexible text-to-speech tooling for customer-facing enterprise technology use cases (democratized contact centers and personalized advertising). Prior to founding Rime, she was a PhD student in Stanford's NLP group, where her research focused on the intersection of computational speech processing and sociophonetic variation.

Bay Area NLP (Natural Language Processing)