Munich🥨NLP - Online: How Phi Silica Enables Efficient On-Device Paraphrasing

Hosted By
Munich N. and Katya A.
Details

In this talk, Marat Saidov will discuss advances in building efficient on-device language models optimized for NPUs, covering techniques such as memory-mapped embeddings, KV caching, 4-bit quantization, and speculative decoding. A key focus is Rewrite, Microsoft’s publicly available paraphrasing skill: its data collection strategies, its evaluation metrics built on LLM-as-a-judge, and adapters such as LoRA. He will also examine the role of system prompts and soft prompts and how they compare with LoRA in effectiveness. Finally, he will share insights on deploying compact models at scale, practical lessons learned, and the challenges that lie ahead.

Marat Saidov is a Senior Software Engineer in the Applied Sciences Group at Microsoft, based in Belgrade, Serbia. He previously improved the Speech Recognition and Natural Language Understanding services of the Alice voice assistant at Yandex. He was also an NLP Research Assistant at HSE University, Russia.

More info:
https://munich-nlp.com/events/marat-saidov-phi/

Online event
Link visible for attendees
FREE