Maximize Voice-Agent (TTS) Performance + Auto-Profile/Optimize PyTorch/CUDA Code

Hosted By
Chris F.

Details

Zoom link: https://us02web.zoom.us/j/82308186562

Talk #0: Introductions and Meetup Updates
by Chris Fregly and Antje Barth

Talk #1: Everything You Need to Know About Reducing Voice-Agent Latency (by Philip Kiely @ Baseten)
Rolling your own optimized voice agent introduces hard problems at each layer of the stack. In this talk, Philip will provide an overview of the runtime optimizations, infrastructure setup, and client code required to get consistently low latencies for voice at scale.

Talk #2: PyTorch Profiling That Actually Tells You What to Fix (by Emilio Andere @ Herdora)
Automate PyTorch profiler analysis by tracing bottlenecks to their root causes, including kernel memory patterns, tensor layouts, and missing fusions, and mapping each one to a specific code fix.

Talk #3: Auto-Optimizing PyTorch and CUDA Code (by Chris Fregly)
Automate PyTorch and CUDA performance optimizations for all environments, including GPUs.

Related Links
GitHub Repo: http://github.com/cfregly/ai-performance-engineering/
O'Reilly Book: https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/
YouTube: https://www.youtube.com/@AIPerformanceEngineering
Generative AI Free Course on DeepLearning.ai: https://bit.ly/gllm

AI Performance Engineering Meetup (San Francisco2)

Every 3rd Monday of the month until November 26, 2025

Online event