Maximize Voice-Agent (TTS) Performance + Auto-Profile/Optimize PyTorch/CUDA Code


Details
Zoom link: https://us02web.zoom.us/j/82308186562
Talk #0: Introductions and Meetup Updates (by Chris Fregly and Antje Barth)
Talk #1: Everything You Need to Know About Reducing Voice-Agent Latency (by Philip Kiely @ Baseten)
Rolling your own optimized voice agent introduces hard problems at each layer of the stack. In this talk, Philip will provide an overview of the runtime optimizations, infrastructure setup, and client code required to get consistently low latencies for voice at scale.
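To make "consistently low latencies" concrete, here is a minimal sketch of a per-turn latency budget. It assumes a pipeline of speech-to-text, LLM, text-to-speech, and network stages; the stage names and all numbers are hypothetical illustrations, not figures from the talk. It shows why tail percentiles matter more than the median, and how to find the stage that dominates the slowest turn.

```python
import math

# Hypothetical per-turn latencies in milliseconds, one dict per conversation turn.
# Stage names and values are illustrative assumptions, not measurements.
turns = [
    {"stt": 120, "llm_first_token": 210, "tts_first_byte": 90, "network": 40},
    {"stt": 110, "llm_first_token": 190, "tts_first_byte": 85, "network": 35},
    {"stt": 130, "llm_first_token": 600, "tts_first_byte": 95, "network": 45},
    {"stt": 115, "llm_first_token": 205, "tts_first_byte": 88, "network": 42},
    {"stt": 125, "llm_first_token": 220, "tts_first_byte": 92, "network": 38},
]


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# End-to-end latency per turn is the sum of the stage latencies.
totals = [sum(turn.values()) for turn in turns]

p50 = percentile(totals, 50)
p95 = percentile(totals, 95)

# Identify which stage dominates the slowest turn.
worst_turn = max(turns, key=lambda turn: sum(turn.values()))
worst_stage = max(worst_turn, key=worst_turn.get)

print(f"p50={p50} ms, p95={p95} ms, slowest stage in worst turn: {worst_stage}")
```

In this made-up data the median turn looks healthy, while a single slow LLM response sets the tail; tracking p95 or p99 per stage is what exposes that kind of outlier.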
Talk #2: PyTorch Profiling That Actually Tells You What to Fix (by Emilio Andere @ Herdora)
Automate PyTorch profiler analysis by tracing bottlenecks to their root causes (kernel memory patterns, tensor layouts, and missing fusions) and mapping each one to a specific code fix.
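For context, the manual starting point that this kind of analysis builds on is the standard `torch.profiler` workflow. The sketch below is not the speaker's tool; it assumes a small hypothetical model and simply collects a trace and prints the most expensive operators, leaving the interpretation to the reader.

```python
import torch
from torch.profiler import ProfilerActivity, profile

# Hypothetical toy model; it stands in for whatever code is being profiled.
model = torch.nn.Sequential(
    torch.nn.Linear(256, 512),
    torch.nn.ReLU(),
    torch.nn.Linear(512, 10),
)
inputs = torch.randn(32, 256)

# Always profile CPU activity; add CUDA only when a GPU is available.
activities = [ProfilerActivity.CPU]
if torch.cuda.is_available():
    model, inputs = model.cuda(), inputs.cuda()
    activities.append(ProfilerActivity.CUDA)

# record_shapes=True attaches input shapes to each operator event.
with torch.no_grad(), profile(activities=activities, record_shapes=True) as prof:
    model(inputs)

# Aggregate events by operator name and show the ten most expensive by CPU time.
events = prof.key_averages()
print(events.table(sort_by="cpu_time_total", row_limit=10))

# Operator names are available for further, custom analysis.
op_names = [event.key for event in events]
```

The printed table lists operators such as `aten::linear` with their time totals; deciding which rows point to a layout problem or a missing fusion is the manual step the talk proposes to automate.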
Talk #3: Auto-Optimizing PyTorch and CUDA Code (by Chris Fregly)
Automate PyTorch and CUDA performance optimizations for all environments including GPUs.
Related Links
GitHub Repo: http://github.com/cfregly/ai-performance-engineering/
O'Reilly Book: https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/
YouTube: https://www.youtube.com/@AIPerformanceEngineering
Free Generative AI Course on DeepLearning.AI: https://bit.ly/gllm

Every 3rd Monday of the month until November 26, 2025