Name: Maximize Voice-Agent (TTS) Performance + Auto-Profile/Optimize PyTorch/CUDA Code
Start: 2025-08-18T09:00:00-07:00
End: 2025-08-18T10:00:00-07:00

**Zoom link**: [https://us02web.zoom.us/j/82308186562](https://us02web.zoom.us/j/82308186562)

**Talk #0: Introductions and Meetup Updates**
by Chris Fregly and Antje Barth

**Talk #1: Everything You Need to Know About Reducing Voice-Agent Latency (by Philip Kiely @ Baseten)**
Rolling your own optimized voice agent introduces hard problems at each layer of the stack. In this talk, Philip will provide an overview of the runtime optimizations, infrastructure setup, and client code required to get consistently low latencies for voice at scale.

**Talk #2: PyTorch Profiling That Actually Tells You What to Fix (by Emilio Andere @ Herdora)**
Automate PyTorch profiler analysis by tracing bottlenecks to their root causes, including kernel memory patterns, tensor layouts, and missing fusions, and mapping them to specific code fixes.

**Talk #3: Auto-Optimizing PyTorch and CUDA Code (by Chris Fregly)**
Automate PyTorch and CUDA performance optimizations for all environments, including GPUs.

**Related Links**
GitHub Repo: [http://github.com/cfregly/ai-performance-engineering/](http://github.com/cfregly/ai-performance-engineering/)
O'Reilly Book: [https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/](https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/)
YouTube: [https://www.youtube.com/@AIPerformanceEngineering](https://www.youtube.com/@AIPerformanceEngineering)
Free Generative AI Course on DeepLearning.AI: [https://bit.ly/gllm](https://bit.ly/gllm)

Chris Fregly

AI Performance Engineering Meetup (San Francisco2)

Every 3rd Monday of the month until November 26, 2025

Online event
