GPU, CUDA, and PyTorch Performance Optimizations
Details
Zoom link: https://us02web.zoom.us/j/82308186562
Talk #0: Introductions and Meetup Updates
by Chris Fregly and Antje Barth
Talk #1: Optimizing AI Inference for Heterogeneous Clusters by Natalie Serrino, Founder @ Gimlet Labs
This talk will cover the performance benefits and technical challenges of deploying inference workloads across heterogeneous hardware. This approach is a good fit for agentic workloads because agents are inherently heterogeneous, and combining GPUs with SRAM-centric architectures yields major speedups within the same power envelope. The challenges: you have to figure out how to slice workloads across all of this hardware, orchestrate them, get the devices to communicate with each other, and develop performant code for each target platform.
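To make the idea concrete, here is a minimal sketch of one way such a scheduler might slice a workload across heterogeneous devices. This is purely illustrative and hypothetical, not Gimlet Labs' actual system: the device specs, stage names, and the greedy "prefer the device whose on-chip SRAM fits the stage's weights" heuristic are all assumptions.

```python
# Hypothetical sketch: greedy placement of inference stages across
# heterogeneous devices. NOT the speaker's actual system.

from dataclasses import dataclass


@dataclass
class Device:
    name: str
    tflops: float   # assumed peak compute, TFLOP/s
    sram_mb: float  # assumed on-chip memory, MB


@dataclass
class Stage:
    name: str
    weights_mb: float  # memory footprint of this stage's weights, MB


def assign(stages, devices):
    """Greedy placement: prefer devices whose SRAM can hold the stage's
    weights (avoiding off-chip memory traffic); among candidates, pick
    the one with the most compute. Fall back to the fastest device."""
    plan = {}
    for s in stages:
        fits = [d for d in devices if d.sram_mb >= s.weights_mb]
        pool = fits or devices
        plan[s.name] = max(pool, key=lambda d: d.tflops).name
    return plan


# Toy cluster: a GPU plus an SRAM-centric accelerator (made-up numbers).
devices = [
    Device("gpu-0", tflops=300.0, sram_mb=50.0),
    Device("sram-accel-0", tflops=120.0, sram_mb=800.0),
]
stages = [
    Stage("embedding", weights_mb=40.0),  # small: fits either device
    Stage("decoder", weights_mb=600.0),   # large: only fits the accelerator
]

plan = assign(stages, devices)
# The small stage lands on the fastest device; the large stage lands on
# the device with enough on-chip memory.
```

A real orchestrator would also model interconnect bandwidth between devices and per-platform kernel performance, which is exactly the complexity the talk addresses.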
Speaker: Natalie Serrino, Founder @ Gimlet Labs (https://www.linkedin.com/in/natalieserrino/ @ https://gimletlabs.ai/)
Talk #2: GPU, PyTorch, and CUDA Performance Optimizations
Related Links
GitHub Repo: http://github.com/cfregly/ai-performance-engineering/
O'Reilly Book: https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/
YouTube: https://www.youtube.com/@AIPerformanceEngineering
Generative AI Free Course on DeepLearning.ai: https://bit.ly/gllm
