
What we’re about
This meetup is focused on AI Performance Engineering.
Upcoming events
12

NVIDIA Nsight GPU Profiling + KV Cache Efficiency + Context "Platform" Engineering
Location not specified yet
Zoom link: https://us02web.zoom.us/j/82308186562
Talk #0: Introductions and Meetup Updates
by Chris Fregly and Antje Barth
The best-selling O'Reilly book "AI Systems Performance Engineering" is now available in eBook and print: 1,000 pages, 200 figures, and 700 examples!
Amazon: https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/
GitHub: https://github.com/cfregly/ai-performance-engineering
Talk #1: Diving deep into NVIDIA Nsight Systems GPU profiling tools for PyTorch LLM and computer vision workloads by Chaim Rand
In this talk, Chaim Rand (repeat speaker on this webinar series!) revisits the NVIDIA Nsight profiling tools to augment the PyTorch Profiler for LLM and vision workloads. This talk is based on Chaim's recent blog posts on Optimizing Data Transfer in AI/ML Workloads part 1 and part 2.
Talk #2: KV Cache Efficiency + Context "Platform" Engineering by Valentin Vercovici and Callan Fox (WekaIO)
This presentation will include demos and code, with a focus on improving KV-cache hit rates, and will introduce a methodology called Context "Platform" Engineering for designing and optimizing AI infrastructure for Agent Swarm Context at scale. Context Platform Engineering was recently featured in the CES 2026 keynote by Jensen Huang, CEO of NVIDIA. This presentation is related to a recent AIE CODE Summit talk from December 2025.
Zoom link: https://us02web.zoom.us/j/82308186562
Related Links
GitHub Repo: http://github.com/cfregly/ai-performance-engineering/
O'Reilly Book: https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/
YouTube: https://www.youtube.com/@AIPerformanceEngineering
Generative AI Free Course on DeepLearning.ai: https://bit.ly/gllm
14 attendees
GPU, CUDA, and PyTorch Performance Optimizations
Location not specified yet
Zoom link: https://us02web.zoom.us/j/82308186562
Talk #0: Introductions and Meetup Updates
by Chris Fregly and Antje Barth
Talk #1: GPU, PyTorch, and CUDA Performance Optimizations
Talk #2: GPU, PyTorch, and CUDA Performance Optimizations
Zoom link: https://us02web.zoom.us/j/82308186562
Related Links
GitHub Repo: http://github.com/cfregly/ai-performance-engineering/
O'Reilly Book: https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/
YouTube: https://www.youtube.com/@AIPerformanceEngineering
Generative AI Free Course on DeepLearning.ai: https://bit.ly/gllm
2 attendees
GPU, CUDA, and PyTorch Performance Optimizations
Location not specified yet
Zoom link: https://us02web.zoom.us/j/82308186562
Talk #0: Introductions and Meetup Updates
by Chris Fregly and Antje Barth
Talk #1: GPU, PyTorch, and CUDA Performance Optimizations
Talk #2: GPU, PyTorch, and CUDA Performance Optimizations
Zoom link: https://us02web.zoom.us/j/82308186562
Related Links
GitHub Repo: http://github.com/cfregly/ai-performance-engineering/
O'Reilly Book: https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/
YouTube: https://www.youtube.com/@AIPerformanceEngineering
Generative AI Free Course on DeepLearning.ai: https://bit.ly/gllm
2 attendees
GPU, CUDA, and PyTorch Performance Optimizations
Location not specified yet
Zoom link: https://us02web.zoom.us/j/82308186562
Talk #0: Introductions and Meetup Updates
by Chris Fregly and Antje Barth
Talk #1: GPU, PyTorch, and CUDA Performance Optimizations
Talk #2: GPU, PyTorch, and CUDA Performance Optimizations
Zoom link: https://us02web.zoom.us/j/82308186562
Related Links
GitHub Repo: http://github.com/cfregly/ai-performance-engineering/
O'Reilly Book: https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/
YouTube: https://www.youtube.com/@AIPerformanceEngineering
Generative AI Free Course on DeepLearning.ai: https://bit.ly/gllm
2 attendees
Past events
57