Name: NVIDIA Nsight GPU Profiling + KV Cache Efficiency + Context "Platform" Engineering
Start: 2026-01-19T12:00:00-05:00
End: 2026-01-19T13:00:00-05:00

**Zoom link**: [https://us02web.zoom.us/j/82308186562](https://us02web.zoom.us/j/82308186562)

**Talk #0: Introductions and Meetup Updates**
by Chris Fregly and Antje Barth
*The best-selling O'Reilly book "AI Systems Performance Engineering" is now available (eBook and print): 1000 pages, 200 figures, 700 examples!*

Amazon: https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/

GitHub: https://github.com/cfregly/ai-performance-engineering

**Talk #1: Diving deep into NVIDIA Nsight Systems GPU profiling tools for PyTorch LLM and computer vision workloads by Chaim Rand**
In this talk, Chaim Rand (repeat speaker on this webinar series!) revisits the NVIDIA Nsight profiling tools to augment the PyTorch Profiler for LLM and vision workloads. This talk is based on Chaim's recent blog posts on Optimizing Data Transfer in AI/ML Workloads [part 1](https://chaimrand.medium.com/optimizing-data-transfer-in-ai-ml-workloads-60df62fe1278) and [part 2](https://chaimrand.medium.com/optimizing-data-transfer-in-batched-ai-ml-inference-workloads-a9f4165208b8).

**Talk #2: KV Cache Efficiency + Context "Platform" Engineering by Valentin Bercovici and Callan Fox (WekaIO)**
This presentation includes demos and code focused on improving KV-cache hit rates, and introduces a methodology called Context "Platform" Engineering for designing and optimizing AI infrastructure for Agent Swarm Context at scale. Context Platform Engineering was recently featured in the CES 2026 keynote by Jensen Huang, CEO of NVIDIA. This presentation is related to a recent AIE CODE Summit [talk](https://www.startuphub.ai/ai-news/ai-video/2025/maximizing-kv-cache-hit-rates-wekas-open-source-context-platform-engineering/) from December 2025.

**Related Links**
GitHub Repo: [http://github.com/cfregly/ai-performance-engineering/](http://github.com/cfregly/ai-performance-engineering/)
O'Reilly Book: [https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/](https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/)
YouTube: [https://www.youtube.com/@AIPerformanceEngineering](https://www.youtube.com/@AIPerformanceEngineering)
Free Generative AI Course on DeepLearning.AI: [https://bit.ly/gllm](https://bit.ly/gllm)

Chris Fregly

AI Performance Engineering Meetup (Washington DC 2)

Topics: Technology, PyTorch, Kubernetes, CUDA (Compute Unified Device Architecture), Machine Learning, Artificial Intelligence, Big Data, Data Science, TensorFlow, Python

Every 3rd Monday of the month until December 31, 2026

Online event
