Name: NVIDIA Dynamo + Disaggregated Prefill-Decode LLM Serving + CUDA Optimizations
Start: 2025-09-15T18:00:00+02:00
End: 2025-09-15T19:00:00+02:00

**Zoom link**: [https://us02web.zoom.us/j/82308186562](https://us02web.zoom.us/j/82308186562)

**Talk #0: Introductions and Meetup Updates**
by Chris Fregly and Antje Barth

**Talk #1: NVIDIA Dynamo + Disaggregated Prefill-Decode LLM Serving by Chris Alexiuk @ NVIDIA**
NVIDIA Dynamo splits LLM serving into disaggregated prefill and decode stages, letting each scale independently for better throughput under latency constraints. In this session, we'll dive deep into how Dynamo implements disaggregated serving.

**Talk #2: High Performance CUDA Optimizations by Chris Fregly and Others**
CUDA optimizations for high-performance AI.

**Related Links**
GitHub Repo: [http://github.com/cfregly/ai-performance-engineering/](http://github.com/cfregly/ai-performance-engineering/)
O'Reilly Book: [https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/](https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/)
YouTube: [https://www.youtube.com/@AIPerformanceEngineering](https://www.youtube.com/@AIPerformanceEngineering)
Free Generative AI Course on DeepLearning.AI: [https://bit.ly/gllm](https://bit.ly/gllm)

Chris Fregly

AI Performance Engineering Meetup (Paris)

Technology

Deep Learning

High Scalability Computing

Cloud Computing

Big Data

Machine Learning

Data Analytics

Predictive Analytics

Computer Programming

Data Science

Artificial Intelligence

PyTorch

CUDA: Compute Unified Device Architecture

Nvidia

Kubernetes

Online event
