Details

Zoom link: https://us02web.zoom.us/j/82308186562

Talk #0: Introductions and Meetup Updates
by Chris Fregly and Antje Barth
New book on high-performance co-design of hardware (NVIDIA GPUs), software (PyTorch, vLLM), and algorithms is coming next month! Pre-order now and be among the first to read it!
https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Inference/dp/B0F47689K8

Talk #1: High-Performance, AI-Powered GPU Kernel and System Optimization Engine by Mohammed Abdelfattah @ Mako.dev
In this talk, we explain the high-level architecture of Mako's Automated GPU Kernel and System Optimization Engine and present case studies of advanced optimizations enabled by our approach.

Talk #2: Distributed, Large-Scale PyTorch Training with Hugging Face accelerate and nbdistributed on NVIDIA GPUs by Zachary Mueller @ Hugging Face and Scratch to Scale (Course on Large-Scale Training in the Modern World)
In this talk, Zachary will deliver part of his upcoming course called Scratch to Scale: Large-Scale Training in the Modern World using PyTorch, Hugging Face accelerate, and nbdistributed (notebook distributed) for massive, in-notebook training jobs. Enrol now! https://maven.com/walk-with-code/scratch-to-scale

Related Links
GitHub Repo: http://github.com/cfregly/ai-performance-engineering/
O'Reilly Book: https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/
YouTube: https://www.youtube.com/@AIPerformanceEngineering
Free Generative AI Course on DeepLearning.AI: https://bit.ly/gllm
