Details

Zoom link: https://us02web.zoom.us/j/82308186562

Talk #0: Introductions and Meetup Updates [5 mins]
by Chris Fregly and Antje Barth

Talk #1: OpenClaw/MoltBot/ClawdBot/MCP for GPU Kernel and AI System Optimizations [10 mins] by Chris Fregly

In this demo, Chris shows how to use MCP and OpenClaw (formerly known by all the other names in the talk title!) to optimize both individual GPU kernels and complete, end-to-end AI systems.

Related Link: GitHub repo for these tools: https://github.com/cfregly/ai-performance-engineering/

Talk #2: Unlocking NVFP4: Low Precision Numerics on NVIDIA Blackwell [30-45 mins] by Riccardo Mereu @ Verda.com

In this talk, Riccardo dives deep into low-precision numerics and model quantization algorithms on NVIDIA Blackwell, including NVFP4 (vs. MXFP4), per-block scaling, per-tensor scaling, mixed precision, and much more!

Related Link: GPU Mode NVFP4 GEMM kernel competition blog post by Daniel Obolensky: https://obolensky.xyz/blog/nvfp4_gemm_kernel_explanation/

Related Links
GitHub Repo: http://github.com/cfregly/ai-performance-engineering/
O'Reilly Book: https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/
YouTube: https://www.youtube.com/@AIPerformanceEngineering
Free Generative AI Course on DeepLearning.AI: https://bit.ly/gllm
