About us
This meetup is focused on AI Performance Engineering.
Upcoming events
10

NVIDIA GTC 2026 Conf Recap + Evolution of Flash Attention v1-v4 Optimizations
Online
Zoom link: https://us02web.zoom.us/j/82308186562
Talk #0: Introductions and Meetup Updates
by Chris Fregly and Antje Barth
Talk #1: NVIDIA GTC 2026 AI Conference Recap by Chris Fregly
In this talk, Chris will present the AI and systems highlights from the NVIDIA GTC 2026 conference (taking place the prior week).
Conference registration link:
https://www.nvidia.com/gtc/ (Use code GTC26-20 for 20% off!)
Talk #2: Evolution and Deep Dive into Flash Attention (v1-v4) for Transformers on NVIDIA GPUs by Seth Weidman @ Sentilink and Author of "Deep Learning from Scratch" @ O'Reilly
In this talk, Seth will break down the evolution of Flash Attention, an optimized and mechanically sympathetic implementation of the attention mechanism, which is fundamental to the Transformer architecture in modern LLMs.
Related links:
Blog: https://modal.com/blog/reverse-engineer-flash-attention-4
Github: https://github.com/Dao-AILab/flash-attention
Arxiv paper: https://arxiv.org/abs/2205.14135
Zoom link: https://us02web.zoom.us/j/82308186562
Related Links
Github Repo: http://github.com/cfregly/ai-performance-engineering/
O'Reilly Book: https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/
YouTube: https://www.youtube.com/@AIPerformanceEngineering
Generative AI Free Course on DeepLearning.ai: https://bit.ly/gllm
4 attendees
GPU, CUDA, and PyTorch Performance Optimizations
Online
Zoom link: https://us02web.zoom.us/j/82308186562
Talk #0: Introductions and Meetup Updates
by Chris Fregly and Antje Barth
Talk #1: GPU, PyTorch, and CUDA Performance Optimizations
Talk #2: GPU, PyTorch, and CUDA Performance Optimizations
Zoom link: https://us02web.zoom.us/j/82308186562
Related Links
Github Repo: http://github.com/cfregly/ai-performance-engineering/
O'Reilly Book: https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/
YouTube: https://www.youtube.com/@AIPerformanceEngineering
Generative AI Free Course on DeepLearning.ai: https://bit.ly/gllm
2 attendees
GPU, CUDA, and PyTorch Performance Optimizations
Online
Zoom link: https://us02web.zoom.us/j/82308186562
Talk #0: Introductions and Meetup Updates
by Chris Fregly and Antje Barth
Talk #1: GPU, PyTorch, and CUDA Performance Optimizations
Talk #2: GPU, PyTorch, and CUDA Performance Optimizations
Zoom link: https://us02web.zoom.us/j/82308186562
Related Links
Github Repo: http://github.com/cfregly/ai-performance-engineering/
O'Reilly Book: https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/
YouTube: https://www.youtube.com/@AIPerformanceEngineering
Generative AI Free Course on DeepLearning.ai: https://bit.ly/gllm
1 attendee
GPU, CUDA, and PyTorch Performance Optimizations
Online
Zoom link: https://us02web.zoom.us/j/82308186562
Talk #0: Introductions and Meetup Updates
by Chris Fregly and Antje Barth
Talk #1: GPU, PyTorch, and CUDA Performance Optimizations
Talk #2: GPU, PyTorch, and CUDA Performance Optimizations
Zoom link: https://us02web.zoom.us/j/82308186562
Related Links
Github Repo: http://github.com/cfregly/ai-performance-engineering/
O'Reilly Book: https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/
YouTube: https://www.youtube.com/@AIPerformanceEngineering
Generative AI Free Course on DeepLearning.ai: https://bit.ly/gllm
1 attendee
Past events
53
