NVIDIA GTC 2026 Conf Recap + Evolution of Flash Attention v1-v4 Optimizations
Details
Zoom link: https://us02web.zoom.us/j/82308186562
Talk #0: Introductions and Meetup Updates
by Chris Fregly and Antje Barth
Talk #1: NVIDIA GTC 2026 AI Conference Recap by Chris Fregly
In this talk, Chris will present the AI and systems highlights from the NVIDIA GTC 2026 conference, which takes place the week before this meetup.
Conference registration link:
https://www.nvidia.com/gtc/ (Use code GTC26-20 for 20% off!)
Talk #2: Evolution and Deep Dive into Flash Attention (v1-v4) for Transformers on NVIDIA GPUs by Seth Weidman @ Sentilink and Author of "Deep Learning from Scratch" @ O'Reilly
In this talk, Seth will break down the evolution of Flash Attention, an optimized, mechanically sympathetic implementation of the attention mechanism that is fundamental to the Transformer architecture in modern LLMs.
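As a rough, illustrative preview (not part of Seth's talk materials), the sketch below contrasts standard attention, which materializes the full seq_len x seq_len score matrix, with a blockwise online-softmax pass that never does; this tiling with running-max/denominator rescaling is the core idea introduced in Flash Attention v1. Function names and the block size here are arbitrary choices for the example.

import numpy as np

def standard_attention(Q, K, V):
    # Baseline: materializes the full (seq_len x seq_len) score matrix.
    scores = (Q @ K.T) / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return (weights / weights.sum(axis=-1, keepdims=True)) @ V

def blockwise_attention(Q, K, V, block_size=64):
    # Flash-Attention-style pass: visit K/V in tiles, keep a running max and
    # running softmax denominator per query row, and rescale the partial output
    # whenever the running max changes, so the full score matrix never exists.
    seq_len, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    out = np.zeros_like(Q)
    row_max = np.full((seq_len, 1), -np.inf)
    row_sum = np.zeros((seq_len, 1))
    for start in range(0, seq_len, block_size):
        k_blk = K[start:start + block_size]
        v_blk = V[start:start + block_size]
        scores = (Q @ k_blk.T) * scale
        new_max = np.maximum(row_max, scores.max(axis=-1, keepdims=True))
        correction = np.exp(row_max - new_max)  # rescale old accumulators
        p = np.exp(scores - new_max)
        row_sum = row_sum * correction + p.sum(axis=-1, keepdims=True)
        out = out * correction + p @ v_blk
        row_max = new_max
    return out / row_sum

# The two should agree up to floating-point error:
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((256, 32)) for _ in range(3))
assert np.allclose(standard_attention(Q, K, V), blockwise_attention(Q, K, V))

Avoiding the full score matrix is what lets Flash Attention keep the working set in on-chip SRAM and cut traffic to GPU HBM; later versions (v2 through v4) largely refine work partitioning and hardware-specific scheduling around this same recurrence.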
Related links:
Blog: https://modal.com/blog/reverse-engineer-flash-attention-4
Github: https://github.com/Dao-AILab/flash-attention
Arxiv paper: https://arxiv.org/abs/2205.14135
Related Links
Github Repo: http://github.com/cfregly/ai-performance-engineering/
O'Reilly Book: https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/
YouTube: https://www.youtube.com/@AIPerformanceEngineering
Generative AI Free Course on DeepLearning.ai: https://bit.ly/gllm
AI summary
By Meetup
Online meetup for AI engineers; learn NVIDIA GTC highlights and Flash Attention v1–v4 optimizations for Transformers on NVIDIA GPUs.