cuTile and TileIR: The Next Step in GPU Programming
Details
After a long hiatus for the compiler social, Lorenzo is joining us to talk about exciting work at NVIDIA that you have probably heard about: cuTile and TileIR.
As usual, there will be pizza, snacks, beer, and soft drinks for participants after the event, sponsored by NVIDIA.
Abstract: GPU programming has evolved significantly over the past decade, driven by rapid hardware innovation such as Tensor Cores and new numerical formats. However, the gap between high-level productivity frameworks and low-level performance-centric programming models continues to widen. In this talk, we introduce cuTile and TileIR, a new block-level programming model and intermediate representation designed to simplify high-performance GPU development while preserving forward compatibility with evolving NVIDIA architectures. cuTile provides a tile-centric abstraction for data-parallel workloads, accessible from Python, while TileIR, an MLIR-based low-level IR integrated with CUDA, offers a stable, portable foundation for targeting Tensor Cores and future hardware generations. Together, they establish a middle ground between usability and control, enabling expressive kernel development without sacrificing performance.
We present the programming model, illustrate it with examples, discuss performance considerations, and, if time permits, take a deeper dive into the core abstractions behind TileIR: https://github.com/NVIDIA/cuda-tile
Location: The event takes place in room G59 of the CAB building at ETH Zurich's Zentrum campus. Enter from Universitätstrasse 6.
