Details

Join Us for the next Israel LLVM Meetup featuring talks by Apple and Majestic Labs AI!

We’re thrilled to invite you to an exciting evening sponsored by Apple, where we’ll dive deep into cutting-edge compiler technologies and hardware-aware optimizations. This meetup brings together the LLVM community for insightful talks, networking, and discussions on the future of compiler infrastructure.

Expect a diverse lineup of sessions covering Triton, MLIR, and LLVM optimizations powered by hardware metrics — perfect for compiler developers, performance engineers, and anyone passionate about high-performance computing.

📅 Date & Time
February 18th, 2026
17:00 – Arrival & Networking
17:30 – Talks Begin
19:00 – Wrap-up

📍 Venue
Jem’s, Herzliya

🎤 Speakers & Topics

• Jonathan Cohen - Compiler Engineering Manager - Apple
Title: “Supercharging Compiler Optimization Remarks with Hardware Metrics”

Abstract: This session demonstrates how to combine hardware performance counters with compiler optimization remarks to turn profiling data into performance wins. Enriching optimization remarks with performance-analysis metrics makes it easier to diagnose and mitigate performance bottlenecks - a challenging task for compiler developers and performance engineers alike.

• Michael Zuckerman - AI SW Engineering Manager – Majestic Labs AI
Title: “Triton for RISC-V: Bridging PyTorch to RISC-V with MLIR & LLVM”

Abstract: Triton is a powerful open-source programming model originally developed to generate highly optimized GPU kernels for deep learning workloads. In this talk, I present an extension of Triton to target the RISC-V vector architecture, using MLIR and the LLVM toolchain as the compilation backbone. This work creates a practical bridge between modern PyTorch-based machine learning frameworks and emerging RISC-V hardware platforms, enabling efficient and portable execution of AI workloads beyond traditional GPUs.
The proposed integration allows PyTorch models to be lowered end-to-end into RISC-V vector instructions, leveraging open compiler infrastructure to significantly reduce the effort required to enable new hardware targets. I will show how Triton kernels, TorchDynamo graph capture, TorchInductor scheduling, and MLIR/LLVM code generation were aligned into a unified backend that supports rapid bring-up, iterative optimization, and transparent performance tuning.
This talk demonstrates how an open, modular compiler stack can accelerate the adoption of flexible hardware architectures in machine learning, while preserving high performance and developer productivity.

Related topics

Events in Herzliya, IL
High Performance Computing
C & C++
Compilers
