November C++ Meetup
Details
Hoi Zäme
Join us for yet another C++ meetup! Happening at the tipi.build offices, sponsored by EngFlow and tipi.build, in the Bluelion Incubator near Schiffbau, Josefstrasse 219, Zurich.
Agenda
- 18:00 - 18:30 Welcome snacks and socializing
- 18:30 - 19:30 Jan Wassenberg, Gemma.cpp: A flexible framework for CPU-based LLM inference research
- 19:00 - 20:00 Socializing - Pizza 🍕 sponsored by EngFlow + tipi.build
About the Speaker
Jan Wassenberg is a Senior Staff Software Engineer at Google DeepMind. He is the tech lead of the open-source gemma.cpp (R&D for LLM inference on CPUs) and of Highway, a widely used SIMD abstraction layer.
Jan has a PhD in Computer Science (2011) and has worked on a wide range of performance-critical projects, including compression, video editing, and sorting. He was drawn to LLMs after realizing their potential for brainstorming and knowledge discovery, and was inspired to apply his passion for high-performance software and compression. He is happy to collaborate with the community to further advance LLM inference on CPUs.
Abstracts
Gemma.cpp: A flexible framework for CPU-based LLM inference research
Imagine LLM inference development without having to worry about memory capacity, vendor lock-in, hardware cost and availability, or writing custom kernels for each platform. The open-source gemma.cpp makes this a reality. It enables prototyping of new algorithms and compression techniques without being limited by compiler, framework, or hardware quirks. Thanks to Highway, you can write a single implementation that runs on all major CPUs. You may also be surprised by the efficiency: it can outperform oneDNN and llama.cpp.
Besides a tour of gemma.cpp, the talk will feature a live demo of PaliGemma, a vision-language model capable of answering questions about images by combining visual and textual understanding. We conclude with Q&A and discussion, and invite the community to benefit from our testbed and collaborate on future directions.