Building Local LLM Applications with llama.cpp

Details
Marios Aspris will present a practical, application-driven approach to using llama.cpp in modern C++ projects. The talk will cover:
- A high-level introduction to LLMs, focusing on practical applications rather than mathematical theory.
- Where to find open-source LLMs and how to convert them to the GGUF format for use with llama.cpp.
- How to integrate llama.cpp into your C++ projects using #include "llama.h", with live demonstrations of running inference on a laptop (see the inference sketch below).
- An overview of Retrieval-Augmented Generation (RAG) systems from an application perspective: what they are, what they offer for custom data, and how to build a simple RAG pipeline with llama.cpp as the inference engine (see the retrieval sketch below).
By the end of the session, you'll understand how to use llama.cpp to build efficient, private, and customizable LLM-powered applications in modern C++, and how to set up a RAG system for your own data and business needs.
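To give a flavor of the integration step, here is a minimal greedy-decoding sketch against llama.cpp's C API. Treat it as a sketch only: the API evolves between releases, so the function names and signatures shown (e.g. llama_model_load_from_file, llama_init_from_model) may differ in your checkout, and "model.gguf", the prompt, and the 64-token cap are placeholder choices. The examples/simple program in the repository is the authoritative reference.

```cpp
#include "llama.h"

#include <cstdio>
#include <string>
#include <vector>

int main() {
    // load the model from a GGUF file ("model.gguf" is a placeholder path)
    llama_model_params mparams = llama_model_default_params();
    llama_model * model = llama_model_load_from_file("model.gguf", mparams);
    if (model == NULL) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }
    const llama_vocab * vocab = llama_model_get_vocab(model);

    // tokenize the prompt; a first call with no buffer returns the token count
    const std::string prompt = "The capital of Greece is";
    const int n_prompt = -llama_tokenize(vocab, prompt.c_str(), (int32_t) prompt.size(),
                                         NULL, 0, true, true);
    std::vector<llama_token> tokens(n_prompt);
    llama_tokenize(vocab, prompt.c_str(), (int32_t) prompt.size(),
                   tokens.data(), (int32_t) tokens.size(), true, true);

    // create an inference context and a greedy sampler
    llama_context_params cparams = llama_context_default_params();
    cparams.n_ctx = 512;
    llama_context * ctx = llama_init_from_model(model, cparams);

    llama_sampler * smpl = llama_sampler_chain_init(llama_sampler_chain_default_params());
    llama_sampler_chain_add(smpl, llama_sampler_init_greedy());

    // decode the prompt, then generate up to 64 tokens one at a time
    llama_batch batch = llama_batch_get_one(tokens.data(), (int32_t) tokens.size());
    llama_token tok;
    for (int i = 0; i < 64; i++) {
        if (llama_decode(ctx, batch) != 0) break;
        tok = llama_sampler_sample(smpl, ctx, -1);
        if (llama_vocab_is_eog(vocab, tok)) break;
        char buf[128];
        int n = llama_token_to_piece(vocab, tok, buf, sizeof(buf), 0, true);
        if (n > 0) fwrite(buf, 1, n, stdout);
        batch = llama_batch_get_one(&tok, 1);
    }
    printf("\n");

    llama_sampler_free(smpl);
    llama_free(ctx);
    llama_model_free(model);
    return 0;
}
```

Building llama.cpp with its CMake build produces the llama library this would link against.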
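To make the RAG discussion concrete, here is a schematic, self-contained sketch of the retrieval half of such a pipeline: cosine similarity over pre-computed chunk embeddings, with the best-matching chunks pasted into the prompt that is then handed to llama.cpp for generation. How the embeddings are produced (e.g. with an embedding model run through llama.cpp) is out of scope here; the chunks and vectors below are hypothetical placeholders.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <string>
#include <vector>

// a document fragment and its pre-computed embedding (placeholders here)
struct Chunk {
    std::string text;
    std::vector<float> embedding;
};

// cosine similarity between two equal-length vectors
float cosine(const std::vector<float> & a, const std::vector<float> & b) {
    float dot = 0, na = 0, nb = 0;
    for (size_t i = 0; i < a.size(); i++) {
        dot += a[i] * b[i];
        na  += a[i] * a[i];
        nb  += b[i] * b[i];
    }
    return dot / (std::sqrt(na) * std::sqrt(nb) + 1e-9f);
}

// pick the k chunks most similar to the query and build an augmented prompt
std::string build_prompt(const std::vector<Chunk> & chunks,
                         const std::vector<float> & query_emb,
                         const std::string & question, size_t k) {
    k = std::min(k, chunks.size());
    std::vector<size_t> idx(chunks.size());
    for (size_t i = 0; i < idx.size(); i++) idx[i] = i;
    std::partial_sort(idx.begin(), idx.begin() + k, idx.end(),
        [&](size_t a, size_t b) {
            return cosine(chunks[a].embedding, query_emb) >
                   cosine(chunks[b].embedding, query_emb);
        });
    std::string prompt = "Answer using only the context below.\n\nContext:\n";
    for (size_t i = 0; i < k; i++) {
        prompt += "- " + chunks[idx[i]].text + "\n";
    }
    prompt += "\nQuestion: " + question + "\nAnswer:";
    return prompt; // feed this to llama.cpp for generation
}

int main() {
    // hypothetical 3-dimensional embeddings, just to exercise the code
    std::vector<Chunk> chunks = {
        {"llama.cpp runs GGUF models locally.", {0.9f, 0.1f, 0.0f}},
        {"GGUF is the model file format used by llama.cpp.", {0.2f, 0.9f, 0.3f}},
    };
    std::vector<float> query_emb = {0.8f, 0.2f, 0.1f};
    std::string p = build_prompt(chunks, query_emb, "What does llama.cpp run?", 1);
    printf("%s\n", p.c_str());
    return 0;
}
```

Real pipelines swap the brute-force scan for a vector index once the chunk count grows, but the flow (embed, retrieve, augment, generate) stays the same.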
Explore the library: https://github.com/ggml-org/llama.cpp
🎤 About the speaker 🎤
Marios is a senior C++ developer with expertise in modern C++ standards, networking, and embedded Linux environments.
He has hands-on experience designing and integrating machine learning systems into product platforms. Outside of work, Marios enjoys tackling hobby projects that are both fun and educational.
📍 Location 📍
- Online via Zoom
⏰ Date & Time ⏰
- 17th of June 2025, 19:00 (Athens time)
🇬🇷 Language 🇬🇧
- Greek unless there are non-Greek speakers in the audience.
🍀 JetBrains license raffle 🍀
- At the end of the event we will raffle off one yearly license for a JetBrains IDE.