
What we’re about
We want to bring people together who are interested in AI and Machine Learning. At our meetups, we have:
- Networking
- Talks
- Fireside chats
- Knowledge exchange
- Applications of AI and Machine Learning
We organize our meetups every other month, and due to current restrictions we can only host virtual events.
We are always looking for innovative and inspiring speakers. If you know somebody who would be an excellent fit for our meetup, we would highly appreciate it if you recommended them to us. To recommend a speaker for CAIML, please fill out this form.
Upcoming events
CAIML #38 at lise GmbH, Köln
CAIML #38 is going to happen on September 9, 2025, at lise GmbH.
Agenda
18h30 Open Doors
19h00 Welcome & Intro
19h15 Tomaz Bratanic (Graph ML and GenAI research at Neo4j): Agentic GraphRAG with MCP servers
This talk explores design patterns for integrating graph memory into agentic workflows, discusses trade-offs between retrieval accuracy and computational efficiency, and highlights how persistent knowledge unlocks more capable, personalized, and trustworthy AI systems.
- 5 Minute Break -
19h50 Pablo Iyu Guerrero (AI Inference Engineer at Aleph Alpha) and Lukas Blübaum (AI Engineer at Aleph Alpha): Tokenizer-free language model inference
Traditional Large Language Models rely heavily on large, predefined tokenizers (e.g., 128k+ vocabularies), introducing limitations in handling diverse character sets, rare words, and dynamic linguistic structures. This talk presents a different approach to language model inference that eliminates the need for conventional large-vocabulary tokenizers. The system operates with a core vocabulary of only 256 byte values, processing text at the most fundamental level. It employs a three-part architecture: byte-level encoder and decoder models handle character sequence processing, while a larger latent transformer operates on higher-level representations. The interface between these stages involves dynamically creating "patch embeddings", guided by word boundaries or entropy measures. This talk will first introduce the intricacies of this byte-to-patch transformer architecture. Subsequently, we will focus on the significant engineering challenges encountered in building an efficient inference pipeline, specifically coordinating the three models, managing their CUDA graphs, and handling their respective KV caches.
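To make the patching idea concrete, here is a minimal Python sketch of the word-boundary variant of patch creation described in the abstract: raw UTF-8 bytes (a vocabulary of just 256 values) are grouped into variable-length "patches" at word boundaries, which the latent transformer would then process as single units. This is a toy illustration, not the speakers' actual pipeline; the function name and the choice to split on spaces are assumptions for the sketch, and a real system might instead place boundaries where a small byte-level model reports high next-byte entropy.

```python
def patch_boundaries(text: str) -> list[bytes]:
    """Toy patcher: split a UTF-8 byte stream into word-boundary patches.

    Each patch would be pooled into one "patch embedding" for the
    latent transformer, so sequence length scales with words, not bytes.
    (A hypothetical stand-in for entropy-guided patching.)
    """
    data = text.encode("utf-8")      # core vocabulary: 256 byte values
    patches: list[bytes] = []
    current = bytearray()
    for b in data:
        current.append(b)
        if b == 0x20:                # a space closes the current patch
            patches.append(bytes(current))
            current = bytearray()
    if current:                      # flush the trailing patch
        patches.append(bytes(current))
    return patches


# Example: two patches, and the byte stream is fully recoverable.
print(patch_boundaries("hello world"))  # [b'hello ', b'world']
```

Because patch lengths vary per input, the encoder, latent transformer, and decoder each see differently shaped sequences, which is one source of the CUDA-graph and KV-cache coordination challenges the talk covers.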
20h20 Networking with food and drinks provided by lise GmbH
⚠️ Please note: ⚠️ All attendees are additionally required to register here. lise GmbH requires every attendee to be registered in order to participate in the event.
See you in September 🤖