Applied LLMs at Picnic: Warehouses, Notebooks & Evaluation Loops


Details
Join our upcoming meetup for all things LLMs!
We've planned an evening full of curated talks, great snacks, and even better company just for you:
Schedule
17:30 doors open, drinks, food
18:00 #1 Applying LLMs in our automated warehouse (Sven Arends - Picnic)
18:30 #2 Python notebooks are better now (Vincent D. Warmerdam - Marimo)
19:00 break, drinks
19:15 #3 Evaluation-Driven Development & Synthetic Data Flywheels (Hugo Bowne-Anderson - Independent Data and AI Scientist, Vanishing Gradients)
19:45 networking drinks
20:45 end
Talk 1: Applying LLMs in our automated warehouse (Sven Arends - Picnic)
We'll share how we're using multimodal LLMs in our automated warehouse. We'll discuss the challenges we've faced across the hardware, software, and GenAI components, and we'll cover the practical aspects of GenAI deployment, including prompt optimization, preventing LLM "yapping," and creating a robust feedback loop for continuous improvement.
Talk 2: Python notebooks are better now (Vincent D. Warmerdam - Marimo)
This talk is about marimo, a Python notebook that completely rethinks how you interact with code in a notebook. There are SQL cells, direct LLM support, stellar widgets, and interactive dataframe tooling. But the most surprising thing is that the notebook goes a step further by changing Python itself! The goal of this session is to explain all of this ... but ... this won't be a talk, it will be a live-coding session instead!
Talk 3: Evaluation-Driven Development & Synthetic Data Flywheels (Hugo Bowne-Anderson - Independent Data and AI Scientist, Vanishing Gradients)
Learn how evaluation-driven development can transform your LLM applications by helping you build a minimum viable evaluation framework (MVE), even before your product reaches real users. In this talk, you'll see how to generate synthetic queries from realistic personas, label outputs to define correctness and failure modes, and construct an evaluation harness to systematically compare models and prompts. The framework also transitions seamlessly into a robust evaluation approach once your app encounters real-world users, guiding iteration through structured analysis and continuously tracking essential metrics such as accuracy, cost, and latency via lightweight observability.
See you all there! 👀