Details

AI Native Netherlands is back! We'll be hosting our 12th edition at the AWS office.

Before anything else: thank you. This community has grown faster than we ever imagined, and it's been great to watch connections being made and conversations getting better with every edition. The AI landscape keeps moving fast, and there's no better way to keep up than in a room full of people working through it alongside you.

We've got two amazing speakers lined up and a great venue for the evening — we'd love to see you there.

A huge thank you to our friends at AWS for hosting us at their Amsterdam office. Food and drinks will be provided!

We'll cover:

  • When to run a task on a CPU versus a GPU — and what that choice does to cost.
  • Using fine-tuned small models (SLMs) for classifying, routing, embedding, and reranking.
  • Building a real CPU-first agent on Amazon EKS that calls an LLM only when needed.
  • Measuring price/performance in practice — and the design assumptions that turned out to be wrong.

Speaker 1: Christian Melendez (AWS)
Christian Melendez is a Principal Specialist Solutions Architect at AWS. He helps the region's largest enterprises build efficient, resilient AI and cloud-native workloads on Kubernetes, with a focus on compute efficiency, cost optimisation, and autoscaling at scale. Author of Kubernetes Autoscaling and creator of Karpenter Blueprints, a best-practices repository that grew Karpenter adoption 16.5x across EMEA, he also built Slemify, an open-source framework for fine-tuning and serving Small Language Models on Kubernetes. A regular speaker at AWS re:Invent, KCDs / CNDs, ContainerDays, and others, Christian focuses on making infrastructure simple, observable, and cost-effective at scale.

Talk: The Right Tool for the Right Token: When to Reach for a CPU or a GPU
While hard reasoning problems require an LLM and a GPU, other tasks can be solved with simpler means. Classifying, routing, embedding, and reranking are structured, high-frequency jobs that a fine-tuned small model (SLM) can handle on CPU at predictable cost. In this session, we will demo a pipeline where each task runs on the best-fitting infrastructure. This will include a real CPU-first agent on Amazon EKS that runs an SLM and calls an LLM only when needed. We will share our learnings from building this pipeline, including how we measured price/performance, the pipeline's limitations, and which of our design assumptions we found to be wrong. You will leave with the confidence to choose the right tool, and an open-source reference architecture to validate your assumptions.

Speaker 2: William Rizzo (Mirantis)
William Rizzo is Global Field CTO at Mirantis, where he helps organisations design, build, and run platform engineering, edge, and AI infrastructure initiatives. His career spans engineering, pre-sales, product ownership, and consulting across high-performance computing, storage, and distributed systems. A CNCF and Linkerd Ambassador and a Kairos maintainer, he's a regular speaker at KubeCon on platform engineering and building resilient internal developer platforms.

Talk: To be confirmed.

Agenda:
18:00 — Arrival, food & drinks
18:45 — Talk #1 | Christian Melendez (AWS)
19:30 — Short break
19:45 — Talk #2 | William Rizzo (Mirantis)
20:30 — Open conversation, networking & more drinks
21:00 — Wrapping up

What to bring:
Just curiosity and questions. If you're working on applied AI, MLOps, model serving, or the cost and infrastructure behind it, we'd love to hear how you're approaching it.

Who this is for:
Data scientists, AI/ML engineers, data engineers, MLOps specialists, SREs, platform and infrastructure engineers, architects, and engineering leaders focused on running real-world AI efficiently.

Where to find us:
AWS Office: Mr. Treublaan 7, 1097 JS Amsterdam
