AI Evals: Real-World Examples & Practices


Talks & Speakers:
From Planning to Production: Scalable LLM Evaluations in Practice
Learn how to build an end-to-end evaluation workflow for LLMs - from designing and testing eval prompts on sample data to integrating with CI/CD and monitoring live production traffic.
Hilik Paz, CTO & Co-founder at Arato
Measuring the Unmeasurable: GenAI Evaluations at Scale
A practical guide to centralizing and scaling GenAI evaluations - making them traceable, accessible, and measurable for real-world use cases.
Almog Mor, Machine Learning Engineer at WSC Sports
Where & When:
Tuesday, June 17th
18:00-21:00
WSC Sports Offices - 28th Floor
Abba Hillel Silver Rd 21, Ramat Gan
Who Should Attend:
- Engineers & developers building GenAI apps
- Engineering leads looking to scale GenAI
- Prompt engineers and technical PMs
- ML/AI practitioners
- Anyone working on testing, measuring, or improving GenAI performance
Why Attend:
Evaluating GenAI behavior is a core part of building reliable and scalable AI systems. Whether you’re building agents, RAG pipelines, or LLM-powered tools, this meetup will help you rethink how you define success, measure quality, and build trust.
📝 Agenda:
18:00 - Gathering, food & drinks
18:30 - Talks begin
20:00 - Q&A with the speakers
See you there!
