Do We Even Need Agentic Evals? - Live AMA with Hamza Tahir (CTO, ZenML)


## Details
Date: Thursday, September 11
Time: 4:00 PM IST (Ireland) / 4:00 PM UK (BST) / 11:00 AM ET / 8:00 AM PT
Format: Live AMA on the Jentic Community Discord (questions asked directly in Discord; answered live by Hamza & Rod)
## What this session is about
The industry is split: some teams ship fast on “vibes,” while others build rigorous evaluation stacks. In this AMA, we’ll cut through the noise and get practical about when evals matter, when monitoring and A/B tests are enough, and how to pick the right level of rigor for agents in production.
## Topics we’ll cover
- Clear definitions: research evals vs. engineering evals; offline vs. online; LLM-as-judge vs. human review
- When to use what: smoke tests, regression suites, monitoring, and A/B tests for agents
- Agentic specifics: long-running loops, tool use, stuckness, “silent” failures, and goal correctness
- Error analysis that ships: turning traces into actionable evals without boiling the ocean
- Goodharting & drift: avoiding metric gaming; keeping evals aligned to product KPIs
- Coding agents vs. enterprise ops: why human-in-the-loop (HITL) domains tolerate lighter evals, and where you can’t cut corners
- Starter kit: a minimal, sensible stack (dataset, judge, rubric, monitor, dashboard) you can adopt tomorrow
## Why attend
- You’re building agent workflows and need a measured, production-ready approach to quality
- You want to iterate faster without flying blind
- You’re deciding between eval tools, building your own, or relying on monitoring + A/B in prod
## Speakers
- Hamza Tahir - CTO, ZenML
- Rod Rivera - Host, Jentic Community
## How the AMA works
- Join the Jentic Community Discord (link in the registration confirmation).
- Drop your questions in the live AMA channel. We’ll answer them in real time.
- Can’t attend live? Register to get the recording.
## Prep (optional)
Provide one concrete scenario: your agent, the target outcome, and the current failure mode. We’ll map it to a minimal evaluation and monitoring plan during Q&A.
Cost: Free
Recording: Yes (shared with registrants)
Register now and bring your toughest “Do we even need evals?” question.
