Scaling evaluation systems for agentic platforms from prototype to prod
Details
Please register for the event here.
This webinar presents strategies for cost-effective AI agent evaluation at scale, using stratified sampling, risk-based testing, and multi-stage screening to cut costs while maintaining enterprise-grade quality control.
It will share key insights on AI agent evaluation and show how to:
- Cut evaluation costs through stratified sampling across personas and risk-based prioritisation, focusing resources on high-impact, high-variance scenarios rather than exhaustive testing.
- Build progressive evaluation systems using multi-stage screening that catches obvious issues cheaply, then applies deeper evaluation only where needed, balancing "vibes" checks for content against rigorous testing for critical processes.
- Implement enterprise-ready observability with OpenTelemetry, persona coverage maps, and continuous calibration frameworks that adapt as your agentic platform scales and models evolve.
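As a rough illustration of the first two points, here is a minimal Python sketch of risk-weighted stratified sampling followed by two-stage screening. It is not taken from the talk: the case fields (`id`, `persona`, `risk`), the function names, the rule in `cheap_check`, and the `risk_threshold` default are all hypothetical choices made for this example.

```python
import random
from collections import defaultdict


def stratified_sample(cases, budget, seed=0):
    """Pick roughly `budget` cases, giving riskier personas more slots.

    Each case is a dict with hypothetical keys: "id", "persona", "risk" (0..1).
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for case in cases:
        strata[case["persona"]].append(case)

    # Weight each persona by the total risk of its cases (an assumed heuristic).
    weights = {p: sum(c["risk"] for c in cs) for p, cs in strata.items()}
    total = sum(weights.values()) or 1.0

    sample = []
    for persona, cs in strata.items():
        # Every persona keeps at least one case so coverage never drops to zero.
        n = max(1, round(budget * weights[persona] / total))
        sample.extend(rng.sample(cs, min(n, len(cs))))
    return sample


def cheap_check(output):
    """Stage 1: inexpensive rule-based screen (hypothetical rule)."""
    return bool(output.strip()) and "ERROR" not in output


def evaluate(case, output, deep_eval, risk_threshold=0.7):
    """Run the cheap screen first; call `deep_eval` only for high-risk cases."""
    if not cheap_check(output):
        return {"id": case["id"], "verdict": "fail", "stage": 1}
    if case["risk"] < risk_threshold:
        return {"id": case["id"], "verdict": "pass", "stage": 1}
    verdict = "pass" if deep_eval(case, output) else "fail"
    return {"id": case["id"], "verdict": verdict, "stage": 2}
```

Here `deep_eval` stands in for whatever expensive check a team uses (human review, an LLM judge, a full scenario replay); the sketch only shows that it runs on the small slice of cases that pass the cheap screen and carry high risk.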
After a 30-minute talk there’ll be a 15-minute Q&A, for which we encourage you to submit questions in advance.
A webinar recording and related materials will be shared with all attendees after the event.
Disclaimer: We use Zoom for the webinar stream. The webinar link will be emailed to you once you register.
____________
Speaker:
Glyn Darkin - Global Head of Delivery @ ClearRoute
Having worked in roles ranging from cold caller, marketer and lead engineer in several start-ups to Chief Architect in a Global SI, Glyn Darkin’s jack-of-all-trades aptitude and engineer’s mindset are a winning combination in his role as Global Head of Delivery. Quick to grasp the most convoluted business issue, he has an impressive track record of using technology to solve client problems. His past achievements include developing an award-winning digital mortgage product for a leading bank, and building TescoEntertainment.
Fascinated by AI, Glyn’s current obsession is staying ahead of the sweeping changes it brings. At home, he spends his evenings coding to fully understand how the technology will affect not only ClearRoute’s ways of working, but also the types of products and platforms we implement for our customers.

