Lessons from evaluating production AI Agents for over a year
Details
We’re thrilled to invite you to an exciting meetup!
We'll have one talk presented by an Elastic employee and one presented by a community member. Bring your questions and come support the community!
Agenda
5:00 PM – Doors Open
Come in, say hello, grab some food, and settle in for an engaging evening.
5:45 PM – Talk #1:
6:15 PM – Q&A
6:30 PM – Talk #2:
7:00 PM – Q&A
7:15 PM – Networking & Event Wrap-Up
Stick around to chat, connect, and share insights with fellow attendees.
When:
Thursday, May 14th
5:00 – 7:00 PM
Where:
Centre for Social Innovation - Spadina
192 Spadina Ave. · Toronto, ON
3rd floor, The Viola Desmond Room
Abstracts
Lessons from evaluating production AI Agents for over a year
This talk covers methods for evaluating AI Agents, using as an example the evaluation frameworks the speakers built for a user-facing AI Agent system that has been in production for over a year. We share the tools and frameworks we used, along with their tradeoffs and alternatives, and discuss methods such as LLM-as-Judge, rule-based evaluations, and ML metrics, as well as how to choose among them.
Spots are limited, so don't miss out!
