Partner Event: Evaluating AI Agents
Details
Our partners at Aggregate Intellect are launching a free AI Agent Learning Series. Our meetup community will get involved by supporting the initiative, helping source guest speakers, and suggesting new topics.
This Friday, Aggregate Intellect CEO Amir Feizpour will host a session with Samuel Dion-Girardeau on building evaluation frameworks that actually work for subjective domains. We will learn about measuring AI in uncertain domains, moving from accuracy to market relevance, and using meta-evaluation and practical tools. Evaluating agent performance can be tricky, and this talk explores the challenge of designing a solid evaluation framework. Early benchmarks based on F1 scores and MSE (mean squared error) often punished good picks, so the approach was refined, leading to stronger evaluations and more confidence in the results.
Register here: https://maven.com/p/c7d631/evaluating-ai-agents?utm_medium=ll_share_link&utm_source=instructor#:~:text=Sign%20up-,for,-free
