AI Agent Evaluation: Practical guide to benchmark and improve AI Agents

Hosted By
Mohammad A. and Patricia M.

Details

Happening Tomorrow!

AI Agent Evaluation: Practical Guide to Benchmark & Improve AI Agents
📅 September 16 | 🕢 7:30 PM GST
📌 https://nas.io/artificialintelligence/events/ai-agent-evaluation

AI agents look magical in demos but often fail in the real world, eroding trust (remember Air Canada’s chatbot error or Google Bard’s costly slip?).
This talk introduces a practical playbook for evaluating agents, covering:
✅ Frameworks like RAGAS & TruLens
✅ Fresh ideas like Evaluation-Driven Development
✅ The path to an AI Quality Movement, where agents aren’t just impressive but truly reliable.

DubAI and Data Professional