Running Evaluation Metrics With Different LLMs + Q/A


Details
In this beginner-friendly 40-minute workshop, you’ll learn a simple, repeatable way to evaluate Q&A answers from different LLMs using a tiny dataset and two complementary approaches: basic automatic scores (Exact Match/F1) and an “LLM-as-Judge” rubric covering Correctness, Faithfulness, Relevance, and Conciseness. We’ll show how to compare models fairly (same prompt and settings, temperature=0, consistent context), how to interpret the results, and how to turn findings into actions using a light Analyze → Measure → Open Coding → Axial Coding loop. You’ll leave with a plug-and-play rubric, a mini dataset template, and a beginner notebook that generates a clear side-by-side report, so you can pick the right model with confidence and iterate quickly. The session wraps up with an AI Residency Q&A.
Add the DDS Google Calendar link so that you don’t miss any events.
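
If you want a head start, here is a minimal sketch of the kind of Exact Match and token-level F1 scoring the workshop covers. The SQuAD-style normalization (lowercasing, stripping punctuation and articles) is a common convention and an assumption here, not necessarily the notebook’s exact code.

    import re
    import string
    from collections import Counter

    def normalize(text: str) -> str:
        # SQuAD-style normalization (assumed): lowercase, drop punctuation and articles.
        text = text.lower()
        text = "".join(ch for ch in text if ch not in string.punctuation)
        text = re.sub(r"\b(a|an|the)\b", " ", text)
        return " ".join(text.split())

    def exact_match(prediction: str, reference: str) -> float:
        # 1.0 if the normalized strings match exactly, else 0.0.
        return float(normalize(prediction) == normalize(reference))

    def token_f1(prediction: str, reference: str) -> float:
        # Harmonic mean of token-level precision and recall.
        pred_tokens = normalize(prediction).split()
        ref_tokens = normalize(reference).split()
        common = Counter(pred_tokens) & Counter(ref_tokens)
        overlap = sum(common.values())
        if overlap == 0:
            return 0.0
        precision = overlap / len(pred_tokens)
        recall = overlap / len(ref_tokens)
        return 2 * precision * recall / (precision + recall)

For example, exact_match("Paris.", "paris") returns 1.0, while token_f1("the city of Paris", "Paris, France") returns 0.4, giving partial credit for the overlapping token.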
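
The LLM-as-Judge side can be as simple as one rubric prompt that returns structured scores for the four criteria. The prompt wording, the 1–5 scale, and the call_model helper below are illustrative assumptions; any client that runs a judge model at temperature=0 would work.

    import json

    JUDGE_PROMPT = """You are grading a Q&A answer. Score each criterion from 1 (poor) to 5 (excellent).

    Question: {question}
    Reference context: {context}
    Candidate answer: {answer}

    Return JSON only, e.g. {{"correctness": 4, "faithfulness": 5, "relevance": 4, "conciseness": 3}}."""

    def judge_answer(question: str, context: str, answer: str, call_model) -> dict:
        # call_model(prompt) -> str is a hypothetical wrapper around your LLM client.
        prompt = JUDGE_PROMPT.format(question=question, context=context, answer=answer)
        raw = call_model(prompt)  # run the judge at temperature=0 for repeatable scores
        return json.loads(raw)    # e.g. {"correctness": 4, "faithfulness": 5, ...}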
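
Fair comparison then comes down to a small harness that feeds every model the identical prompt and settings and tabulates the scores side by side. This sketch reuses exact_match and token_f1 from above; call_model(model, prompt) is again a hypothetical wrapper that must pin temperature=0 for every model.

    def compare_models(dataset, models, call_model):
        # dataset: list of {"question", "context", "reference"} dicts.
        rows = []
        for model in models:
            em_total, f1_total = 0.0, 0.0
            for item in dataset:
                # Identical prompt and context for every model keeps the comparison fair.
                prompt = f"Context: {item['context']}\nQuestion: {item['question']}\nAnswer:"
                answer = call_model(model, prompt)
                em_total += exact_match(answer, item["reference"])
                f1_total += token_f1(answer, item["reference"])
            rows.append((model, em_total / len(dataset), f1_total / len(dataset)))
        # Simple side-by-side report.
        print(f"{'model':<20}{'EM':>8}{'F1':>8}")
        for model, em, f1 in rows:
            print(f"{model:<20}{em:>8.2f}{f1:>8.2f}")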
AI Unlocked: Trends, Talent & Opportunities
Ready to dive into the future of AI?
Join us for a high-energy session exploring the latest breakthroughs, real-world use cases, and how YOU can ride the AI wave—whether you’re a student, pro, or just curious.
Date: Tuesday, July 22
👉 Register here: https://nas.io/artificialintelligence/events/evaluating-rag-systems-for-accuracy-trust-and-impact
✨ What’s Inside:
🔹 Hottest trends shaping AI today
🔹 How industries are using AI to win
🔹 Behind the scenes of our exclusive AI Residency Program
🔹 How to join the AI Challenge (and win big!)
🔹 Live Q&A to get your questions answered
This is your gateway to becoming part of the next-gen AI revolution.
Don’t miss out!

Every Tuesday until October 28, 2025: Running Evaluation Metrics With Different LLMs + Q/A