AI QA Test Engineering: Testing AI Applications with the Power of AI

Hosted By
ShiftSync

Details

To participate, please complete your free registration here

### Can every AI output be trusted?

In this event, you'll explore three approaches to evaluating AI applications, each aimed at catching mistakes in generated output before they reach your users.

1. Testing AI Applications with DeepEval
DeepEval is an open-source evaluation framework designed for structured and automated testing of AI outputs. It allows you to define custom metrics, set expectations, and benchmark responses from LLMs. In this session, we'll explore how QA engineers and developers can use DeepEval to test the quality, accuracy, and reliability of AI-generated responses across different use cases like chatbots, summarization, and code generation.
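
To make the idea of "custom metrics" and "expectations" concrete, here is a minimal, library-free sketch of the pattern that an evaluation framework automates: a test case, a scoring function, and a pass threshold. It does not use DeepEval's API; `EvalCase`, `keyword_coverage`, and `run_eval` are hypothetical names chosen for illustration, and keyword matching is only a stand-in for a real quality metric.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class EvalCase:
    """One prompt/response pair to check (hypothetical, not a DeepEval class)."""
    prompt: str
    actual_output: str
    expected_keywords: List[str]


def keyword_coverage(case: EvalCase) -> float:
    """Custom metric: fraction of expected keywords found in the output."""
    if not case.expected_keywords:
        return 1.0
    text = case.actual_output.lower()
    hits = sum(1 for kw in case.expected_keywords if kw.lower() in text)
    return hits / len(case.expected_keywords)


def run_eval(
    cases: List[EvalCase],
    metric: Callable[[EvalCase], float],
    threshold: float,
) -> List[Tuple[str, float, bool]]:
    """Score every case and record whether it meets the threshold."""
    results = []
    for case in cases:
        score = metric(case)
        results.append((case.prompt, score, score >= threshold))
    return results


# Illustrative data only; in practice actual_output comes from your AI application.
cases = [
    EvalCase(
        "How do I reset my password?",
        "Open Settings and choose Reset Password.",
        ["settings", "reset"],
    ),
    EvalCase("Summarize the refund policy.", "We sell shoes.", ["refund", "days"]),
]

results = run_eval(cases, keyword_coverage, threshold=0.5)
for prompt, score, passed in results:
    print(f"{'PASS' if passed else 'FAIL'} {score:.2f} {prompt}")
```

Keeping the metric as a plain function makes it easy to swap in a stricter check later without touching the test cases.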

2. Testing AI Applications with LLM-as-a-Judge
LLM-as-a-Judge is a powerful technique where one AI model evaluates the outputs of another. Instead of relying solely on manual review or static metrics, we'll learn how to use trusted LLMs (like GPT-4) to provide qualitative assessments, grading correctness, coherence, tone, or factuality. This method enables scalable, human-like evaluation in real-time AI testing pipelines.
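
As a rough sketch of how such a judge could be wired into a test, the snippet below builds a grading prompt, passes it to a judge model, and validates the structured verdict. The prompt wording, the 1 to 5 scale, and the names `JUDGE_PROMPT`, `judge_answer`, and `fake_judge` are assumptions made for illustration. The judge is injected as a plain callable, so `fake_judge` stands in for a real model call here.

```python
import json
from typing import Callable, Dict

# Hypothetical grading prompt; the rubric and scale are assumptions.
JUDGE_PROMPT = """You are grading an AI assistant's answer.
Question: {question}
Answer: {answer}
Rate correctness and coherence from 1 to 5.
Reply with JSON only: {{"correctness": <int>, "coherence": <int>, "reason": "<short text>"}}"""


def judge_answer(question: str, answer: str, call_judge: Callable[[str], str]) -> Dict:
    """Ask the judge model for a verdict and validate its scores."""
    prompt = JUDGE_PROMPT.format(question=question, answer=answer)
    verdict = json.loads(call_judge(prompt))
    for key in ("correctness", "coherence"):
        score = verdict.get(key)
        if not isinstance(score, int) or not 1 <= score <= 5:
            raise ValueError(f"judge returned an invalid {key} score: {score!r}")
    return verdict


def fake_judge(prompt: str) -> str:
    """Stand-in for a real LLM call; always returns the same canned verdict."""
    return '{"correctness": 4, "coherence": 5, "reason": "Accurate but brief."}'


verdict = judge_answer("What is the capital of France?", "Paris.", fake_judge)
print(verdict["correctness"], verdict["coherence"], verdict["reason"])
```

Validating the verdict matters because the judge is itself an LLM and may return malformed or out-of-range scores.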

3. Evaluating LLMs with Hugging Face Evaluate
Hugging Face's evaluate library offers a robust suite of prebuilt metrics and tools to measure the performance of LLMs and NLP models. This topic will cover how to integrate and use evaluate in your testing workflows to assess text generation, classification, translation, and more, using standardized metrics like BLEU, ROUGE, and accuracy alongside custom metrics for GenAI applications.
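
To show what metrics of this kind measure, here is a simplified, from-scratch sketch of accuracy and a unigram-overlap F1 score, which is in the spirit of ROUGE-1. It is not the evaluate library's implementation, and its scores may differ from the library's because real implementations handle tokenization and normalization more carefully. The function names are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def accuracy(predictions: Sequence, references: Sequence) -> float:
    """Fraction of predictions that exactly match their reference label."""
    if len(predictions) != len(references) or not references:
        raise ValueError("predictions and references must be non-empty and equal in length")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


def unigram_f1(prediction: str, reference: str) -> float:
    """F1 over shared whitespace-separated words (a simplified ROUGE-1-style score)."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Classification-style check: 3 of 4 labels match.
labels_predicted: List[int] = [1, 0, 1, 1]
labels_expected: List[int] = [1, 0, 0, 1]
acc = accuracy(labels_predicted, labels_expected)

# Generation-style check: 5 of 6 words overlap in each direction.
f1 = unigram_f1("the cat sat on the mat", "the cat is on the mat")

print(f"accuracy={acc:.2f} unigram_f1={f1:.2f}")
```

Word-overlap scores reward surface similarity, which is why they are often paired with custom or model-based metrics for generative output.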

A Q&A session with Karthik K.K. will follow. Prepare your questions!

ShiftSync London – QA & Software Testing Community
FREE