Evaluating the Performance of Large Language Models


Details
In the rapidly evolving field of Natural Language Processing (NLP), Large Language Models (LLMs) have demonstrated remarkable capabilities in generating human-like text and answering questions. However, evaluating the performance of these models and ensuring their effective deployment in production environments pose significant challenges. This talk will delve into the intricacies of LLM evaluation, focusing on key LLM-based metrics for assessing the truthfulness and quality of generated text. We will explore various evaluation techniques, including G-Eval, SelfCheckGPT, and QAG scores. Additionally, we will address the pitfalls of statistical scorers such as BLEU, ROUGE, and METEOR and explain why, in most cases, they fall short of model-based scorers.
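To give a flavor of the kind of pitfall the talk will cover, here is a minimal sketch (not taken from the talk itself) contrasting a statistical n-gram scorer with a simple model-based check. It assumes the `nltk` and `sentence-transformers` packages are available; the example sentences and the "all-MiniLM-L6-v2" model are illustrative choices, not material from the speaker.

```python
# Sketch: why n-gram scorers like BLEU can underrate a faithful paraphrase.
# Assumes nltk and sentence-transformers are installed; sentences are made up.

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from sentence_transformers import SentenceTransformer, util

reference = "The company's revenue grew 12% year over year."
candidate = "Year-over-year, revenues at the firm rose by twelve percent."

# Statistical scorer: BLEU rewards exact n-gram overlap, so a paraphrase
# with little word overlap scores near zero despite equivalent meaning.
bleu = sentence_bleu(
    [reference.split()], candidate.split(),
    smoothing_function=SmoothingFunction().method1,
)
print(f"BLEU: {bleu:.3f}")

# Model-based stand-in: embedding similarity captures that the two
# sentences say essentially the same thing.
model = SentenceTransformer("all-MiniLM-L6-v2")
emb_ref, emb_cand = model.encode([reference, candidate])
print(f"Cosine similarity: {util.cos_sim(emb_ref, emb_cand).item():.3f}")
```

Techniques such as G-Eval, SelfCheckGPT, and QAG go further than plain embedding similarity, using an LLM itself as the judge, which is the focus of the session.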
Our Speaker:
Paul Arsenovic serves as the Associate Director of Data Science within the CIQ Solutions Data Science Group at S&P Global Market Intelligence. He specializes in developing and deploying AI-powered applications tailored for the CIQ Desktop platform. With a robust skill set in Python and expertise in building NLP applications, Paul excels at creating data-linking solutions and managing large-scale document processing. His proficiency extends to implementing automated infrastructure monitoring and signal processing, ensuring that the applications he oversees are both innovative and reliable. In his spare time, you may see him gliding through the water on his hydrofoil or kiteboard on the Chesapeake Bay.