The Art & Science of LLM Reliability: Building Trustworthy AI Systems

Name: The Art & Science of LLM Reliability: Building Trustworthy AI Systems
Start: 2025-01-28T18:00:00-05:00
End: 2025-01-28T19:30:00-05:00

Hosted By

Daniel Z.

Details

LLMs are transforming industries, but their Achilles’ heel (hallucinations, inaccuracy, and bias) remains a barrier to trust. We’re thrilled to welcome Rush Shahani, CTO of [Persana.ai](https://persana.ai/) (YC W23) and author of the LLM Reliability book, to guide us through this essential topic. Rush will share insights from his work at Persana.ai, exploring how the team’s use of RAGAS, LLM-as-a-Judge, and RAG agents supports scalable and reliable AI applications.

Key Takeaways:

- Solving LLMs’ Weak Spots: Learn how to mitigate hallucinations and produce trustworthy outputs using RAG techniques and real-time data.
- Evaluation Done Right: Discover evaluation tools like RAGAS and learn what role metrics such as completeness and context adherence play in production-ready systems.
- Building Reliable AI: Learn actionable strategies for prompting, monitoring, and optimizing models to align with business needs.

Why Attend:
- Gain practical insights to tackle reliability challenges in real-world AI applications.
- Collaborate with fellow GenAI practitioners, decision-makers, and cloud enthusiasts.
- Leave with strategies you can implement immediately—and a special discount for Rush’s book, LLM Reliability.

📅 Agenda:
6:00 PM - Welcome and introductions
6:10 PM - Tackling the Achilles’ Heel of LLMs presentation
7:00 PM - Q&A and open discussion
7:20 PM - Wrap-up

📚 Homework - Prepare to Engage:
- Watch the ACM Tech Talk “Research to Reality: Building Production-Ready LLM Apps Users Can Trust”
- Explore the Manning Publications book LLM Reliability

Bring plenty of enthusiasm and questions—we’re looking forward to an engaging and fun-filled evening 🤓
