Discussion - Topic: LLM Guardrails


Details
This week's topic: LLM Guardrails
As described in Thoughtworks Technology Radar Vol. #31.
LLM Guardrails is a set of guidelines, policies or filters designed to prevent large language models (LLMs) from generating harmful, misleading or irrelevant content. The guardrails can also be used to safeguard LLM applications from malicious users attempting to misuse the system with techniques like input manipulation. They act as a safety net by setting boundaries for the model to process and generate content. There are some emerging frameworks in this space like NeMo Guardrails, Guardrails AI and Aporia Guardrails our teams have been finding useful. We recommend every LLM application have guardrails in place and that its rules and policies be continuously improved. Guardrails are crucial for building responsible and trustworthy LLM chat apps.
Zoom link will be added about 5 min before the event starts.
Discussion Resources :
How Good Are the LLM Guardrails on the Market? A Comparative Study on the Effectiveness of LLM Content Filtering Across Major GenAI Platforms By Yongzhe Huang, Nick Bray, Akshata Rao, Yang Ji, Wenjun Hu
https://unit42.paloaltonetworks.com/comparing-llm-guardrails-across-genai-platforms/
New Study Exposes Strengths and Gap of Cloud-Based LLM GuardrailsBy Mandvi
https://cyberpress.org/new-study-exposes-strengths-and-gap/
New Research Uncovers Strengths and Vulnerabilities in Cloud-Based LLM Guardrails By Aman Mishra
https://gbhackers.com/new-research-uncovers-strengths-and-vulnerabilities/
LLM Guardrails for Data Leakage, Prompt Injection, and More By Jeffrey Ip
https://www.confident-ai.com/blog/llm-guardrails-the-ultimate-guide-to-safeguard-llm-systems

Discussion - Topic: LLM Guardrails