STACK Meetup - Building AI Responsibly Through Guardrails and Interpretability


Details
ANNOUNCEMENT – Registration via this link only.
- Seats are on a first-come, first-served basis.
- For admission into our event space, please register beforehand via FormSG only.
- You will receive a confirmation email upon completing our registration form.
About this Meetup
As AI systems grow more advanced, ensuring their safety and predictability becomes increasingly critical. This STACK Meetup explores how safety testing, guardrails, and mechanistic interpretability can reduce misinformation and bias. These approaches work together to ensure that AI functions safely and as intended, especially in high-stakes settings.
Get tips from GovTech’s AI Practice team on safeguarding LLM applications against safety risks. Our speaker will walk you through the Responsible AI journey: defining a customised safety risk taxonomy, evaluating safety risks, and implementing safeguards to mitigate them.
Also, hear from a researcher at the Singapore AI Safety Institute on mechanistic interpretability, an approach akin to a brain scan for AI systems. This field seeks to uncover the inner workings of AI systems to identify backdoors, misalignment and unintended behaviours. This understanding powers applications such as model editing, behaviour steering, and the design of more robust guardrails, helping ensure that AI operates predictably and can be audited effectively.
Join us today. Seats are limited, so sign up now!
Programme
7:00pm: Introduction
By STACK Community by GovTech
7:05pm: Introduction to Lorong AI
By Lorong AI
7:10pm: Safeguarding LLM Applications with Testing and Guardrails
By Goh Jia Yi, AI Engineer (Responsible AI), AI Practice, GovTech
7:45pm: Mechanistic Interpretability: Understanding Models From the Inside Out
By Clement Neo, Research Engineer, Singapore AI Safety Institute, and Lab Advisor, Apart Research
8:15pm: Q&A
8:30pm: End of STACK Meetup
Click here* to sign up!
*Registration will be accepted via FormSG only.