
Details

This is a ticketed event. Please register at this link.

In his talk, Samuel Simko from ETH Zurich will present his recent work on adversarial defenses for LLMs, developed with the Jinesis Lab (University of Toronto). The talk will cover a series of approaches, ranging from triplet-based contrastive learning defenses to honeypot-style defenses designed to avoid worst-case behavior. He will also discuss patterns observed in contest-winning manual jailbreaking prompts, ideas for tamper-resistant safeguards, and the current limits of attacks, defenses, and evaluation methodologies.

Event Schedule
6:00 to 6:30 pm - Food and introductions
6:30 to 7:30 pm - Presentation and Q&A
7:30 to 9:00 pm - Open Discussions

If you can't attend in person, join our live stream starting at 6:30 pm via this link.
