Name: Adversarial Defenses for LLMs
Start: 2026-03-26T18:00:00-04:00
End: 2026-03-26T21:00:00-04:00
Location: 30 Adelaide East, Industrious Office 12th Floor Common Area

This is a ticketed event. Please register at [this link](https://luma.com/e69uao09).

In his talk, Samuel Simko from ETH Zurich will present his recent work on adversarial defenses for LLMs, developed with the Jinesis Lab (University of Toronto). The talk will cover a series of approaches, ranging from triplet-based contrastive learning defenses to honeypot-style defenses designed to avoid worst-case behavior. He will also discuss patterns observed in contest-winning manual jailbreaking prompts, ideas for tamper-resistant safeguards, and the current limits of attacks, defenses, and evaluation methodologies.

**Event Schedule**

- 6:00 to 6:30 pm: Food and introductions
- 6:30 to 7:30 pm: Presentation and Q&A
- 7:30 to 9:00 pm: Open discussions

If you can't attend in person, join our live stream starting at 6:30 pm via [this link](https://www.youtube.com/@Trajectory-Labs/live).

Georgia Berg

Toronto AI Safety

Technology

Risk Management

New Technology

Safety

Critical Thinking

Artificial Intelligence Applications

AI and Society

Mathematics

Artificial Intelligence Machine Learning Robotics

Artificial Intelligence

Machine Learning

Software Engineering

Machine Learning Interpretability

Deep Learning
