About us
Join us for a variety of events on technical AI safety, governance in a world of advanced AI, and more.
Hosted by Trajectory Labs, a nonprofit coworking and events space catalyzing Toronto's role in steering AI progress toward a future of human flourishing.
Is there a topic you'd love to see us cover at a future event? Submit your suggestion here.
Upcoming events
Eliciting Harmful Capabilities by Fine-Tuning on Safeguarded Outputs
Industrious Office 12th Floor Common Area, 30 Adelaide East, 12th Floor, Toronto, ON, CA
This is a ticketed event. Please register at this link.
In this talk, Talha Paracha will present insights from his latest research on using language models to improve software security ("Hallucinating Certificates", to appear at ICSE 2026).
Certificate validation is a crucial step in Transport Layer Security (TLS), the de facto standard network security protocol. Prior research has shown that differentially testing TLS implementations with synthetic certificates can reveal critical security issues, such as accidentally accepting untrusted certificates.
Paracha et al. introduce MLCerts, a new approach that leverages generative language models to generate synthetic certificates and thereby test software implementations more extensively. These models have recently become (in)famous for their applications in generating content, writing code, and conversing with users, as well as for "hallucinating" syntactically correct yet semantically nonsensical output. The authors leverage two novel insights in their work: (a) TLS certificates can be expressed in natural-like language, namely via the X.509 standard's human-readable textual representation, and (b) differential testing can benefit from hallucinated, malformed test cases. MLCerts finds significantly more distinct discrepancies between the five TLS implementations OpenSSL, LibreSSL, GnuTLS, MbedTLS, and MatrixSSL than the state-of-the-art benchmark Transcert.
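To make the differential-testing idea concrete ahead of the talk, here is a minimal, hypothetical Python sketch: it feeds each synthetic certificate to several validators and flags any disagreement as a discrepancy. The validator commands, the ca.pem trust anchor, and the synthetic_certs/ directory are illustrative assumptions, not the MLCerts harness itself.

```python
# Minimal sketch of differential testing of certificate validation.
# Assumes candidate certificates already exist as PEM files in
# synthetic_certs/ and that a trusted CA is stored in ca.pem.
import subprocess
from pathlib import Path

VALIDATORS = {
    # Each entry maps an implementation to a command that exits 0
    # iff the certificate chains to the trusted CA in ca.pem.
    "openssl": ["openssl", "verify", "-CAfile", "ca.pem"],
    "gnutls": ["certtool", "--verify",
               "--load-ca-certificate", "ca.pem", "--infile"],
}

def verdicts(cert_path: str) -> dict[str, bool]:
    """Run every validator on one certificate; record accept/reject."""
    results = {}
    for name, cmd in VALIDATORS.items():
        proc = subprocess.run(cmd + [cert_path], capture_output=True)
        results[name] = proc.returncode == 0
    return results

for cert in Path("synthetic_certs").glob("*.pem"):
    v = verdicts(str(cert))
    if len(set(v.values())) > 1:  # implementations disagree
        print(f"discrepancy on {cert.name}: {v}")
```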
Event Schedule
6:00 to 6:30 - Food and introductions
6:30 to 7:30 - Presentation and Q&A
7:30 to 9:00 - Open Discussions
If you can't make it in person, feel free to join the live stream starting at 6:30 pm, via this link.
9 attendees
Testing LLM Cooperation in Multi-Agent Simulation
Industrious Office 12th Floor Common Area, 30 Adelaide East, 12th Floor, Toronto, ON, CA
This is a ticketed event. Please register at this link.
Ryan Faulkner explores various papers that address cooperation and safety in multi-agent LLM simulations. Some of the core topics will include:
- Moral behaviour of agents in high-stakes, zero-sum, and morally charged social dilemmas
- Governance and sanctioning dynamics, and the ways LLM agents often fail to cooperate and instead free-ride in common-pool resource games (a toy version is sketched below)
- Mechanism design interventions, such as mediation, contracts, and elected leadership, that can steer agents toward safer outcomes
This research also reveals that LLMs adapt their behaviour based on awareness of their conversational partner's identity.
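As a rough illustration of the common-pool resource dynamic mentioned above, here is a toy Python simulation with fixed-policy stand-ins for LLM agents (the papers under discussion query actual models each round); the stock, regeneration, and harvest parameters are arbitrary assumptions.

```python
# Toy common-pool resource game: three cooperators and one free-rider
# share a regenerating stock. Free-riding can collapse the commons.
import random

STOCK, REGEN, ROUNDS, SUSTAINABLE = 100.0, 1.15, 20, 5.0

def cooperator(stock):   # harvests only a sustainable share
    return min(SUSTAINABLE, stock)

def free_rider(stock):   # harvests three times the sustainable share
    return min(3 * SUSTAINABLE, stock)

agents = [cooperator, cooperator, cooperator, free_rider]
stock = STOCK
for t in range(ROUNDS):
    for agent in random.sample(agents, len(agents)):  # random move order
        stock -= agent(stock)
    stock = min(STOCK, stock * REGEN)  # regrowth, capped at capacity
    print(f"round {t + 1:2d}: stock = {stock:6.1f}")
    if stock <= 0:
        print("resource collapsed: free-riding depleted the commons")
        break
```

With four cooperators the stock stabilizes; with one free-rider it typically collapses within a handful of rounds, which is the free-riding failure mode the sanctioning papers study.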
Event Schedule
6:00 to 6:30 - Food and introductions
6:30 to 7:30 - Presentation and Q&A
7:30 to 9:00 - Open Discussions
If you can't make it in person, feel free to join the live stream starting at 6:30 pm, via this link.
61 attendees
The Lytic Threshold
Industrious Office 12th Floor Common Area, 30 Adelaide East, 12th Floor, Toronto, ON, CA
This is a ticketed event. Please register at this link.
In this talk, Sheikh Abdur Raheem Ali:
- Discusses rare cases where deployed LLMs have been observed engaging in self-directed behaviour that could have led to catastrophic outcomes, had the agent attempting to escape containment been equipped with stronger capabilities.
- Introduces arbitrium, a peptide-based communication system used by certain bacteriophages, which release small signalling molecules known as autoinducers to coordinate population-level decisions, in particular the choice between the lytic and lysogenic pathways, via a quorum-sensing mechanism.
- Explores landmark results from alignment science that inform our current understanding of LLM biology (fewer than 250 malicious documents can suffice to poison a training dataset, compared with the fewer than 100 viral particles needed to produce an infection in humans).
- Analyzes early findings from experiments investigating scalable improvements to defenses that monitor the internal activations of production transformer models, such as probes (a minimal sketch follows this list), and demonstrates how these enable targeted interventions on known distributions that are harder to achieve with input/output-only methods, such as prompted classifiers, in certain settings, such as long context windows or latent-reasoning models.
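For intuition about activation probes, here is a minimal, hypothetical sketch that fits a linear probe on synthetic stand-ins for cached residual-stream activations; a real monitor would hook a live model's hidden states rather than Gaussian data, and the dimensions and labels here are assumptions.

```python
# Minimal sketch of a linear probe over hidden activations, assuming
# we already have cached activations X (n_samples x d_model) labelled
# benign (0) or harmful (1).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
d_model = 512
# Synthetic stand-in for cached residual-stream activations.
X_benign = rng.normal(0.0, 1.0, size=(500, d_model))
X_harmful = rng.normal(0.3, 1.0, size=(500, d_model))  # shifted mean
X = np.vstack([X_benign, X_harmful])
y = np.array([0] * 500 + [1] * 500)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"held-out accuracy: {probe.score(X_te, y_te):.2f}")

# At inference time, the probe's score on each token's activation can
# gate a targeted intervention (e.g., refuse or reroute) without
# re-reading the full context, unlike input/output-only classifiers.
```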
Event Schedule
6:00 to 6:30 - Food and introductions
6:30 to 7:30 - Presentation and Q&A
7:30 to 9:00 - Open Discussions
If you can't make it in person, feel free to join the live stream starting at 6:30 pm, via this link.
11 attendees
Past events