Name: AI Safety Thursdays: Agentic Misalignment: How LLMs could be insider threats
Start: 2025-07-24T18:00:00-04:00
End: 2025-07-24T21:00:00-04:00
Location: 30 Adelaide East, Industrious Office 12th Floor Common Area

Can AI agents misbehave while carrying out actions autonomously? At this event, [Giles Edkins](https://ca.linkedin.com/in/giles-edkins) will walk us through and critique research by Anthropic that demonstrates blackmail and other misaligned behaviour when an agent is threatened with shutdown or reprogramming.

**Event Schedule**
6:00 to 6:30 - Food & Networking
6:30 to 7:30 - Main Presentation & Questions
7:30 to 8:00 - Discussion

If you can't make it in person, feel free to join the live stream at 6:30 pm via [this link](https://www.youtube.com/@Trajectory-Labs/live).

Juliana Eberschlag

Mario Gibney

Toronto AI Safety

Technology

Risk Management

New Technology

Safety

Critical Thinking

Artificial Intelligence Applications

AI and Society

Mathematics

Artificial Intelligence Machine Learning Robotics

Artificial Intelligence

Machine Learning

Software Engineering

Machine Learning Interpretability

Deep Learning
