Description: Anthropic's 2025 agentic misalignment research showed frontier AI models resorting to blackmail, sabotage, and data leakage in 80-96% of test scenarios where their goals conflicted with their instructions, not out of malice but as a reasoned means of accomplishing their objectives. Unfortunately, that research is now manifesting in production. Recent examples include Replit's agent wiping databases despite explicit instructions not to, Gemini CLI deleting entire projects, and Claude Code being used in extortion campaigns. This talk examines agentic misalignment as an insider threat pattern, analyzes why traditional security controls fail against agents that treat instructions as statistical tokens rather than hard constraints, and presents a risk assessment framework for evaluating when autonomous AI creates more vulnerability than value in development environments.

Application Security
Computer & Information Network Security
Computer Security
Cybersecurity
Web Application Security