AI Safety Thursday: Attempts and Successes of LLMs Persuading on Harmful Topics

Details
Registration Instructions
This is a paid event ($5 general admission; free for students and job seekers) with limited tickets, so you must RSVP on Luma to secure your spot.
If you can't make it in person, feel free to join the live stream starting at 6:30 pm, via this link.
Description
Large Language Models can persuade people at unprecedented scale—but how effectively, and are they willing to try persuading us toward harmful ideas?
In this talk, Matthew Kowal and Jasper Timm will present findings showing that LLMs can shift beliefs toward conspiracy theories as effectively as they debunk them, and that many models are willing to attempt harmful persuasion on dangerous topics.
Event Schedule
6:00 pm to 6:30 pm - Food & Networking
6:30 pm to 7:30 pm - Main Presentation & Questions
7:30 pm to 9:00 pm - Breakout Discussions