[Guest] Domenic Rosati - actually defending against harmful finetuning

Hosted By
Giles and Mario G.

Details

Getting here: Enter the lobby at 100 University Ave (right next to St Andrew subway station), and message Giles Edkins on the meetup app or call him on 647-823-4865 to be let up to room 6H.

We welcome back Domenic Rosati. Last time we heard about the problem of harmful finetuning and a specification for what it might mean for a model to be resistant to it. Now Domenic's back with a new paper explaining how to actually defend against these attacks. This should be good news for anyone wanting to release a language model openly while locking down harmful capabilities.

May be somewhat technical!

We welcome a variety of backgrounds, opinions and experience levels.

Toronto AI Safety

Every week on Thursday

100 University Ave · Toronto, ON
FREE