Anomaly Detection and Eliciting Latent Knowledge
Details
Getting here: Enter the lobby at 100 University Ave (right next to St Andrew subway station), and message Giles Edkins on the meetup app or call him on 647-823-4865 to be let up to room 6H.
How do we tell when an AI system's true knowledge of a situation is not as it appears? It might be actively deceiving us (deceptive alignment), or it might be able to predict that some third-party adversary is messing with us. The "Eliciting Latent Knowledge" (ELK) research agenda seeks to define what we mean by this and to work out how such knowledge can be extracted. This presentation will cover ELK as applied to mechanistic anomaly detection.
This topic is somewhat theoretical and future-looking. It may be technical in places, but it shouldn't be too mathy.
We welcome a variety of backgrounds, opinions and experience levels.
Every week on Thursday