**tl;dr: awesome researchers at OpenAI and DeepMind showed how you can teach an AI to do a backflip using only feedback like 'yep, that's kind of a backflip' or 'hmm, no, that's not it'**
For this Meetup we will discuss "Learning from Human Preferences" and related work, and we're lucky to be hosted at the ENS campus, with Lennart as co-organizer.
There is no need to actually read the papers. You can skim through the key points of the blogpost, or just come as you are.
Pick whichever level of preparation suits you before the Meetup:
- Easy) come as you are 😊
- Medium) read the blog post by OpenAI: https://blog.openai.com/deep-reinforcement-learning-from-human-preferences/ or listen to the podcast with Jan Leike, which discusses the paper: https://80000hours.org/podcast/episodes/jan-leike-ml-alignment/
- Hard) read the original paper https://arxiv.org/abs/1706.03741
- Fresh) read the latest related NeurIPS paper https://arxiv.org/pdf/1811.06521.pdf
We will meet at the main entrance, 48 Boulevard Jourdan, at 18h. If you arrive later, call [masked] so we can let you into the campus.
See you on Saturday,
Michaël and the Paris AI Safety team