Human-level control through deep reinforcement learning


Details
The Paper:
https://web.stanford.edu/class/psych209/Readings/MnihEtAlHassibis15NatureControlDeepRL.pdf
The Hook:
If you’re someone like me who regularly attends the local Papers We Love meetup, then you’ve probably heard about some of the latest breakthroughs in reinforcement learning, such as:
Where a bipedal agent learns not only to walk using a simple reward function, but also to navigate obstacles:
https://arxiv.org/abs/1707.02286
https://www.youtube.com/watch?v=hx_bgoTF7bs
Where an agent learns to play Go with no human teaching, simply by playing against itself:
https://deepmind.com/blog/alphago-zero-learning-scratch/
The Talk:
These breakthroughs are the result of advances in deep reinforcement learning (deep RL), and one of the seminal papers on the subject is Mnih et al. (2015). The paper describes work done at Google DeepMind to learn to play various Atari games using only the pixels on the screen as input and the game score as the reward signal.
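To give a concrete sense of the model involved, here is a minimal PyTorch sketch of the Q-network architecture the paper describes (PyTorch is my choice for illustration, not what the authors used): three convolutional layers reading a stack of four 84x84 grayscale frames, a fully connected layer, and one output per action.

    import torch
    import torch.nn as nn

    class DQN(nn.Module):
        """Sketch of the Q-network from Mnih et al. (2015)."""
        def __init__(self, n_actions):
            super().__init__()
            self.net = nn.Sequential(
                # Input: a stack of 4 grayscale 84x84 frames.
                nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
                nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
                nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
                nn.Flatten(),
                nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
                # One output per action: the estimated Q-value of taking
                # that action from the current state.
                nn.Linear(512, n_actions),
            )

        def forward(self, x):
            return self.net(x)

    # Example: Q-values for a batch of one state in a game with 4 actions.
    q_values = DQN(n_actions=4)(torch.zeros(1, 4, 84, 84))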
Before we dive into all the shiny new gadgetry, we’ll look at some of the fundamental math and history of RL, including the Bellman equation and classic algorithms such as policy iteration, value iteration, and Q-learning.
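For a small preview of that math, the Bellman optimality equation for the action-value function can be written in standard notation (s is the current state, a the action taken, r the reward, s' the next state, and \gamma the discount factor):

    Q^*(s, a) = \mathbb{E}\big[\, r + \gamma \max_{a'} Q^*(s', a') \mid s, a \,\big]

Tabular Q-learning nudges an estimate Q toward that target after each transition, with learning rate \alpha:

    Q(s, a) \leftarrow Q(s, a) + \alpha \big[\, r + \gamma \max_{a'} Q(s', a') - Q(s, a) \,\big]

The DQN paper’s key move is approximating Q with a deep network instead of a table.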
We will also be talking about the OpenAI Gym framework, a Python library that bundles many standard RL environments so you can easily benchmark your algorithms against a variety of problems.
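As a taste of what that looks like, here is a minimal sketch of the Gym interaction loop with a random agent. It assumes the classic API, where env.reset returns an observation and env.step returns a four-tuple; newer releases split the done flag into terminated and truncated:

    import gym

    env = gym.make("CartPole-v0")
    for episode in range(5):
        observation = env.reset()
        total_reward, done = 0.0, False
        while not done:
            # A real agent would choose actions from a learned policy;
            # here we just sample uniformly at random.
            action = env.action_space.sample()
            observation, reward, done, info = env.step(action)
            total_reward += reward
        print("episode", episode, "reward:", total_reward)
    env.close()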
There will be food sponsored by Slalom Consulting.
