This month's presenter is Jacob Bruce, a Canadian robotics student at QUT, who will be taking us through the historical and present-day successes of Deep Reinforcement Learning, and will provide tips on how to determine whether deep RL is the right approach for the task at hand.
Technical rating: 🌶️🌶️🌶️
Jacob tells us that his talk won't shy away from implementation-level detail, so expect a high level of technicality there (e.g., keeping your gradients numerically stable by never taking the log of an exponential and never letting a variance go near zero).
However, the only equations that will appear in this talk are the ones that end up becoming code. Rest assured, the focus will be on content that helps you build, debug, tune, and improve a working AI system.
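To give a flavour of the numerical-stability tips mentioned above, here is a minimal sketch of our own; it is not taken from Jacob's material. It assumes a softmax output over discrete actions and a Gaussian output with a learned scale. The names `log_softmax`, `gaussian_log_prob`, `raw_scale` and `min_std` are hypothetical, chosen only for illustration.

```python
import math

def log_softmax(logits):
    """Log-probabilities from raw scores, never evaluating log(exp(x)) directly."""
    m = max(logits)
    # Subtracting the maximum keeps every exponent <= 0, so exp() cannot overflow.
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]

def gaussian_log_prob(x, mean, raw_scale, min_std=1e-3):
    """Log-density of x under N(mean, std^2), with std kept away from zero.

    raw_scale is an unconstrained number (e.g. a network output); min_std is
    an assumed floor, not a recommended value.
    """
    # Overflow-safe softplus: log(1 + exp(r)) = max(r, 0) + log1p(exp(-|r|)).
    std = max(raw_scale, 0.0) + math.log1p(math.exp(-abs(raw_scale))) + min_std
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)
```

A naive `math.log(math.exp(1000.0))` raises an `OverflowError`, whereas `log_softmax([1000.0, 1000.0])` evaluates without trouble.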
6:00pm: Mingling, networking and nibbles
6:30pm: Deep Reinforcement Learning: Past, Present and Future
7:30pm: Group Question and Answer Session
8:00pm: Mingling continued
Title: Deep Reinforcement Learning: Past, Present and Future
Deep reinforcement learning is the latest hot topic in general-purpose AI. But deep nets and RL have both been around for decades and are actually quite straightforward; it's only the massive computational power that is new.
In this talk, Jacob will describe some of the historical and recent successes of RL, the main paradigms of RL and why they work, and whether deep RL is the right tool for your problem. He will demonstrate a complete deep RL agent in about a dozen lines of Python, and anyone with a baseline understanding of gradient descent should come away from the talk able to implement their own.
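For readers who want a feel for how small such an agent can be, the following is a rough sketch of a policy-gradient (REINFORCE) learner on a toy multi-armed bandit. It is our own illustration, not the agent Jacob will present, and it is not deep: the policy is a plain list of logits rather than a neural network. The function names and the `payouts` argument are hypothetical.

```python
import math
import random

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train_bandit(payouts, steps=500, lr=0.1, seed=0):
    """REINFORCE on a bandit; payouts[a] is the assumed fixed reward of arm a."""
    rng = random.Random(seed)
    logits = [0.0] * len(payouts)
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(len(payouts)), weights=probs)[0]
        reward = payouts[action]
        # d log pi(action) / d logit_i = (1 if i == action else 0) - probs[i]
        for i in range(len(logits)):
            grad_log_prob = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * reward * grad_log_prob
    return softmax(logits)

# Arm 1 pays out and arm 0 does not, so the policy should come to prefer arm 1.
final_probs = train_bandit([0.0, 1.0])
```

Replacing the list of logits with the output of a neural network that reads an observation is, roughly speaking, the step from this toy to a deep RL agent.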
Jacob Bruce is a Canadian robotics student at QUT, doing a PhD in robot learning for navigation. Before this, he completed an MSc in robotic vision, focusing on algorithms that let robots find tiny, distant people. With the recent explosion of deep learning, he has been working on using those techniques to learn navigation behaviour end-to-end with real robots, and has just finished an internship at Google's DeepMind along these lines.