Deep RL: Trust Region Policy Optimisation

Canberra Deep Learning Meetup

C&Ma House

22 Napier Cl · Deakin

How to find us

We are on the upper level (Level 1) with the Trellis Data sign

What we'll do

As many of you will have experienced, if the update step is too big, the model does not learn effectively. Trust Region Policy Optimisation addresses the question of how large a step can be taken while the gradient remains trustworthy in the neighbourhood of the current parameter values.
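
For those who want a preview before the session, the idea can be written as a constrained optimisation problem. This is a rough sketch of the form used in the paper linked below, not a full derivation. The symbols are assumptions of notation: \theta_{\text{old}} is the current policy parameters, A is an advantage estimate under the current policy, and \delta is the trust region size chosen by the user.

```latex
\max_{\theta} \;
  \mathbb{E}_{s,a \sim \pi_{\theta_{\text{old}}}}
  \left[
    \frac{\pi_{\theta}(a \mid s)}{\pi_{\theta_{\text{old}}}(a \mid s)}
    \, A_{\theta_{\text{old}}}(s, a)
  \right]
\quad \text{subject to} \quad
  \mathbb{E}_{s \sim \pi_{\theta_{\text{old}}}}
  \left[
    D_{\mathrm{KL}}\!\left(
      \pi_{\theta_{\text{old}}}(\cdot \mid s) \,\|\, \pi_{\theta}(\cdot \mid s)
    \right)
  \right] \le \delta
```

Roughly speaking, the objective rewards moving towards actions with higher advantage, while the KL constraint keeps the new policy close enough to the old one that the estimate can still be trusted.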

Paper:
https://arxiv.org/abs/1502.05477

TRPO (Trust Region Policy Optimisation):
Part 1
https://medium.com/@jonathan_hui/rl-trust-region-policy-optimization-trpo-explained-a6ee04eeeee9

Part 2
https://medium.com/@jonathan_hui/rl-trust-region-policy-optimization-trpo-part-2-f51e3b2e373a

The math behind TRPO, PPO, and Natural Gradient method:
https://medium.com/@jonathan_hui/rl-the-math-behind-trpo-ppo-d12f6c745f33

Recommended to bring:
Laptop (for following the discussion about the paper)

This event is sponsored by Trellis Data (http://www.trellisdata.com.au).