Deep RL: Trust Region Policy Optimisation

Hosted By
Cheng Y. and 2 others

Details

As many of you will have experienced, a model does not learn effectively if the update step is too large. Trust Region Policy Optimisation (TRPO) addresses the question of how large a neighbourhood around the current parameter values the gradient can be trusted in, and keeps each update inside that region.
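
For anyone who wants something concrete to play with before the session, here is a minimal sketch of a trust-region-style update on a toy problem. It is not the algorithm from the paper: TRPO computes a natural-gradient direction with conjugate gradient, whereas this sketch keeps only the backtracking step that shrinks an update until the KL divergence between the old and new policy is within a bound. The three-armed bandit, the function names, and the values `max_kl=0.01` and the initial step of 50 are all illustrative assumptions.

```python
import math


def softmax(logits):
    """Turn a list of logits into action probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q):
    """KL divergence KL(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def trust_region_step(logits, grad, max_kl=0.01, step=1.0, shrink=0.5, max_tries=20):
    """Backtrack along `grad` until KL(old policy || new policy) <= max_kl.

    Returns the accepted logits and the step size that was used.
    If no step is accepted, the logits are returned unchanged with step 0.0.
    """
    old = softmax(logits)
    for _ in range(max_tries):
        candidate = [l + step * g for l, g in zip(logits, grad)]
        if kl(old, softmax(candidate)) <= max_kl:
            return candidate, step
        step *= shrink
    return list(logits), 0.0


# Toy bandit (assumed for illustration): only action 0 pays a reward.
rewards = [1.0, 0.0, 0.0]
logits = [0.0, 0.0, 0.0]
probs = softmax(logits)

# Gradient of the expected reward with respect to each logit:
# p_k * (r_k - expected reward).
expected = sum(p * r for p, r in zip(probs, rewards))
grad = [p * (r - expected) for p, r in zip(probs, rewards)]

# Start from a deliberately oversized step and let the KL bound shrink it.
max_kl = 0.01
new_logits, used_step = trust_region_step(logits, grad, max_kl=max_kl, step=50.0)
new_probs = softmax(new_logits)

print("accepted step size:", used_step)
print("old policy:", probs)
print("new policy:", new_probs)
print("KL(old || new):", kl(probs, new_probs))
```

Raising `max_kl` lets the policy move further per update, and lowering it forces smaller, safer steps. That trade-off is the "trust region" the discussion will focus on.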

Paper:
https://arxiv.org/abs/1502.05477

TRPO (Trust Region Policy Optimisation):
Part 1
https://medium.com/@jonathan_hui/rl-trust-region-policy-optimization-trpo-explained-a6ee04eeeee9

Part 2
https://medium.com/@jonathan_hui/rl-trust-region-policy-optimization-trpo-part-2-f51e3b2e373a

The math behind TRPO, PPO, and Natural Gradient method:
https://medium.com/@jonathan_hui/rl-the-math-behind-trpo-ppo-d12f6c745f33

Recommended to bring:
Laptop (for following the discussion about the paper)

This event is sponsored by Trellis Data (http://www.trellisdata.com.au).

Canberra Deep Learning Meetup
C&Ma House
22 Napier Cl · Deakin, AC