Deep RL: Trust Region Policy Optimisation

Canberra Deep Learning Meetup

C&Ma House

22 Napier Cl · Deakin

How to find us

We are on the upper level (Level 1) with the Trellis Data sign


What we'll do

As many of you have experienced, if the update step is too big, the model does not learn effectively. Trust Region Policy Optimisation addresses the question of how large a neighbourhood around the current parameter values the gradient can be trusted in, and limits each update to stay inside that region.
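
For anyone who wants a feel for the idea before the session, here is a minimal sketch in plain Python. It is not the algorithm from the paper: it only illustrates the "shrink the step until the new policy stays close to the old one" part, using a KL-divergence limit on a toy softmax policy. The function names, the `max_kl` value, and the example update direction are all assumptions made up for illustration. Full TRPO also chooses the direction with a natural-gradient computation and checks that the surrogate objective improves.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    # KL(p || q) for two discrete distributions over the same actions.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def trust_region_step(logits, direction, max_kl=0.01, step=1.0,
                      shrink=0.5, max_backtracks=20):
    # Hypothetical helper: backtrack along `direction` until the KL between
    # the old and the candidate policy is within the trust region `max_kl`.
    old_policy = softmax(logits)
    for _ in range(max_backtracks):
        candidate = [l + step * d for l, d in zip(logits, direction)]
        kl = kl_divergence(old_policy, softmax(candidate))
        if kl <= max_kl:
            return candidate, step, kl
        step *= shrink  # step too big: the policy moved too far, so shrink it
    # No acceptable step found: keep the old parameters unchanged.
    return list(logits), 0.0, 0.0

# Toy example: a uniform policy over three actions and an aggressive update.
logits = [0.0, 0.0, 0.0]
direction = [5.0, -2.0, -3.0]  # made-up update direction
new_logits, step, kl = trust_region_step(logits, direction)
print("accepted step size:", step, "KL:", kl)
```

The full step of size 1.0 would change the policy drastically, so the loop keeps halving it until the change in the policy itself (not just in the parameters) is small enough.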


TRPO (Trust Region Policy Optimisation):
Part 1

Part 2

The math behind TRPO, PPO, and Natural Gradient method:

Recommended to bring:
Laptop (for following the discussion about the paper)

This event is sponsored by Trellis Data (