
Last meeting we reviewed tabular solution methods and applied them to the pocket cube problem. These methods contrast with direct search methods, which compute an optimal path from scratch for a given starting state; tabular methods instead solve for a single global solution that can be used from any state.
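
To make the contrast concrete, here is a minimal sketch of a tabular method (one-step Q-learning). The environment interface (reset/step/actions) and the hyperparameters are illustrative assumptions, not code from the book or from the group's solutions:

```python
# Minimal tabular Q-learning sketch; the env interface is an assumption.
import random
from collections import defaultdict

def q_learning(env, episodes=1000, alpha=0.1, gamma=0.99, epsilon=0.1):
    # One table entry per (state, action) pair -- feasible only when the
    # state space is small enough to enumerate, as in the pocket cube.
    Q = defaultdict(float)

    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            # Epsilon-greedy action selection over the current table.
            if random.random() < epsilon:
                action = random.choice(env.actions)
            else:
                action = max(env.actions, key=lambda a: Q[(state, a)])

            next_state, reward, done = env.step(action)

            # One-step Q-learning update toward the greedy bootstrap target.
            best_next = max(Q[(next_state, a)] for a in env.actions)
            target = reward + (0.0 if done else gamma * best_next)
            Q[(state, action)] += alpha * (target - Q[(state, action)])
            state = next_state

    return Q
```

Once trained, the table itself is the global solution: acting greedily with respect to Q from any state recovers a good path, with no per-query search.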

When the state space is too large for a table, we must rely on approximation techniques, which combine function approximation with gradient-based updates to approach the ideal exact tabular solution. Throughout the book we covered many types of approximation techniques, starting with value function-based methods and eventually introducing policy gradient methods. We will go over the concepts and implementation details for most of these methods and compare their performance on some example problems. If there's time, we'll begin applying the most appropriate technique to the pocket cube problem and see how much complexity is required to solve it.
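
As a flavor of the value function-based side, here is a minimal sketch of semi-gradient TD(0) with a linear value function, following the general form of the updates in Sutton & Barto. The feature map, the policy, and the environment interface are assumptions for illustration:

```python
# Semi-gradient TD(0) with linear function approximation (sketch;
# features(state) -> np.ndarray and the env interface are assumed).
import numpy as np

def semi_gradient_td0(env, features, n_features, policy,
                      episodes=1000, alpha=0.01, gamma=0.99):
    # v(s) is approximated as w . x(s), so the gradient w.r.t. w is just x(s).
    w = np.zeros(n_features)

    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            action = policy(state)
            next_state, reward, done = env.step(action)

            x = features(state)
            v = w @ x
            v_next = 0.0 if done else w @ features(next_state)

            # "Semi-gradient": the bootstrap target is treated as a constant.
            td_error = reward + gamma * v_next - v
            w += alpha * td_error * x
            state = next_state

    return w
```

And for the policy gradient side, a minimal REINFORCE sketch with a softmax-linear policy, again under the same assumed interfaces:

```python
# REINFORCE (Monte Carlo policy gradient) sketch with a softmax-linear policy.
import numpy as np

def softmax(z):
    z = z - z.max()          # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum()

def reinforce(env, features, n_features, n_actions,
              episodes=1000, alpha=0.01, gamma=0.99):
    # theta[a] . x(s) is the preference for action a in state s.
    theta = np.zeros((n_actions, n_features))

    for _ in range(episodes):
        # Generate one full episode under the current policy.
        trajectory = []
        state, done = env.reset(), False
        while not done:
            x = features(state)
            probs = softmax(theta @ x)
            action = np.random.choice(n_actions, p=probs)
            next_state, reward, done = env.step(action)
            trajectory.append((x, action, reward, probs))
            state = next_state

        # Walk backward accumulating the return G, then step up the
        # log-likelihood of each taken action, weighted by G.
        G = 0.0
        for t, (x, action, reward, probs) in reversed(list(enumerate(trajectory))):
            G = reward + gamma * G
            grad_log = -np.outer(probs, x)   # d log pi / d theta, all actions
            grad_log[action] += x            # plus x for the action taken
            theta += alpha * (gamma ** t) * G * grad_log

    return theta
```

The contrast between the two sketches is the point: the first learns a value function and derives behavior from it, while the second adjusts the policy parameters directly from sampled returns.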

As usual, you can find links below to the textbook, previous chapter notes, slides, and recordings of some of the previous meetings.
Useful Links:
Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto
Recordings of Previous Meetings
Short RL Tutorials
My exercise solutions and chapter notes
Kickoff Slides which contain other links
Video lectures from a similar course