Reinforcement Learning: Chapter 4 Dynamic Programming


Details
Dynamic programming is a collection of techniques for solving the Bellman equations for value functions in reinforcement learning. In the last chapter, we introduced value functions and their associated recursive equations. In this chapter, we apply dynamic programming to compute solutions to these equations for any environment whose dynamics are completely known. Once we have these solutions, we can easily derive policies that perform optimally in any such environment. These solution methods are versions of generalized policy iteration, which alternate between evaluating the value function of the current policy and improving that policy with respect to the value function. The policy improvement theorem is the key result justifying how optimal policies are derived from value functions, and we prove that theorem in this chapter as well.
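As a rough illustration of generalized policy iteration, below is a minimal Python sketch of policy iteration on a small hypothetical finite MDP with fully known dynamics. The transition table `P`, the state/action counts, and the discount factor are made-up stand-ins for illustration, not taken from the textbook.

```python
import numpy as np

# Hypothetical 3-state, 2-action MDP with known dynamics (illustrative only).
# P[s][a] is a list of (probability, next_state, reward) triples.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 2, 2.0)]},
    2: {0: [(1.0, 2, 0.0)], 1: [(1.0, 2, 0.0)]},
}
n_states, n_actions, gamma = 3, 2, 0.9

def evaluate(policy, theta=1e-8):
    """Iterative policy evaluation: sweep until the Bellman equation for v_pi holds."""
    V = np.zeros(n_states)
    while True:
        delta = 0.0
        for s in range(n_states):
            v = sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][policy[s]])
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < theta:
            return V

def improve(V):
    """Greedy policy improvement with respect to V."""
    return [
        max(range(n_actions),
            key=lambda a: sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a]))
        for s in range(n_states)
    ]

# Policy iteration: alternate evaluation and improvement until the policy is stable.
policy = [0] * n_states
while True:
    V = evaluate(policy)
    new_policy = improve(V)
    if new_policy == policy:
        break
    policy = new_policy

print("optimal policy:", policy)
print("optimal values:", V)
```

The stopping condition relies on the policy improvement theorem: once greedy improvement no longer changes the policy, that policy is greedy with respect to its own value function and hence optimal for this MDP.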
As usual, you can find links below to the textbook, notes from previous chapters, slides, and recordings of some of our previous meetings.
Useful Links:
Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto
Recordings of Previous Meetings
Short RL Tutorials
My exercise solutions and chapter notes
Kickoff Slides which contain other links
Video lectures from a similar course

Every 2 weeks on Monday until March 7, 2026