Multi-Agent Reinforcement Learning: Chapter 8 Deep Reinforcement Learning
Details
This meeting will continue our discussion of Chapter 8 in Multi-Agent Reinforcement Learning: Foundations and Modern Approaches. Last meeting we covered value function methods and concluded with the DQN algorithm. This time we will introduce policy gradient methods, starting with the REINFORCE algorithm. DQN uses a replay buffer to train on batches of data, which lets it take advantage of hardware that can process large batches at once. We can attempt to extend policy learning in the same way, and we will discuss how parallel environments can be used to supply policy gradient methods with batches of data.
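For anyone who wants a preview before the meeting, here is a minimal sketch of the idea: a REINFORCE-style update that averages its gradient over a batch of parallel environments. It is an illustration under stated assumptions, not the textbook's algorithm listing. The environment is a made-up two-armed bandit with one-step episodes, and the names `train_reinforce`, `reward_probs`, and `num_envs` are hypothetical. The batch-mean baseline is a common variance-reduction choice added here for stability.

```python
import math
import random


def softmax(logits):
    """Convert logits to action probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_reinforce(num_envs=32, num_updates=500, lr=0.5, seed=0):
    """REINFORCE on a toy two-armed bandit, batched over parallel envs.

    Assumption: each "environment" is a one-step episode whose reward is
    1 with probability reward_probs[action] and 0 otherwise.
    """
    rng = random.Random(seed)
    reward_probs = [0.2, 0.8]  # hypothetical bandit; arm 1 is better
    logits = [0.0, 0.0]        # policy parameters (softmax policy)

    for _ in range(num_updates):
        probs = softmax(logits)

        # Step every parallel environment once with the current policy.
        actions = [0 if rng.random() < probs[0] else 1
                   for _ in range(num_envs)]
        rewards = [1.0 if rng.random() < reward_probs[a] else 0.0
                   for a in actions]

        # Batch-mean baseline to reduce the variance of the estimate.
        baseline = sum(rewards) / num_envs

        # Average (return - baseline) * grad log pi(a) over the batch.
        # For a softmax policy: d log pi(a) / d logit_k = 1[k == a] - pi(k).
        grad = [0.0, 0.0]
        for a, r in zip(actions, rewards):
            advantage = r - baseline
            for k in range(2):
                indicator = 1.0 if k == a else 0.0
                grad[k] += advantage * (indicator - probs[k]) / num_envs

        # Gradient ascent on expected reward.
        logits = [l + lr * g for l, g in zip(logits, grad)]

    return softmax(logits)
```

Calling `train_reinforce()` returns the learned action probabilities, which should end up favoring the better arm. The point of the sketch is that one policy update consumes `num_envs` fresh samples at once, which plays the role for policy methods that a sampled replay batch plays for DQN.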
As usual, you can find links below to the textbook, previous chapter notes, slides, and recordings of some of the previous meetings.
Meetup Links:
Recordings of Previous RL Meetings
Recordings of Previous MARL Meetings
Short RL Tutorials
My exercise solutions and chapter notes for Sutton-Barto
My MARL repository
Kickoff Slides which contain other links
MARL Kickoff Slides
MARL Links:
Multi-Agent Reinforcement Learning: Foundations and Modern Approaches
MARL Summer Course Videos
MARL Slides
Sutton and Barto Links:
Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto
Video lectures from a similar course
