
Details

Last meeting we covered joint-action learning algorithms in Chapter 6 of Multi-Agent Reinforcement Learning: Foundations and Modern Approaches. Joint-action learning has some limitations that prevent it from representing a game-theoretic equilibrium policy in all cases. With policy-based learning, we instead directly learn the parameters of a parameterized policy, which can converge to an equilibrium under the right conditions. We will derive the dynamical systems equations that describe the learning behavior in the simplest case of two agents, each with two actions. Then we will introduce the full WoLF-PHC algorithm and show how it performs on a few examples, including the Rock-Paper-Scissors game and the two-player soccer game.
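
As a concrete preview, here is a minimal sketch of WoLF-PHC self-play on the stateless Rock-Paper-Scissors matrix game. The class name, hyperparameter values, and iteration count are illustrative choices rather than the book's, but the structure follows the algorithm: a Q-learning value update, a hill-climbing policy update toward the greedy action, and a "win or learn fast" switch between a small step size when winning and a larger one when losing. Under these assumptions, both policies should drift toward the mixed equilibrium (1/3, 1/3, 1/3).

```python
import numpy as np

RNG = np.random.default_rng(0)
N_ACTIONS = 3  # rock, paper, scissors
# Payoff matrix for player 0; player 1 receives the negative (zero-sum game).
PAYOFF = np.array([[ 0, -1,  1],
                   [ 1,  0, -1],
                   [-1,  1,  0]], dtype=float)

class WoLFPHCAgent:
    """Illustrative WoLF-PHC agent for a single-state (matrix) game."""

    def __init__(self, n_actions, alpha=0.1, delta_win=0.01, delta_lose=0.04):
        self.n = n_actions
        self.alpha = alpha              # Q-value step size
        self.delta_win = delta_win      # small policy step when "winning"
        self.delta_lose = delta_lose    # larger policy step when "losing"
        self.Q = np.zeros(n_actions)
        self.pi = np.full(n_actions, 1.0 / n_actions)      # current policy
        self.pi_avg = np.full(n_actions, 1.0 / n_actions)  # running average policy
        self.count = 0

    def act(self):
        return RNG.choice(self.n, p=self.pi)

    def update(self, action, reward):
        # Q-learning update, simplified to the stateless case.
        self.Q[action] += self.alpha * (reward - self.Q[action])

        # Incrementally update the average policy.
        self.count += 1
        self.pi_avg += (self.pi - self.pi_avg) / self.count

        # WoLF test: winning if the current policy's expected value under Q
        # exceeds that of the average policy.
        winning = self.pi @ self.Q > self.pi_avg @ self.Q
        delta = self.delta_win if winning else self.delta_lose

        # Policy hill-climbing: shift probability mass toward the greedy action,
        # staying on the probability simplex.
        best = np.argmax(self.Q)
        for a in range(self.n):
            if a == best:
                continue
            step = min(self.pi[a], delta / (self.n - 1))
            self.pi[a] -= step
            self.pi[best] += step

# Self-play on Rock-Paper-Scissors.
agents = [WoLFPHCAgent(N_ACTIONS), WoLFPHCAgent(N_ACTIONS)]
for _ in range(50_000):
    a0, a1 = agents[0].act(), agents[1].act()
    r0 = PAYOFF[a0, a1]
    agents[0].update(a0, r0)
    agents[1].update(a1, -r0)
print("Player 0 policy:", np.round(agents[0].pi, 3))
print("Player 1 policy:", np.round(agents[1].pi, 3))
```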

As usual, you can find links below to the textbook, previous chapter notes, slides, and recordings of some of the previous meetings.

Meetup Links:
Recordings of Previous RL Meetings
Recordings of Previous MARL Meetings
Short RL Tutorials
My exercise solutions and chapter notes for Sutton-Barto
My MARL repository
Kickoff Slides which contain other links
MARL Kickoff Slides

MARL Links:
Multi-Agent Reinforcement Learning: Foundations and Modern Approaches
MARL Summer Course Videos
MARL Slides

Sutton and Barto Links:
Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto
Video lectures from a similar course
