
Reinforcement Learning Working Group

Hosted By
doug c. and 2 others

Details

RL Working Group
We are using Grokking Deep Reinforcement Learning by Miguel Morales and the Coursera RL courses.

Do the Coursera programming assignments, build your own demos, and present them to the group.

We will build on the demos from last week. Gridworld as a React app can be used for POCs and for simulations aimed at a non-technical audience. We will extend the React app to include probability distributions and a comparison of value algorithms.
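
As a rough illustration of what "probability distributions and comparison of value algorithms" could mean on a gridworld, here is a minimal Python sketch (not the group's React code). It assumes a hypothetical 4x4 grid with a goal in one corner, a slip probability that turns each move into a distribution over next states, a reward of -1 per step, and a discount of 0.9. All of these names and numbers are assumptions made for the example. It compares value iteration against policy evaluation of a uniform random policy.

```python
import itertools

# Hypothetical 4x4 gridworld; every constant below is an illustrative assumption.
SIZE = 4
GOAL = (3, 3)
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
SLIP = 0.1          # probability that a uniformly random action is taken instead
GAMMA = 0.9         # discount factor
STEP_REWARD = -1.0  # reward received on every step

STATES = list(itertools.product(range(SIZE), range(SIZE)))


def move(state, action):
    """Deterministic move; bumping into a wall leaves the agent in place."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    if 0 <= nr < SIZE and 0 <= nc < SIZE:
        return (nr, nc)
    return state


def transitions(state, action):
    """Return {next_state: probability} for taking `action` in `state`."""
    dist = {}
    for a in ACTIONS:
        p = SLIP / len(ACTIONS) + ((1.0 - SLIP) if a == action else 0.0)
        nxt = move(state, a)
        dist[nxt] = dist.get(nxt, 0.0) + p
    return dist


def q_value(values, state, action):
    """Expected one-step return of `action` in `state` under `values`."""
    return sum(p * (STEP_REWARD + GAMMA * values[nxt])
               for nxt, p in transitions(state, action).items())


def solve(backup, tol=1e-8):
    """Repeat synchronous sweeps until the largest value change is below `tol`."""
    values = {s: 0.0 for s in STATES}
    delta = float("inf")
    while delta > tol:
        new = {}
        for s in STATES:
            qs = [q_value(values, s, a) for a in ACTIONS]
            new[s] = 0.0 if s == GOAL else backup(qs)  # the goal is terminal
        delta = max(abs(new[s] - values[s]) for s in STATES)
        values = new
    return values


# Value iteration backs up the best action; evaluating a uniform random
# policy averages over the actions instead.
v_optimal = solve(max)
v_random = solve(lambda qs: sum(qs) / len(qs))
```

Printing `v_optimal` and `v_random` side by side for each cell gives the kind of comparison a UI could render as two heatmaps, and `transitions(state, action)` is the per-move distribution that could be shown as a probability overlay.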

RL: the fundamental RL concepts are needed for work with LLMs and agents. For example, the exploration/exploitation tradeoff is mentioned in the config.yaml files for OpenEvolve.
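
For anyone new to the exploration/exploitation tradeoff, the sketch below shows it in its simplest form: an epsilon-greedy agent on a multi-armed bandit. This is a generic textbook setup, not OpenEvolve's code or config format. The function name `run_bandit`, the Gaussian reward model, and the default values are assumptions made for illustration.

```python
import random


def run_bandit(true_means, epsilon, steps=5000, seed=0):
    """Epsilon-greedy on a Gaussian bandit (illustrative sketch).

    With probability `epsilon` pull a random arm (explore); otherwise pull
    the arm with the highest current estimate (exploit).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how often each arm was pulled
    estimates = [0.0] * n_arms   # running average reward per arm
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental mean update: new = old + (reward - old) / n
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return estimates, counts, total / steps
```

Calling `run_bandit([0.0, 1.0, 3.0], epsilon=0.1)` and then varying `epsilon` between 0 and 1 shows the tradeoff directly: too little exploration risks locking onto a poor arm, while too much wastes pulls on arms already known to be worse.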

You work on your own projects and use the weekly meetups to motivate progress.

Coursera RL: 4 classes (Fundamentals, Sampling, Approximation, Capstone). Start here.

Stanford CS234: the derivations and proofs are worked through in the YouTube videos for this class.
Practice on the assignments on your own. One of the homework exercises is to implement PPO.
https://www.youtube.com/playlist?list=PLoROMvodv4rN4wG6Nk6sNpTEbuOSosZdX
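
As a preview of the PPO exercise, here is a minimal sketch of the clipped surrogate objective at the core of PPO, written in plain Python so the arithmetic is easy to follow. It is not the CS234 assignment code. The function name, the list-based inputs, and the default clip range of 0.2 are assumptions made for illustration, and a real implementation would use a tensor library and add value-function and entropy terms.

```python
import math


def ppo_clip_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized).

    new_logps / old_logps: log-probabilities of the taken actions under the
    current and the data-collecting policy; advantages: advantage estimates.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        ratio = math.exp(new_lp - old_lp)  # pi_new(a|s) / pi_old(a|s)
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

For example, with a probability ratio of 2 and an advantage of +1 the term is capped at 1.2, so the policy gets no extra credit for moving further in that direction. With an advantage of -1 the unclipped value -2 is kept, so moving toward a bad action is still fully penalized.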

POMDPs and RL: many real-world systems, such as self-driving cars and air control systems, combine POMDPs with RL. POMDPs are mentioned briefly in the Grokking book but are explained in detail here: https://aa228v.stanford.edu/
The AA228V videos are on YouTube.

How to build LLMs: https://stanford-cs336.github.io/spring2025/
This course collects the relevant information in one place, in more detail than any blog post or YouTube video, and comes with starter-code exercises. It covers how to train foundation models, what MoEs are, fine-tuning, benchmarking, etc.

Manning RL Resources
https://www.manning.com/books/grokking-deep-reinforcement-learning

Silicon Valley Hands On Programming Events
FREE