Reinforcement Learning Working Group

Name: Reinforcement Learning Working Group
Start: 2025-06-18T18:30:00-07:00
End: 2025-06-18T19:30:00-07:00

Hosted By

doug c. and 2 others

Details

RL Working Group
We are using Grokking Deep Reinforcement Learning by Miguel Morales and the Coursera RL courses.

Do the Coursera programming assignments, build your own demos, and present them to the group.

We will build on the demos from last week. The Gridworld React app can be used for proofs of concept and for showing simulations to a non-technical audience. We will extend the React app to include probability distributions and a comparison of value algorithms.
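As a rough illustration of what "probability distributions and comparison of value algorithms" could look like, here is a minimal Python sketch that runs value iteration and policy iteration on a tiny stochastic gridworld and lets you compare the results. This is not the group's React app: the 5-cell corridor, the 0.8 success probability, the reward of 1 at the goal, and all function names are assumptions made for the example.

```python
# Hypothetical 1-D corridor gridworld: states 0..4, state 4 is the terminal goal.
# Assumed dynamics: the intended move succeeds with probability 0.8,
# otherwise the agent stays where it is.
N_STATES, GOAL, GAMMA, P_SUCCESS = 5, 4, 0.9, 0.8
ACTIONS = (-1, +1)  # left, right


def transitions(state, action):
    """Return the outcome distribution as (probability, next_state, reward) tuples."""
    target = min(max(state + action, 0), N_STATES - 1)
    outcomes = []
    for prob, nxt in ((P_SUCCESS, target), (1.0 - P_SUCCESS, state)):
        reward = 1.0 if nxt == GOAL else 0.0
        outcomes.append((prob, nxt, reward))
    return outcomes


def q_value(values, state, action):
    """Expected one-step return of taking `action` in `state`, then following `values`."""
    return sum(p * (r + GAMMA * values[nxt]) for p, nxt, r in transitions(state, action))


def value_iteration(tol=1e-10):
    values = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES):
            if s == GOAL:
                continue  # terminal state keeps value 0
            best = max(q_value(values, s, a) for a in ACTIONS)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


def policy_iteration(tol=1e-10):
    policy = [ACTIONS[0]] * N_STATES  # start from "always move left"
    values = [0.0] * N_STATES
    while True:
        # Policy evaluation: sweep until the values for the current policy settle.
        while True:
            delta = 0.0
            for s in range(N_STATES):
                if s == GOAL:
                    continue
                v = q_value(values, s, policy[s])
                delta = max(delta, abs(v - values[s]))
                values[s] = v
            if delta < tol:
                break
        # Policy improvement: act greedily with respect to the evaluated values.
        stable = True
        for s in range(N_STATES):
            if s == GOAL:
                continue
            best_action = max(ACTIONS, key=lambda a: q_value(values, s, a))
            if best_action != policy[s]:
                policy[s], stable = best_action, False
        if stable:
            return values, policy


if __name__ == "__main__":
    print("value iteration: ", [round(v, 4) for v in value_iteration()])
    pi_values, pi_policy = policy_iteration()
    print("policy iteration:", [round(v, 4) for v in pi_values], pi_policy)
```

Both algorithms should arrive at the same state values on this toy problem, which is the kind of side-by-side view a demo could show next to the transition probabilities.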

RL: the fundamental RL concepts are required for any work with LLMs and agents. For example, the exploration/exploitation tradeoff is mentioned in the config.yaml files for OpenEvolve.
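To make the exploration/exploitation tradeoff concrete, here is a minimal sketch of an epsilon-greedy agent on a multi-armed bandit. It is unrelated to OpenEvolve's actual configuration keys; the function name, the Gaussian reward model with unit variance, and the arm means in the usage lines are all assumptions for illustration.

```python
import random


def run_epsilon_greedy(true_means, epsilon, steps=2000, seed=0):
    """Play a Gaussian bandit with an epsilon-greedy agent.

    Returns (estimates, pull_counts, total_reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # running average reward per arm
    counts = [0] * n_arms
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick a random arm
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]  # incremental mean
        total += reward
    return estimates, counts, total


if __name__ == "__main__":
    for eps in (0.0, 0.1, 1.0):
        _, counts, total = run_epsilon_greedy([0.0, 0.5, 2.0], epsilon=eps)
        print(f"epsilon={eps}: pulls per arm={counts}, total reward={total:.1f}")
```

With epsilon set to 0 the agent only exploits its current estimates, and with epsilon set to 1 it only explores; values in between trade one against the other.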

You work on your own projects and use the weekly meetups to motivate progress.

Coursera RL: 4 classes (Fundamentals, Sampling, Approximation, Capstone). Start here.

Stanford CS234: the YouTube lecture videos for this class cover the derivations and proofs, which you will need.
Practice on the assignments on your own. One of the homework exercises is to implement PPO.
https://www.youtube.com/playlist?list=PLoROMvodv4rN4wG6Nk6sNpTEbuOSosZdX
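
For orientation before attempting the PPO homework, here is a minimal sketch of the clipped surrogate objective that PPO maximizes. It is not the CS234 starter code; the function name, the plain-list inputs, and the default clip range of 0.2 are assumptions for illustration. A real implementation would use a tensor library and add value and entropy terms.

```python
import math


def ppo_clip_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (to be maximized) over a batch.

    new_logps / old_logps: log-probabilities of the taken actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    """
    terms = []
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        ratio = math.exp(new_lp - old_lp)  # pi_new(a|s) / pi_old(a|s)
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Taking the minimum removes the incentive to push the ratio
        # outside [1 - clip_eps, 1 + clip_eps].
        terms.append(min(ratio * adv, clipped * adv))
    return sum(terms) / len(terms)


if __name__ == "__main__":
    # Ratio 1.5 with a positive advantage is clipped at 1 + clip_eps.
    print(ppo_clip_objective([math.log(1.5)], [0.0], [1.0]))
```

The clipping only limits how much the objective can improve; when a change makes the objective worse, the unclipped term is kept so the penalty is felt in full.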

POMDPs and RL: many real-world systems, such as self-driving cars and air control systems, use POMDPs + RL. These are mentioned briefly in the Grokking book but are explained in detail here: https://aa228v.stanford.edu/
The aa228v videos are on YouTube.
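
The core idea that separates a POMDP from an MDP is that the agent tracks a belief (a probability distribution over hidden states) and updates it after each action and observation. Here is a minimal sketch of that discrete belief update. The two-state "tiger" style model, the 0.85 listening accuracy, and all names are illustrative assumptions, not material taken from the aa228v course.

```python
def belief_update(belief, action, observation, transition, observation_model):
    """Discrete Bayes filter: b'(s') is proportional to
    O(o | s', a) * sum over s of T(s' | s, a) * b(s)."""
    states = list(belief)
    unnormalized = {}
    for s_next in states:
        # Prediction step: push the old belief through the transition model.
        predicted = sum(
            transition[(s, action)].get(s_next, 0.0) * belief[s] for s in states
        )
        # Correction step: weight by how likely the observation is in s_next.
        likelihood = observation_model[(s_next, action)].get(observation, 0.0)
        unnormalized[s_next] = likelihood * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the current belief")
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical model: listening does not move the tiger and is 85% accurate.
STATES = ("tiger-left", "tiger-right")
TRANSITION = {(s, "listen"): {s: 1.0} for s in STATES}
OBSERVATION = {
    ("tiger-left", "listen"): {"hear-left": 0.85, "hear-right": 0.15},
    ("tiger-right", "listen"): {"hear-left": 0.15, "hear-right": 0.85},
}

if __name__ == "__main__":
    belief = {s: 0.5 for s in STATES}
    for _ in range(2):
        belief = belief_update(belief, "listen", "hear-left", TRANSITION, OBSERVATION)
        print(belief)
```

Each consistent observation moves the belief further toward one state, and a policy for a POMDP chooses actions based on this belief instead of on the true state.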

How to build LLMs: https://stanford-cs336.github.io/spring2025/
This course gathers the relevant information in one place, in more detail than any blog post or YouTube video, and includes starter-code exercises. It covers how to train foundation models, what MoEs (mixtures of experts) are, fine-tuning, benchmarking, and more.

Manning RL Resources
https://www.manning.com/books/grokking-deep-reinforcement-learning

Silicon Valley Hands On Programming Events

Online event

FREE