Finding scaling laws for Reinforcement Learning
Details
Neural scaling laws have received a lot of attention in recent ML research, ever since it was discovered that the performance of generative language models improves as a power law of available resources. Since then, power-law scaling has turned up in a wide range of fields and settings. These laws give clear guidance on how to train large models on multi-million-dollar budgets, and have directly guided the creation of state-of-the-art models like GPT-3. Despite all this, Reinforcement Learning had, until recently, almost no record of power-law scaling.
In this talk, Oren Neumann will explain how his team found power-law scaling laws for AlphaZero, why previous attempts to find these laws in RL failed, and how to train a model efficiently when a single training attempt costs several million dollars.
Join the Bugout Slack community to connect with fellow data scientists, ML engineers, and researchers: https://join.slack.com/t/bugout-dev/shared_invite/zt-1p4n6cgma-PtwQ0C43gtZEgqfDLPUo8Q
