Writing an RL env and RL algo from scratch


Details
In this, our inaugural meetup, we will present an overview of how to build your first RL environment and RL algorithm from scratch using Python, OpenAI Gym, and NumPy.
You will build a simple RL environment (walking down a corridor) and code an agent that plays in it using random search. We will also cover an intro to Monte Carlo learning, a version of which was used by Google/DeepMind in AlphaGo. A rough sketch of the random-search idea follows below.
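To give a feel for the hands-on portion, here is a minimal, illustrative sketch of the "random search" idea: sample random linear policies and keep whichever earns the highest episode return. The corridor environment and agent we build at the meetup may look different; this uses Gym's CartPole-v1 purely as a stand-in and assumes the classic Gym API where env.step() returns (obs, reward, done, info).

import gym
import numpy as np

# Illustrative only: random search over linear policies on CartPole.
# The meetup's corridor environment and agent may differ.

def run_episode(env, weights, max_steps=500):
    """Run one episode with a linear policy: action = 1 if w . obs > 0 else 0."""
    obs = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = 1 if np.dot(weights, obs) > 0 else 0
        obs, reward, done, _ = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward

env = gym.make("CartPole-v1")
best_weights, best_return = None, -np.inf
for _ in range(100):                              # try 100 random policies
    weights = np.random.uniform(-1, 1, size=env.observation_space.shape[0])
    episode_return = run_episode(env, weights)
    if episode_return > best_return:
        best_weights, best_return = weights, episode_return
print("best return found by random search:", best_return)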
Location is Cross Campus DTLA.
== LIGHTNING TALKS ==
If you would like to give a 3 to 5 minute lightning talk, please contact Andy or Nick so we can reserve a time slot for you. Talks can be on any topic as long as there is some connection to RL. Thanks!
== DETAILED MEETUP AGENDA ==
- Lightning Talks (~20 min): 3 to 5 minute presentations from attendees on topics related to RL. We have a few slots open; if you're interested in giving a talk, please contact Andy or Nick.
- Hands-on Tutorial (~90 min): Bring a Linux or Mac laptop so you can code along. Please pre-install miniconda3, anaconda3, or virtualenv. We will use Python 3.
We will cover:
(0) A high-level overview of basic RL components.
(1) How to run pre-existing environments included in OpenAI Gym (e.g. balancing an inverted pendulum) -- see the first sketch after this list.
(2) How to code our own simple environment -- see the corridor sketch below.
(3) How to plug existing out-of-the-box RL algorithms into our custom-built environment. For this, we will show how to run OpenAI's implementations -- https://github.com/openai/spinningup -- of Vanilla Policy Gradient (VPG), Trust Region Policy Optimization (TRPO), and Proximal Policy Optimization (PPO). See the Spinning Up sketch below.
(4) An overview of Monte Carlo RL algorithms -- see the first-visit Monte Carlo sketch below.
(5) [time permitting] A code walk-through: how you can code up MCTS (Monte Carlo Tree Search) yourself -- see the minimal sketch below.
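For item (1), a minimal sketch of driving a built-in Gym environment with random actions (here Pendulum-v0, an inverted-pendulum swing-up task). It assumes the classic Gym API where env.reset() returns an observation and env.step() returns four values:

import gym

# Drive a built-in environment with random actions for a few episodes.
env = gym.make("Pendulum-v0")          # inverted pendulum swing-up task
for episode in range(3):
    obs = env.reset()
    done, episode_return = False, 0.0
    while not done:
        action = env.action_space.sample()          # random action
        obs, reward, done, info = env.step(action)  # classic 4-tuple Gym API
        episode_return += reward
    print(f"episode {episode}: return {episode_return:.1f}")
env.close()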
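For item (2), a rough sketch of what a tiny "walk down a corridor" environment might look like as a gym.Env subclass. The environment we code together at the meetup may differ in its details (rewards, corridor length, rendering):

import gym
import numpy as np
from gym import spaces

class CorridorEnv(gym.Env):
    """Toy corridor: start at cell 0, reach cell `length - 1` to finish.
    Actions: 0 = step left, 1 = step right. Reward is -1 per step."""

    def __init__(self, length=10):
        self.length = length
        self.action_space = spaces.Discrete(2)
        self.observation_space = spaces.Discrete(length)
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        self.position += 1 if action == 1 else -1
        self.position = int(np.clip(self.position, 0, self.length - 1))
        done = self.position == self.length - 1
        return self.position, -1.0, done, {}

# quick smoke test with random actions
env = CorridorEnv()
obs, done = env.reset(), False
while not done:
    obs, reward, done, _ = env.step(env.action_space.sample())
print("reached the end of the corridor at cell", obs)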
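For item (3), a hedged sketch of handing an environment to one of the Spinning Up implementations. It assumes the spinup package is installed per https://github.com/openai/spinningup, that the custom environment has been registered with Gym under a hypothetical id "Corridor-v0", and that ppo accepts an env_fn callable plus ac_kwargs/epoch keyword arguments; depending on your Spinning Up version the import may be spinup.ppo or spinup.ppo_tf1, so check the docs for the exact signature and for the vpg/trpo equivalents.

# Assumes Spinning Up is installed and a custom environment is registered
# with Gym under the hypothetical id "Corridor-v0" (registration not shown).
# Argument names follow the Spinning Up docs; double-check against your install.
import gym
from spinup import ppo

env_fn = lambda: gym.make("Corridor-v0")    # hypothetical registered env id

ppo(env_fn=env_fn,
    ac_kwargs=dict(hidden_sizes=(64, 64)),  # size of the policy/value networks
    steps_per_epoch=4000,
    epochs=50)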
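For item (4), a minimal sketch of first-visit Monte Carlo prediction: estimate the value of each state under a fixed (here uniformly random) policy by averaging the returns observed after the first visit to that state in each episode. It uses Gym's FrozenLake-v0 purely for illustration and assumes the classic Gym step API:

import gym
import numpy as np
from collections import defaultdict

# First-visit Monte Carlo prediction of state values under a random policy.
env = gym.make("FrozenLake-v0")
gamma = 0.99
returns = defaultdict(list)      # state -> list of observed returns
V = defaultdict(float)           # state -> current value estimate

for _ in range(5000):            # sample many episodes
    # roll out one episode with a uniformly random policy
    episode, obs, done = [], env.reset(), False
    while not done:
        action = env.action_space.sample()
        next_obs, reward, done, _ = env.step(action)
        episode.append((obs, reward))
        obs = next_obs

    # index of the first visit to each state in this episode
    first_visit = {}
    for t, (state, _) in enumerate(episode):
        if state not in first_visit:
            first_visit[state] = t

    # walk backwards accumulating the return; record it at first visits only
    G = 0.0
    for t in reversed(range(len(episode))):
        state, reward = episode[t]
        G = reward + gamma * G
        if first_visit[state] == t:
            returns[state].append(G)
            V[state] = np.mean(returns[state])

print("estimated value of the start state:", V[0])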
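For item (5), a very small sketch of the UCT flavor of MCTS (select with UCB1, expand one child, do a random rollout, back up the return) on a toy deterministic corridor model. This is illustrative only; the meetup walk-through may structure the code differently, and terminal states are handled loosely here for brevity.

import math
import random

# Toy model for MCTS: states are corridor cells 0..LENGTH-1, actions are
# 0 (left) / 1 (right), and reaching the last cell yields reward 1.
LENGTH = 8
ACTIONS = (0, 1)

def simulate_step(state, action):
    """Deterministic model: return (next_state, reward, done)."""
    nxt = max(0, min(LENGTH - 1, state + (1 if action == 1 else -1)))
    done = nxt == LENGTH - 1
    return nxt, (1.0 if done else 0.0), done

class Node:
    def __init__(self, state):
        self.state = state
        self.children = {}       # action -> Node
        self.visits = 0
        self.value = 0.0         # running mean of returns seen through this node

def rollout(state, max_depth=20):
    """Play randomly from `state` and return the total reward collected."""
    total = 0.0
    for _ in range(max_depth):
        state, reward, done = simulate_step(state, random.choice(ACTIONS))
        total += reward
        if done:
            break
    return total

def mcts(root_state, iterations=200, c=1.4):
    root = Node(root_state)
    for _ in range(iterations):
        node, path = root, [root]
        # selection: descend with UCB1 while the node is fully expanded
        while len(node.children) == len(ACTIONS):
            node = max(
                node.children.values(),
                key=lambda ch: ch.value + c * math.sqrt(
                    math.log(node.visits + 1) / (ch.visits + 1e-9)))
            path.append(node)
        # expansion: add one untried action as a new child
        action = random.choice([a for a in ACTIONS if a not in node.children])
        next_state, reward, done = simulate_step(node.state, action)
        child = Node(next_state)
        node.children[action] = child
        path.append(child)
        # simulation + backpropagation
        ret = reward if done else reward + rollout(next_state)
        for n in path:
            n.visits += 1
            n.value += (ret - n.value) / n.visits
    # recommend the action whose child was visited most
    return max(root.children, key=lambda a: root.children[a].visits)

print("MCTS suggests action", mcts(root_state=0), "(1 = step right)")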
Two hours is not enough time to cover everything in depth, so our goal is to stay high-level while presenting the components and APIs involved in getting your first end-to-end RL project coded and running, and then to get hands-on coding an RL environment and a simple agent that operates in it.
== ABOUT ANDY ==
I got my PhD at UC Berkeley in distributed systems. I'm a co-creator of the open source Mesos cluster manager project (http://mesos.apache.org/). I was on the same research team as Matei Zaharia when he created the open source Apache Spark project (http://spark.apache.org/), and I was part of the original team we built at Berkeley around Spark. I'm a cofounder of Databricks (https://databricks.com), a company we spun out of UC Berkeley in 2013. At Berkeley and at Databricks, I worked on the Spark testing framework and helped create the Spark community. For example, I led the creation of Spark Summit, which will host over 10K attendees this year alone. I also helped organize the Bay Area Spark meetups.
== ABOUT NICK ==
I got my MS at UC Berkeley in computer science. I've spent the last decade working as a full stack engineer at various Bay Area companies. These days I work as a software consultant and split my time between the east coast and west coast. I'm interested in studying RL as a jumping off point to understanding modern machine learning techniques. That, and as a way to finally beat Space Invaders.