Foundations - RL with Finite Examples

***As mentioned last session, we're going to have a new venue this month. We'll be at River City Labs (level 3) - hopefully we can land the slot permanently.***

This month's adventure into the world of AI tech has us attacking the issue of reinforcement learning in a practical setting...where the problem is hard to model and data is both observational and limited. After all, relying on infinite datasets via simulation is for wimps. (Or Google/Amazon-backed researchers, where the distinction between enormous real data and infinite simdata is pretty blurry.) This talk will be true ML engineer stuff with a stiff serve of data science, where there is no choice but to optimise your system based on domain knowledge and pray that the gods of numerical stability are on your side.

Presenting will be Sam Fogarty, who most of you will know from his participation in the community, his Kaggling and his research. He's been busy prototyping RL systems for applications in medicine, and he's graciously offered to share with us some of the tribulations and successes he's experienced along the way. Topics touched on will include specification of reward functions, the tradeoff between exploration and exploitation, and the tradeoff between training stability and data efficiency. If you've ever looked at an RL talk and wondered "why isn't this the default way of doing things?", this talk should help demonstrate why some of the likely challenges are not as trivial as often presented. Conversely, if you've let the RL horror stories put you off experimenting with agent-based systems, this talk should give you some practical tools to overcome those same challenges. A working knowledge of neural nets/PyTorch will help, as we may get some live dev done.

REMINDER: RIVER CITY LABS, not ThoughtWorks. People who turn up at ThoughtWorks may incur penalties ranging from mild embarrassment to group teasing*. Kick-off will be at the normal time, but if you can make it down early to help us figure out how to set up the place, I'd appreciate an extra pair of hands or two.

* This is reserved for Khoa, as he did the hard yards securing the new venue...if he gets the address wrong, we all get to make fun of him. After we thank him, of course.