Oscar Boykin on Reproducible Machine Learning with Functional Programming


Details
IMPORTANT: Due to Twitter building requirements, we need RSVPs with matching photo IDs at
https://sfscala.splashthat.com
Please RSVP there; being on the meetup list alone will not get you in!
In machine learning, reproducibility is very important. We want to be able to reproduce feature values, trained models, and of course model scores. Functional programming offers powerful tools in our quest for reproducible machine-learned systems. In this talk we will see two tools from Stripe's toolbox on the theme of reproducibility: (1) our use of the Bazel build tool with Scala, and (2) semblance, our functional-reactive-programming (FRP) based system for feature engineering. We will see what a Bazel Scala build looks like, and hear how Bazel's focus on reproducibility enables build caching, which can dramatically lower CI times. We will also see how formulating feature engineering as an event-based FRP model rules out an important class of errors that can sabotage your model's performance on real data.
Oscar is a mathematical hacker at Stripe. Previously he was at Twitter, creating Scalding, Summingbird, and many other things.
We will also have a second talk, by Ulf Adams, Bazel's lead at Google, giving an overview of Bazel.
