
Seattle Scalability Meetup - Evolution of Machine Learning Sys w/ Stripe Radar

Hosted by Bradford Stephens

Details

This meetup focuses on engineering large systems for scalability. Our current focus is on technologies for doing data science at scale: distributed systems, machine learning, AI, blockchain, databases, and more!

We are heavily focused on deep, technical talks. No marketing pitches and no light use-case discussions. We want to see architecture diagrams and code, and hear real stories from the trenches.

Besides distributed systems and Big Data, we're also interested in hearing about high-performance engineering techniques and futuristic technologies.

We've had great success in the past, and are growing quickly! Previous guests were from Facebook, Twitter, LinkedIn, Amazon, Cloudant, Microsoft, MongoDB, and others. We love hearing from practitioners.

This month's guests:

Michael Manapat -- Evolution of the machine learning systems behind Stripe Radar

Stripe processes billions of dollars in payments a year on behalf of hundreds of thousands of businesses. Stripe Radar (https://stripe.com/radar), our anti-fraud product, uses machine learning to detect and stop fraudulent transactions, preemptively blocking those that look risky.

Our modeling workflow involves the typical "data science" tools: R and IPython for exploratory analysis, Hadoop for batch data processing, and scikit-learn for model building. However, Stripe's production backend uses Ruby and MongoDB, and this has introduced difficulties for both model training and production scoring. Among them: How do we generate features in an efficient, distributed manner with all of our data in Mongo (particularly features that are not just simple transformations of structures that already exist in Mongo)? How do we score in production given models that are developed using tools and frameworks that aren't available in our production environment?
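One common way to approach the production-scoring question is to export a trained model as plain data that any runtime can read and evaluate. The sketch below is only an illustration of that idea, not a description of Stripe's system: the JSON layout, the feature names, and the coefficient values are all hypothetical, and it assumes a simple logistic regression whose parameters were written out after training.

```python
import json
import math

# Hypothetical export: a logistic regression reduced to plain numbers.
# Feature names and weights are invented for illustration only.
MODEL_JSON = """
{
  "features": ["amount_usd", "card_country_mismatch", "charges_last_hour"],
  "coefficients": [0.002, 1.5, 0.3],
  "intercept": -4.0
}
"""


def load_model(text):
    """Parse the exported model and check that it is internally consistent."""
    model = json.loads(text)
    if len(model["features"]) != len(model["coefficients"]):
        raise ValueError("features and coefficients must have the same length")
    return model


def score(model, transaction):
    """Return a probability in (0, 1); features missing from the input count as 0."""
    z = model["intercept"]
    for name, weight in zip(model["features"], model["coefficients"]):
        z += weight * float(transaction.get(name, 0.0))
    # Logistic (sigmoid) function applied to the linear combination.
    return 1.0 / (1.0 + math.exp(-z))


model = load_model(MODEL_JSON)
risky = score(model, {"amount_usd": 500.0, "card_country_mismatch": 1, "charges_last_hour": 5})
safe = score(model, {"amount_usd": 20.0, "card_country_mismatch": 0, "charges_last_hour": 0})
```

Because the scorer needs only arithmetic and a JSON parser, the same few lines could be ported to a backend language such as Ruby without pulling in the training libraries. The trade-off is that every model type needs its own hand-written evaluator, which is one source of the duplication problems the abstract mentions.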

In this talk, I'll describe these and other problems we’ve faced, how our answers to these questions have evolved (from error-prone code duplication in Ruby and Scala to our current lambda architecture), and what we hope our ML infrastructure will look like in the future.

Michael Manapat leads development on machine learning products at Stripe. Prior to Stripe he was a Software Engineer at Google.

Our format is flexible: we usually have two speakers who each talk for about 30 minutes and then take Q&A and discussion (about 45 minutes per talk). We finish by 8:45.

There'll be beer afterwards, of course!

Post-meetup drinks: F.X. McRoy’s, a couple of blocks away at 419 Occidental Ave S, Seattle, WA 98104 (c/o Greythorn).

Doors open 30 minutes ahead of show-time. Please show up at least 15 minutes early out of respect for our first speaker.

Seattle Scalability Meetup
Realself @ Capitol One Building
83 S King St (8th Floor) · Seattle, WA