Machine Learning on Big Data (talks from Lyft, Netflix and Walmart Labs)

This is a past event

870 people went



In this meetup, we will focus on the art and science of doing Machine Learning on Big Data. We will have talks on best practices for ML models, and then dive deep into what a scalable ML infra looks like. It’s an evening not to be missed!

Food and Drinks sponsored by Lyft

6:00 - 6:30 pm: Check in, food, networking
6:30 - 6:35 pm: Intros
6:35 - 8:30 pm: 3 talks
8:30 - 8:45 pm: Wrap up

Important Note: You must register for the event (free) before it takes place. You will then be sent an eNDA, which needs to be signed 24 hours before the event for security reasons. A badge will be pre-printed and waiting for you when you arrive. If for some reason you are not able to sign the eNDA online, you can still attend; however, you may have to wait in a long line at the sign-in desk.

Talk #1: Ridesharing - Accounting for uncertainty in dispatch decisions to optimize marketplace balance
Dispatch is one of the most powerful levers to optimize a two-sided marketplace of physical goods, as it is able to use rider payments to reallocate supply within a network. However, uncertainty of user behavior, such as riders canceling or drivers rejecting dispatches, makes achieving perfect optimality a challenge.

In this talk, Parker discusses how Lyft has accounted for uncertainty in ride-sharing networks to achieve better overall outcomes. This talk will dive into modeling challenges with sparsity and non-continuity of various ML models, preventing moral hazard in user behavior from these assumptions, and understanding the biases different model assumptions have on the overall objective.

Speaker Bio:
Parker Spielman has extensive experience in ridesharing, both at Lyft and previously Uber, where he has worked on a variety of problems including dynamic pricing, dispatch, and incentives. All of these areas contribute to a set of levers focused on better overall control systems for real-time marketplaces.

Talk #2: More Data Science with Less Engineering: ML Infrastructure at Netflix
Netflix is known for its unique culture, which gives an extraordinary amount of freedom and responsibility to individual engineers and data scientists. Our data scientists are expected to develop and operate large machine learning workflows autonomously. However, we do not expect all our scientists to be deeply experienced with systems or data engineering. Instead, we provide them with delightfully usable machine learning infrastructure that they can use to manage the whole lifecycle of a data science project.

In this talk, we will share the key concepts that have made our ML infrastructure successful at Netflix.

Speaker Bio:
Ville Tuulos manages the machine learning infrastructure team at Netflix. Prior to Netflix, Ville spent over a decade designing and leading ML and data infrastructure efforts at various startups and large companies in the Bay Area, with a particular focus on human-centric tooling.

Talk #3: Machine learning and large-scale data analysis on a centralized platform at Walmart
In this talk, the speakers explore the design of a centralized risk and abuse management platform and how this platform enables dynamic and complex analytics of large-scale data from different domains. They share a case study on protecting customer accounts by linking customer behaviors across purchases, returns, and financial services.

You’ll get an introduction to the Walmart risk and abuse management platform, risk and abuse problems in the Walmart ecosystem, the data-driven analytics and advanced machine learning algorithms used to defend against fraud and abuse, and case studies of customer account protection.

Speaker Bio:
James Tang is a senior director of engineering at Walmart Labs. Yiyi Zeng is a senior manager and principal data scientist at Walmart Labs. Linhong Kang is a manager and staff data scientist at Walmart Labs.