
Scaling Machine Learning with Python and Dask

Hosted by Kevin Kho and 3 others

Details

James Lamb will present an introduction to Dask, a distributed computing framework in the PyData ecosystem. The first half of the talk will describe the current state of the project and its ecosystem, including distributed data collections, cloud deployment options, distributed machine learning projects, and workflow orchestration. The second half will be a live demo of the programming model for machine learning on Dask, with specific examples of distributed LightGBM training. The talk will conclude with a discussion of the current state and roadmap for LightGBM-on-Dask, and information on how those who are interested can contribute.

James Lamb is a software engineer at Saturn Cloud, where he works on a managed data science platform built on Dask. Before Saturn Cloud, James worked on industrial internet of things (IIoT) problems as a data scientist at AWS and Chicago-based Uptake. He is a core maintainer of LightGBM and has contributed to other open source data science projects such as XGBoost and Prefect. James holds master's degrees in Applied Economics (Marquette University) and Data Science (University of California, Berkeley).

Orlando Machine Learning and Data Science