Introduction to Distributed Computing with Dask


Details
As data volumes continue to grow, distributed compute frameworks are needed to scale workflows to big data. Dask is one such framework. It is built on top of the PyData stack, which makes it easy to adopt for NumPy and pandas users. Even without a cluster, Dask provides benefits such as parallel execution of compute workflows and memory spillover to disk.
Through an interactive demo, Richard Pelgrim will talk about how to transition from pandas to Dask and demonstrate how to run distributed compute workflows with hands-on examples, including machine learning examples with dask-ml. This talk will be beginner-friendly.
Note that this event is not during the usual 3-4:30pm time slot.
About the Speaker:
Richard Pelgrim is a Data Science Evangelist at Coiled who studied geography, anthropology and contemporary art...and somehow wound up in the magical world of data. Besides numbers, he's passionate about fermenting things, making red sauce (more garlic...always!) and contributing his technical skills to a more equitable and sustainable planet.
