Scale EDA & ML Workloads To Clusters & Back With Dask


Details
Speaker: Gus Cavanaugh
Abstract: While "Big Data" may be an overhyped buzzword, it's not uncommon for Python users to end up with more data than fits on their laptops. Sampling is great, but sometimes you need to process everything. In the past, Python users didn't have much choice beyond Spark (and the fact that most data lakes ran on HDFS made it the standard option). But today, even the stodgiest enterprises have migrated a ton of data to cheap blob storage in the cloud. This has freed Python users from the misery of the JVM (I mean, hey, it's way better to see a Python error than a JVM stack trace, right?). As a result, tools like Dask make it much easier to scale the tools Python users love, e.g., NumPy, pandas, and scikit-learn. In this talk, you'll learn how to scale your PyData workloads with minimal code changes using Dask, so that you can focus on your work without having to learn a new API.
Speaker's Bio: "Big Data" & "The Cloud" promised me infinite scale. But that's not what I found when I stumbled onto a Hadoop cluster after college. What seemed so simple when the architects at my big consulting employer got out the whiteboard became much less so once I had my hands on the keyboard. I found solace in Python, specifically the Anaconda distribution, which I could run on the most archaic Windows workstation or on a cluster of Linux servers. Eventually, I switched from consulting to software, where I thought I was helping companies deploy data science platforms, but I really spent my time as an unpaid AWS/Azure consultant fighting with Kubernetes. I recently reunited with former Anaconda colleagues at Coiled, where we provide software and support for commercial and community users of Dask.
Join Zoom Meeting
https://zoom.us/j/95644494849?pwd=aUJwV09zK2JaZkRQQnQrZ3F6L2l0QT09
Meeting ID: 956 4449 4849
Passcode: 687567
One tap mobile
+13126266799,,95644494849# US (Chicago)
+16468769923,,95644494849# US (New York)
Dial by your location
+1 312 626 6799 US (Chicago)
+1 646 876 9923 US (New York)
+1 301 715 8592 US (Washington DC)
+1 253 215 8782 US (Tacoma)
+1 346 248 7799 US (Houston)
+1 669 900 6833 US (San Jose)
Find your local number: https://zoom.us/u/adpV47TZny
Join by SIP
95644494849@zoomcrc.com
Join by H.323
162.255.37.11 (US West)
162.255.36.11 (US East)
115.114.131.7 (India Mumbai)
115.114.115.7 (India Hyderabad)
213.19.144.110 (Amsterdam Netherlands)
213.244.140.110 (Germany)
103.122.166.55 (Australia Sydney)
103.122.167.55 (Australia Melbourne)
149.137.40.110 (Singapore)
64.211.144.160 (Brazil)
149.137.68.253 (Mexico)
69.174.57.160 (Canada Toronto)
65.39.152.160 (Canada Vancouver)
207.226.132.110 (Japan Tokyo)
149.137.24.110 (Japan Osaka)
