Scalable, distributed machine learning with Pachyderm
Recent advances in machine learning and artificial intelligence are amazing! Yet to deliver real value within a company, data scientists must be able to get their models off their laptops and deployed within the company’s data pipelines and infrastructure. Those models must also scale to production-size data. In this workshop, we will implement a deep learning model locally using Nervana Neon. We will then deploy both its training and inference in a scalable manner to a production cluster with Pachyderm, an open-source framework for data versioning and processing. We will also learn how to update the production model online, track changes in our model and data, and explore our results.
About Dr. Daniel Whitenack
Daniel (@dwhitena) is a Ph.D.-trained data scientist working with Pachyderm (@pachydermIO). Daniel develops innovative, distributed data pipelines that include predictive models, data visualizations, statistical analyses, and more. He speaks at conferences around the world (ODSC, Spark Summit, Datapalooza, DevFest Siberia, GopherCon, and more), teaches data science and engineering with Ardan Labs (@ardanlabs), maintains the Go kernel for Jupyter, and actively helps organize contributions to various open-source data science projects.