Building Production-Grade Data Science with Data Version Control
"AI is eating the world" is old news. AI is no longer a proof of concept, and it must be held to engineering standards.
Companies that cannot scale up their machine learning operations will lose to those that can. The tools of the future will handle the dirty work so that you can focus on your product and still scale your organization easily.
Software engineering has accumulated wisdom and tools for handling these issues, and we need equivalent tools designed for the data science workflow.
In this talk, we will give an overview of the data science workflow and focus on the challenges of reproducibility and versioning. We will present DVC, an open-source tool built to handle these issues in data science projects, and explain its simple operating principles and the benefits of using it.
The second part of the event will be a 90-minute workshop. You will create a simple machine learning project based on the tutorial we created on DAGsHub.com. The tutorial uses the main features of DVC to build a versioned and reproducible data science project.
** To ensure your participation, please fill out this form:
This session will be led by Dean Pleban & Guy Smoilovsky.
Dean has led technological system integrations for large organizations and studied machine learning and quantum information at the Hebrew University. Guy has over 10 years of experience as a developer, focusing on backend, big data, machine learning and DevOps. Together, they co-founded DAGsHub, the home for data science collaboration. They have interviewed many companies focused on machine learning and want to share their inferences with the world.
Hope to see you all,