Building Better AI Pipelines: From AutoML to Productionalize with Dataiku

Details

At this meetup, Dataiku and Precocity will walk attendees through best practices for building more scalable, efficient data science pipelines. Highlighting AutoML and operationalization as key steps in the data science process, they will demonstrate essential tools and real-life examples of these practices in action.

AutoML Explorations: What it can and can’t do to prove value

Before investing in additional dataset acquisitions, scalable production architectures, scoring interfaces, and other operational concerns, many organizations want to prove out the value of an AI project by quickly assessing the feasibility of initial approaches. AutoML capabilities can be a key productivity enhancer in these proof-of-value (POV) efforts by automating some feature engineering, preprocessing, model selection, and hyperparameter tuning. While I’ll present an overview of how best to incorporate this “tool in your toolbox”, AutoML is not a panacea, and I’ll also highlight why certain aspects of the AI workflow cannot be automated and still need the hand of a skilled data scientist.

Do you even productionalize? Simplifying the process of creating production-grade models.

Getting models into production is the pinnacle of enterprise data science, but the added complexity of deployment and maintenance often results in delayed implementation or valuable models collecting dust. In this talk, I’ll simplify the process and discuss best practices by exploring the key phases of productionalizing models. These include deployment scenarios, concepts for monitoring models in production, and streamlining the refresh process. Finally, I’ll demonstrate how Dataiku enables the full lifecycle of a data science pipeline, including production, by walking through a project built to perform real-time fraud detection on credit card transactions.

Bios:

David Gillen is the Director of Data Science for Precocity, where he is responsible for solution delivery, business development, and growth of the advanced analytics practice. He guides teams of machine learning engineers, visualization experts, data engineers, and AI ops specialists to deliver real value to clients through data science. David empathizes with executives and key stakeholders to deeply appreciate their problems and iterate on the right formulation of them before implementing a solution. Having consulted across many verticals, he has implemented mathematical solutions such as fraud detection, predictive maintenance, recommendation engines, lookalike modeling, anomaly detection, job schedule optimization, and churn modeling. He holds a Master’s Degree in Mathematics from Clemson University and has enjoyed applying machine learning and optimization techniques to solve business problems for over 20 years.

Richard is an analytics architect with Dataiku and is based in Dallas. For the past decade, he has worked with organizations of all sizes to understand business problems and design solutions in the data science and analytics space. Before that, he acquired hands-on experience in machine learning by developing models for real estate and marketing, eventually becoming the director of the data science department. Richard has a degree in Economics from Baylor University.