From fraud detection to predictive maintenance, finding abnormal behavior in data has become crucial in many areas. Classical machine learning frameworks assume that anomalies are explicitly labeled in the dataset. In many cases, however, this information is not available.
In this so-called unsupervised context, the data scientist needs to resort to algorithms designed specifically for unlabeled data. Unfortunately, applying those methods is not as straightforward as in the supervised case and requires a different approach. In this presentation, we will introduce several unsupervised anomaly detection algorithms, provide a broad overview of their internals, and explain how to leverage them efficiently so that the anomaly detection problem can be solved from both the technical and the business perspectives.
Harizo has been at Dataiku for one year, working with customers in financial services and CPG. Before joining Dataiku, he worked at the French Alternative Energies and Atomic Energy Commission, where he focused on designing statistical methods to reconstruct air pollution sources, with operational applications in accidental release cases and in emission monitoring. He holds a PhD in Mathematics from University Lille 1.
Building on Harizo's presentation, we will look at a practical use case of anomaly detection. In this scenario, working together with a partner, we were provided with flight tracking data for a region in Europe near a major airport. Starting with the exploratory steps, we will walk you through the definition of our anomaly detection case, how it is handled using time series data, and the best ways of putting such a model into production.
Silviu works as a data scientist at Dataiku. Coming from a business-oriented background, with an MSc in Business Analytics from the University of Manchester, he transitioned to a more technical focus while working on an optimization problem together with ARM. Before coming to Dataiku, Silviu worked with social housing organizations to develop their data science capabilities.