Title: Anomaly Detection
Speaker: Ted Dunning
Abstract:
The basic ideas of anomaly detection are simple. You build a model and you look for data points that don’t match that model. Building a practical anomaly detection system, however, requires dealing with practical details, starting with algorithm selection and continuing through data flow architecture, anomaly alerting, user interfaces and visualizations. We will describe the major classes of anomaly detection systems and show how to build anomaly detection systems for:
a) rate shifts, to determine when the rate of events such as web traffic, purchases or process progress beacons changes
b) topic spotting, to determine when new topics appear in a content stream such as Twitter
c) network flow anomalies, to determine when systems with defined inputs and outputs act strangely.
While working through these problems, we will describe how clustering, dimensionality reduction, and density estimation can be used in systems that adapt and learn about their environment, and how these systems can tell you when something has changed.
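As a rough illustration of the "build a model, then look for data points that don't match it" idea applied to rate shifts, here is a minimal sketch in Python. It is not the method presented in the talk. It assumes that per-interval event counts are roughly Poisson, estimates the rate as the mean of recent counts, and scores each new count with a normal approximation. The class name `RateShiftDetector`, the window size and the threshold are all hypothetical choices made for this example.

```python
import math
from collections import deque


class RateShiftDetector:
    """Hypothetical sketch: flag event counts that do not fit a simple rate model.

    Assumption: counts per interval are roughly Poisson, so the variance of a
    count is approximately equal to the estimated rate.
    """

    def __init__(self, window=20, threshold=4.0):
        self.history = deque(maxlen=window)  # recent counts judged normal
        self.threshold = threshold           # cutoff in standard deviations

    def score(self, count):
        # With fewer than two observations there is no model to compare against.
        if len(self.history) < 2:
            return 0.0
        rate = max(sum(self.history) / len(self.history), 1e-9)
        # Normal approximation to the Poisson: standard deviation is sqrt(rate).
        return (count - rate) / math.sqrt(rate)

    def observe(self, count):
        s = self.score(count)
        is_anomaly = abs(s) > self.threshold
        # Only normal-looking counts update the model, so a single spike does
        # not distort the estimated rate.
        if not is_anomaly:
            self.history.append(count)
        return is_anomaly, s


# Made-up counts per minute with one burst at index 8.
counts = [100, 98, 103, 101, 99, 102, 100, 97, 250, 101]
detector = RateShiftDetector()
flags = [detector.observe(c)[0] for c in counts]
# Only the burst of 250 is flagged; every other count is treated as normal.
```

One consequence of this design is that a sustained shift keeps being flagged, because anomalous counts never enter the model. A system that should adapt to a new normal level would need some rule for eventually accepting the new rate.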
Bio:
Ted Dunning, Chief Application Architect @ MapR
Ted has held Chief Scientist positions at Veoh Networks, ID Analytics and at MusicMatch (now Yahoo Music). Ted is responsible for building the most advanced identity theft detection system on the planet, as well as one of the largest peer-assisted video distribution systems and ground-breaking music and video recommendation systems.
Ted has 15 issued and 15 pending patents and contributes to several Apache open source projects including Hadoop, ZooKeeper and HBase™. He is also a committer for Apache Mahout. Ted earned a BS degree in electrical engineering from the University of Colorado, an MS degree in computer science from New Mexico State University, and a Ph.D. in computing science from Sheffield University in the United Kingdom. Ted also bought the drinks at one of the very first Hadoop User Group meetings.