How to Find What You Didn't Know to Look For: Practical Anomaly Detection

Anomaly detection is the art of automating surprise. To do this, we have to be able to define what we mean by normal and recognize what it means to be different from that.

The basic ideas of anomaly detection are simple. You build a model and you look for data points that don’t match that model. The mathematical underpinnings of this can be quite daunting, but modern approaches provide ways to solve the problem in many common situations.
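As a minimal sketch of that idea (not the speaker's own method), the model below is simply the mean and standard deviation of a baseline sample, and a point is flagged when it lies more than a chosen number of standard deviations from the mean. The function names, the sample values and the threshold of 3.0 are all assumptions made for illustration.

```python
import statistics


def fit_normal_model(training_values):
    """Summarize 'normal' behavior as a (mean, standard deviation) pair."""
    mean = statistics.fmean(training_values)
    stdev = statistics.stdev(training_values)
    return mean, stdev


def find_anomalies(values, model, threshold=3.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mean, stdev = model
    return [v for v in values if abs(v - mean) / stdev > threshold]


# Hypothetical baseline measurements that define "normal".
baseline = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 9.7, 10.1]
model = fit_normal_model(baseline)

# New observations; only the one far from the baseline is flagged.
anomalies = find_anomalies([10.0, 9.9, 14.5, 10.2], model)
print(anomalies)
```

A fixed Gaussian model like this is only a starting point; it assumes the baseline is roughly bell-shaped and does not drift over time.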

I will describe these modern approaches with particular emphasis on several real use-cases including:

a) rate shifts, to determine when the rate of events such as web traffic, purchases or process progress beacons changes

b) time series generated by machines or biomedical measurements

c) topic spotting, to determine when new topics appear in a content stream such as Twitter

d) network flow anomalies, to determine when systems with defined inputs and outputs act strangely
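To make use case (a) concrete, here is one possible sketch of a rate-shift check. It assumes events arrive as a Poisson process with a known baseline rate, counts the events in a recent window, and asks how likely a count at least that large would be under the baseline. The function names, the baseline rate and the significance level are hypothetical, and this is only one of several ways to detect a rate shift.

```python
import math


def poisson_tail(k, expected):
    """Return P(X >= k) for X ~ Poisson(expected)."""
    cdf_below_k = sum(
        math.exp(-expected) * expected**i / math.factorial(i) for i in range(k)
    )
    return max(0.0, 1.0 - cdf_below_k)


def rate_increased(recent_count, window_seconds, baseline_rate, alpha=1e-3):
    """Flag an upward rate shift when the recent count is improbably high.

    baseline_rate is in events per second; alpha is the tail probability
    below which the count is treated as surprising (an assumed setting).
    """
    expected = baseline_rate * window_seconds
    return poisson_tail(recent_count, expected) < alpha


# Hypothetical baseline of 2 events per second, checked over a 10 second window.
print(rate_increased(22, 10, 2.0))  # close to the expected 20 events
print(rate_increased(45, 10, 2.0))  # far above the expected 20 events
```

This version only detects increases; a drop in rate would need the lower tail of the same distribution.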

Building a practical anomaly detection system means dealing with practical details that range from algorithm selection and data flow architecture to anomaly alerting, user interfaces and visualizations. I will show how to deal with each of these aspects of the problem, with an emphasis on realistic system design.

Ted Dunning, MapR Chief Applications Architect

Ted Dunning is Chief Applications Architect at MapR Technologies, a committer and PMC member of the Apache Mahout, Apache ZooKeeper, and Apache Drill projects, and a mentor for Apache Storm. He contributed to Mahout's clustering, classification and matrix decomposition algorithms and helped expand the new version of the Mahout Math library. Ted was the chief architect behind the MusicMatch (now Yahoo Music) and Veoh recommendation systems, built fraud detection systems for ID Analytics (LifeLock), and has 24 issued patents to date. Ted has a PhD in computing science from the University of Sheffield. When he’s not doing data science, he plays guitar and mandolin.



  • Ryan

    Was there video for this talk?

    September 22

  • Bob P.

    Really enjoyed this talk even without seeing any code...then I found some code on Ted's github. Wanted to share for anyone looking to do some tinkering: https://github.com/tdunning/anomaly_detection

    1 · September 17

  • Hugh D.

    I did not expect the talk to range from entropy to deep learning. And what an utterly fascinating guy. Thanks for organizing this.

    1 · September 17

  • Michael L.

    Fantastic talk this evening. It's refreshing to have a complex topic presented in such a digestible way.

    I also wanted to share again the DevOpsDays Chicago information I spoke about briefly. The event is October 7th & 8th at the Willis Tower. We're super excited about the program, which you can view here: http://devopsdays.org/events/2014-chicago/program/. Use promo code DEVOPSDAYS_CJUG to receive 10% off registration.

    Hope to see you there so we can talk about everything from how to propel culture change to monitoring to tool-chain tips & tricks. :)

    2 · September 16

  • Cedric H.

    Bummer, just realized this conflicts with driving down to Strangeloop in preparation for our Elasticsearch workshop on Wednesday. I'll try to arrange for video recording ahead of time.

    August 25

  • Steven

    This sounds absolutely freaking amazing!

    1 · August 25
