How to Find What You Didn't Know to Look For, Practical Anomaly Detection


Details
We are pleased to announce that Ted Dunning will be our featured January speaker. He's a well-known author and architect who is deeply involved with MapR and with the Apache Mahout, Drill, ZooKeeper, Storm, and Spark projects, as well as with recommendation systems and fraud detection systems. Ted will be signing some of his books, and you'll get a chance to meet with him prior to his talk.
Please note that a valid ID and registration are required for building access.
Talk Abstract:
How to Find What You Didn't Know to Look For, Practical Anomaly Detection
Anomaly detection is the art of automating surprise. To do this, we have to be able to define what we mean by normal and recognize what it means to be different from that. The basic ideas of anomaly detection are simple. You build a model and you look for data points that don’t match that model. The mathematical underpinnings of this can be quite daunting, but modern approaches provide ways to solve the problem in many common situations.
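As a rough illustration of the "build a model, then look for points that don't match it" idea, here is a minimal sketch in Python. It is not taken from the talk: it assumes "normal" can be summarized by the mean and standard deviation of past values, and the function names, sample data, and the 3-standard-deviation threshold are all hypothetical choices made for this example.

```python
import statistics

def fit_normal_model(history):
    """Summarize 'normal' as the mean and standard deviation of past values.

    Assumption for illustration only: real systems usually need richer models.
    """
    return statistics.mean(history), statistics.stdev(history)

def find_anomalies(model, points, threshold=3.0):
    """Return the points lying more than `threshold` standard deviations from the mean."""
    mean, stdev = model
    return [x for x in points if abs(x - mean) > threshold * stdev]

# Hypothetical measurements that define what "normal" looks like.
history = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 9.7, 10.0]
model = fit_normal_model(history)

# New observations to check against the model.
new_points = [10.1, 9.9, 14.5, 10.2]
anomalies = find_anomalies(model, new_points)
print(anomalies)
```

With this sample data, only the value 14.5 falls outside the band around the mean. A fixed threshold on a simple Gaussian summary is the crudest possible model; it is shown here only to make the two steps (fit, then compare) concrete.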
I will describe these modern approaches with particular emphasis on several real use cases, including: a) rate shifts, to determine when events such as web traffic, purchases, or process progress beacons change rate; b) time series generated by machines or biomedical measurements; c) topic spotting, to determine when new topics appear in a content stream such as Twitter; and d) network flow anomalies, to determine when systems with defined inputs and outputs act strangely. In building a practical anomaly detection system, you have to deal with practical details, starting with algorithm selection and continuing through data flow architecture, anomaly alerting, user interfaces, and visualizations. I will show how to deal with each of these aspects of the problem, with an emphasis on realistic system design.
BIO: Ted Dunning is Chief Applications Architect at MapR Technologies, a committer and PMC member of the Apache Mahout, Apache ZooKeeper, and Apache Drill projects, and a mentor for the Apache Storm, DataFu, Flink, and Optiq projects.
http://photos4.meetupstatic.com/photos/event/7/3/a/b/600_431489611.jpeg
Ted was the chief architect behind the MusicMatch (now Yahoo Music) and Veoh recommendation systems. He built fraud detection systems for ID Analytics (LifeLock) and he has 24 patents issued to date and a dozen pending. Ted has a PhD in computing science from the University of Sheffield. When he’s not doing data science, he plays guitar and mandolin. He also bought the beer at the first Hadoop user group meeting.
