Unsupervised Anomaly Detection in Sequences using LSTM Recurrent Neural Nets


Details
For our September Data Science DC Meetup, we are excited to have Majid Al-Dosari, a recent MS graduate of GMU's Computational Science program with a background in mechanical engineering, join us to speak about unsupervised anomaly detection in sequences using long short-term memory recurrent artificial neural networks.
----------------------------
Sponsored by Statistics.com -- Learn new skills in Data Science, Analytics, and Statistics! Four-week online courses at all levels, world-class instructors, private discussion boards. Save 30% with code DSDC2016 (http://www.statistics.com/?utm_source=dsdc-meetup&utm_medium=meetup&utm_campaign=DSDC2016).
----------------------------
Agenda:
• 6:30pm -- Networking, Empanadas, and Refreshments
• 7:00pm -- Introduction, Announcements
• 7:15pm -- Presentation and Discussion
• 8:30pm -- Data Drinks (Tonic, 2036 G St NW)
Abstract:
Unsupervised Anomaly Detection in Sequences using Long Short-Term Memory Recurrent Artificial Neural Networks
Long Short-Term Memory (LSTM) recurrent neural networks (RNNs) are evaluated for their potential to generically detect anomalies in sequences. First, anomaly detection techniques are surveyed at a high level so that their shortcomings are exposed. The shortcomings are mainly their inflexibility in the use of a context ‘window’ size and/or their suboptimal performance in handling sequences. Furthermore, high-performing techniques for sequences are usually tied to their respective knowledge domains. After discussing these shortcomings, RNNs are presented mathematically as generic sequence modelers that can handle sequences of arbitrary length. From there, results from experiments using RNNs show their ability to detect anomalies in a set of test sequences. The test sequences had different types of anomalies and unique normal behavior. Given the characteristics of the test data, it was concluded that the RNNs were not only able to generically distinguish rare values in the data (out of context) but were also able to generically distinguish abnormal patterns (in context).
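For readers new to the idea, the claim that an RNN can handle sequences of arbitrary length can be illustrated with a minimal sketch. This is not the speaker's code or model: it is a textbook LSTM cell reduced to scalar inputs and state, with random, untrained weights. The names `lstm_step`, `run_lstm`, and `params` are hypothetical and chosen only for this illustration. It shows that one fixed set of weights is applied step by step, so the same cell processes a 3-step and a 200-step sequence without any window size being chosen.

```python
import math
import random


def sigmoid(z):
    """Logistic function used by the LSTM gates."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell (standard gates, no peepholes).

    params maps each gate name to (input weight, recurrent weight, bias).
    These weights are illustrative assumptions, not trained values.
    """
    pre = {}
    for name in ("i", "f", "o", "g"):
        w, u, b = params[name]
        pre[name] = w * x + u * h_prev + b
    i = sigmoid(pre["i"])    # input gate
    f = sigmoid(pre["f"])    # forget gate
    o = sigmoid(pre["o"])    # output gate
    g = math.tanh(pre["g"])  # candidate cell update
    c = f * c_prev + i * g   # new cell state
    h = o * math.tanh(c)     # new hidden state
    return h, c


def run_lstm(sequence, params):
    """Apply the same cell to every element; works for any sequence length."""
    h, c = 0.0, 0.0
    hidden_states = []
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
        hidden_states.append(h)
    return hidden_states


# Random (untrained) weights, seeded so the run is repeatable.
rng = random.Random(0)
params = {
    name: (rng.uniform(-1, 1), rng.uniform(-1, 1), 0.0)
    for name in ("i", "f", "o", "g")
}

short_seq = [0.1, 0.2, 0.3]
long_seq = [math.sin(0.1 * t) for t in range(200)]

short_states = run_lstm(short_seq, params)
long_states = run_lstm(long_seq, params)
print(len(short_states), len(long_states))
```

In a trained model, the hidden state would typically feed an output layer, and how the outputs are turned into an anomaly score is a design choice that this sketch does not cover.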
In addition to the anomaly detection work, a solution for reproducing computational research is described. The solution addresses reproducing compute applications based on Docker container technology as well as automating the infrastructure that runs the applications. By design, the solution allows the researcher to seamlessly transition from local (test) application execution to remote (production) execution because little distinction is made between local and remote execution. Such flexibility and automation allow the researcher to be more confident of results and more productive, especially when dealing with multiple machines.
Bio:
Majid Al-Dosari (https://about.me/majidaldosari)
Majid recently completed his (second) Master's degree in Computational Science from George Mason University. Although formally trained as a mechanical engineer, his interest in computation began at Vanderbilt University, where he conducted computationally expensive simulations of materials related to energy conversion devices. In addition, he interned with Continuum Analytics and is currently working for Enertiv analyzing building energy consumption data.
Sponsors:
This event is sponsored by the George Washington Business School MS in Business Analytics Program (http://business.gwu.edu/programs/specialized-masters/m-s-in-business-analytics/academic-program/), Statistics.com (http://bit.ly/12YljkP), Elder Research (http://datamininglab.com/), Novetta (https://www.novetta.com/), PAWGOV (http://www.predictiveanalyticsworld.com/gov/2016/), O'Reilly (http://www.oreilly.com/), Booz Allen Hamilton (https://www.boozallen.com/consulting/strategic-innovation/nextgen-analytics-data-science), and AOL (http://engineering.aol.com/). (Would your organization like to sponsor too? Please get in touch!)
