A real-time streaming implementation of Markov chain based fraud detection


Details
Bio
Bruce Ho is a big data enthusiast with a special interest in in-memory and streaming computing. He devoted the past year to implementing predictive modeling and machine learning using Spark and NoSQL technology. He is currently creating a data pipeline that delivers real-time user profile analysis for the optimization of ad campaigns. Look out for more of Bruce’s upcoming open source projects. He was formerly an MIT/Caltech-trained physicist who later picked up Java architecture and big data. Bruce has worked in the software industry for over 10 years at various companies, including Life Technologies, Teradata, and Amazon.
Intro
Fraud detection, especially in the context of financial transactions, is a relevant and interesting topic. Recent advances in Spark streaming analysis make it possible to implement elegant solutions using open source tools. This sample implementation builds on previously published work on Markov chain modeling, but introduces a brand new system design that boosts processing speed and completes a realistic streaming infrastructure. The technologies involved in this implementation include Spark, HDFS, HBase, and Kafka. The project is described in the blog post at http://bhomass.blogspot.com/2014/12/a-real-time-streaming-implementation-of.html. The accompanying GitHub repository is https://github.com/bhomass/marseille. The topic covers a large amount of material, so the author encourages everyone attending the talk to download and study the code first.
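
For readers new to the idea, the following is a minimal sketch of Markov chain based anomaly scoring in general, not the code from the repository above. It assumes each transaction has already been mapped to a discrete state label; the state names, the function names `train_transition_probs` and `anomaly_score`, the add-one smoothing, and the average negative log-probability score are all illustrative assumptions.

```python
import math

def train_transition_probs(sequences, states, smoothing=1.0):
    """Estimate P(next state | current state) from historical state sequences.

    Additive smoothing (an assumption of this sketch) keeps unseen
    transitions from getting probability zero.
    """
    counts = {s: {t: smoothing for t in states} for s in states}
    for seq in sequences:
        for prev, curr in zip(seq, seq[1:]):
            counts[prev][curr] += 1
    probs = {}
    for s in states:
        total = sum(counts[s].values())
        probs[s] = {t: counts[s][t] / total for t in states}
    return probs

def anomaly_score(seq, probs):
    """Average negative log-probability of the transitions in seq.

    Higher values mean the sequence is less likely under the trained chain.
    """
    steps = list(zip(seq, seq[1:]))
    if not steps:
        return 0.0
    return -sum(math.log(probs[a][b]) for a, b in steps) / len(steps)

# Hypothetical state labels, e.g. transaction amount buckets.
STATES = ["low", "mid", "high"]

# Hypothetical history of normal customer behavior.
history = [
    ["low", "low", "mid", "low", "low"],
    ["low", "mid", "low", "low", "mid"],
    ["mid", "low", "low", "low", "low"],
]

probs = train_transition_probs(history, STATES)
normal_score = anomaly_score(["low", "low", "mid", "low"], probs)
suspicious_score = anomaly_score(["low", "high", "high", "high"], probs)
```

In a streaming setting, a score like this would be computed over a sliding window of each customer's most recent transactions and compared against a threshold chosen from historical data.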
