This event is sponsored by MapR, provider of one of the biggest Hadoop distributions on the market and a major contributor to Apache Hadoop projects such as HBase, Pig, Hive, and ZooKeeper.
Hadoop has been a huge success in the data world. It has disrupted decades of data management practices and technologies by introducing a massively parallel processing framework. The community and the development of its open source components have pushed Hadoop to where it is now.
That's why the Hadoop community is excited about Apache Spark. The Spark software stack includes a core data-processing engine, an interface for interactive querying, Spark Streaming for streaming data analysis, and growing libraries for machine learning and graph analysis. Spark is quickly establishing itself as a leading environment for fast, iterative in-memory and streaming analysis.
This talk will introduce the Spark stack, explain how Spark achieves lightning-fast results, and show how it complements Apache Hadoop.
About the Presenter
Sungwook Yoon (http://sungwookyoon.com/) is a Data Scientist at MapR.
Sungwook's data experience includes:
- Malware detection algorithms for packet stream analysis
- Mobile network signaling analysis
- Social network analysis
- Job title analysis
- Call center data analysis
Before joining MapR, Sungwook worked as a Research Scientist at Palo Alto Research Center and as an Architect at Seven Networks. Sungwook's main technical background lies in Artificial Intelligence and Machine Learning.
Sungwook holds a Ph.D. from Purdue University and is a graduate of Seoul National University.
5:30 – 6:00 Welcome & Networking
6:00 – 7:30 Spark presentation by Sungwook Yoon
7:30 – 8:30 Networking, drinks, and our signature delicious thin-crust pizzas!