Spark 0 to Prod in 30 days; Leverage Hadoop 2.0 and YARN with Native tools


Details
Session Overview:
#1: Leverage Hadoop 2.0 and YARN with native tools by RedPoint Global CTO George Corugedo
#2: Spark: 0 to Production in 30 days by Localytics Software Engineer Pete Gamache
Rough schedule:
6:00 to 6:20 - Networking
6:20 to 7:00 - Session 1 + Q&A
7:00 to 7:40 - Session 2 + Q&A
Session Detail:
Session 1: Leverage Hadoop 2.0 and YARN with native tools
George Corugedo, CTO of RedPoint Global (Wellesley, MA), will discuss how to leverage Hadoop 2.0, YARN, and RedPoint to create a data lake or data repository. Structured and unstructured data need to be captured, cleansed, and linked consistently. George will also give examples of Hadoop moving from small experimental clusters to large-scale, business-critical components of modern data architecture.
Session 2: Spark: 0 to Production in 30 days @ Localytics
At Localytics, we're using Apache Spark as part of a new data processing pipeline. Pete Gamache will define the problem Localytics needed to solve, explore the different types of Spark installations and their strengths and weaknesses, and discuss how Localytics got up and running with Spark Standalone to process billions of events across millions of users per day.
