Spark MLlib in Production and a sneak peek into Apache Spark 1.6


Details
Session Information
We have a great double session talking about Spark MLlib in production (WebTrends case study) and a sneak peek into Spark 1.6.
Session 1: A Sneak Peek into Spark 1.6: From RDD to DataFrames to Datasets
Denny Lee, Technology Evangelist at Databricks, will provide a sneak peek into Apache Spark 1.6 - from RDD to DataFrames to Datasets.
Spark 1.6 will include (but is not limited to) adaptive query execution [SPARK-9850]; a type-safe API called Dataset, built on top of DataFrames, that leverages the work in Project Tungsten for more robust and efficient execution (including memory management, code generation, and query optimization) [SPARK-9999]; and unified memory management that consolidates cache and execution memory [SPARK-10000].
Session 2: A Journey to the Center of Big Data with Spark at WebTrends
Webtrends is a 20-year-old company and we have done it all when it comes to data - Webtrends did big data before there was Big Data.
With the vast number of sources from which to ingest data, Webtrends is in a unique position: we deal with datasets of all shapes and sizes, collected on behalf of our clients. We are stewards of data for more than 2,000 clients (generating more than 13 billion transactions per day) and make it available to those clients in many different forms. That’s a lot of data to deal with, and we have 20 years’ worth of it! So this is where we begin our journey to the center of the data in a Big Data world.
In this session, Peter Crossley, Ethan Dereszynski, and Sean McNamara will talk about their experiences building the WebTrends Big Data platform with Spark.
Speakers
Peter Crossley, Director, Product Architecture & Technology
Ethan Dereszynski, Research Scientist – ML and predictive analytics
Sean McNamara, Architect, Spark Committer, all-around cool guy: hands-on techy things, Q&A, Spark application, and the power of Spark Streaming.
Agenda
6:00pm: Doors Open
6:20pm: Presentation Sessions
7:40pm: Q&A
8:00pm: Fini!
Food and Drinks brought to you by WebTrends and Databricks
