Spark After Dark: Advanced analytics, streaming data, machine learning, ...


Details
Update: We're meeting at Orbitz. For security, make sure the name on your RSVP matches your ID. Also, I need a volunteer to cover pizza and drinks!
Chris Fregly from Databricks will be in town (he grew up here, actually).
Spark After Dark is a mock dating site that uses the latest Spark libraries, including Spark SQL, BlinkDB, Spark Streaming, MLlib, and GraphX, to generate high-quality dating recommendations for its members and blazing-fast analytics for its operators.
We begin with a brief overview of Spark, Spark libraries, and Spark use cases. In addition, we'll discuss the modern-day Lambda Architecture, which combines real-time and batch processing into a single system. Lastly, we present best practices for monitoring and tuning a highly available Spark and Spark Streaming cluster.
There will be many live demos covering everything from basic topics such as ETL and data ingestion to advanced topics such as streaming, sampling, approximations, machine learning, textual analysis, and graph processing.
Bio
Chris Fregly is a Field Engineering Consultant at Databricks focused on advanced analytics use cases, high-performance streaming data pipelines, machine learning, approximations, and probabilistic data structures, mostly in AWS and Google Cloud environments.
He's an Apache Spark contributor and the author of the upcoming books Spark in Action and Effective Spark.
Chris has 15+ years of distributed big data systems experience across many domains including media/entertainment, banking, insurance, and travel.
Previously, Chris was a Streaming Data Platform Engineer at Netflix, Data Platform Architect at Playboy Enterprises, and a Distributed Systems Engineer at BEA Systems.
