How Apache Spark Fits Into The Big Data Landscape


Details
In conjunction with the Boulder Denver Spark Meetup (https://www.meetup.com/Boulder-Denver-Spark-Meetup/events/207581832/), we are happy to announce this great event with Paco Nathan (http://liber118.com/pxn/)! Spark (https://spark.apache.org/) is a fast engine for large-scale data processing, and it has recently gained strong momentum as an effective way to process big data. Some of you might remember that Paco gave a presentation for us on Mesos/Cascading (https://www.meetup.com/Boulder-Denver-Big-Data/events/131047972/) a little over a year ago. Paco is a very engaging presenter and does a great job of explaining these new technologies. Come hear Paco speak and learn how Spark can help you and your organization.
Agenda
6:00 – 6:30 - Socialize over food and drink
6:30 – 6:45 - Announcements, Upcoming Events
6:45 – 8:30 - How Apache Spark Fits Into The Big Data Landscape by Paco Nathan
8:30 – ??? - Continued socializing
About the Speaker
Paco Nathan (http://liber118.com/pxn/) is a "player/coach" who has led innovative Data teams over the past decade, building large-scale apps. He is a recognized expert in Hadoop, R, cloud computing, distributed systems, machine learning, predictive analytics, and NLP. Paco is the Chief Scientist for Mesosphere in San Francisco, a committer on the Cascading open source project, and the author of the O'Reilly book "Enterprise Data Workflows with Cascading". He received his BS in Math Sciences and MS in Computer Science from Stanford, and has 25+ years of experience in the tech industry, ranging from Bell Labs to early-stage start-ups.
About the Presentation
Apache Spark is intended as a general purpose engine that supports combinations of Batch, Streaming, SQL, ML, Graph, etc., for apps written in Scala, Java, Python, Clojure, R, etc.
This talk provides an introduction to Spark, covering how it achieves so much better performance and why. It then explores how Spark fits into the Big Data landscape, including other systems with which Spark pairs nicely, and why Spark is needed for the work ahead.
We'll review some of the new features in the 1.1 release, see a demo of notebooks in Databricks Cloud, and discuss the new Spark Developer Certificate program.
