Apache Spark 101: Introduction and What's New


Details
Join us to learn about the basics and the latest advances in Apache Spark!
Agenda:
5:30-6pm Food and Drinks
6-6:45pm Presentation with Q&A
6:45-7:30pm Networking
Spark 101
With the rapid adoption of Apache Spark—one of the most active Apache projects today—and the need for programs to solve the world’s greatest problems, distributed computing has resurfaced as a hot commodity that can take your career to the next level. More importantly, Spark opens the door to some really cool and impactful applications.
Spark is a leap forward in distributed computing, allowing you to perform faster and more complex analyses on your Hadoop cluster and in the cloud. This presentation will introduce developers to basic Spark concepts such as DAGs, RDDs, transformations, actions, and executors. We will also cover recent developments in the Spark community with DataFrames, Streaming, SQL on Spark, and more.
Speaker Bio: Sean Mackrory has been a Software Engineer at Cloudera for 3 1/2 years, previously working with Apache HBase and Apache ZooKeeper at Qualtrics Labs. He is a member of the PMC for Apache Bigtop and Apache Sentry (incubating). He maintains Cloudera Live and the QuickStart VM, and is currently working on making Apache Spark and the rest of Cloudera's 'Big Data' stack run faster and more reliably on cloud platforms.
Location Notes: Free parking is available at this location, off DTC Blvd. Please avoid the reserved spots and the spots with a 2-hour limit.
