Apache Spark Workshop

Hosted by David Villegas

Details

A half-day introduction to Apache Spark from scratch.

Sponsor

Dell EMC will be providing breakfast!

Agenda:

  • 8:00am - 9:00am: Breakfast and networking

  • 9:00am: Workshop starts

  • 12:30pm: We should be done by this time.

We will look at:

  1. Getting Spark ready on your computer.

  2. Apache Spark computation model: RDDs (Resilient Distributed Datasets).

  • Lazy nature of RDDs.

  • RDD transformations vs actions.

  • How to create RDDs from different sources (we will play with some datasets).

  • The RDD API and how to bend it to our needs.

  3. Spark SQL.

  • Interoperability with RDDs.

  • SQL on top of RDDs.

  • Accessing tabular data sources.

  • DataFrame optimizations on top of RDDs.

  • Distributed SQL engine case study.

  4. Spark Streaming.

  • Streaming API.

  • Streaming Sources.

  • Extending the API.

  • Twitter case study.

  • Interoperability with SQL and RDDs.

You are expected to follow along with the code examples, so we will spend the first part of the session helping you install Spark locally.

Data Science Salon | South Florida
FIU Modesto A. Maidique Campus
11200 SW 8th St · Miami, FL