Details

Agenda:

• Introduction to Apache Spark

• High-level introduction to Apache Spark, with some background history

• Resilient Distributed Datasets (RDD)

• RDD overview

• RDD lazy evaluation

• RDD transformations and actions

• Demo using a well-known example of parsing Apache logs on a standalone machine

Suggested Reading prior to attending:

We will use the well-documented example at

https://databricks.gitbooks.io/databricks-spark-reference-applications/content/logs_analyzer/chapter1/spark.html

Speaker: Architect from ZeMoSo Labs

Session Sponsored by ZeMoSo Labs
