Introduction to Apache Spark
Details
Agenda:
• Introduction to Apache Spark
• High-level introduction to Apache Spark with some background history
• Resilient Distributed Datasets (RDD)
• RDD overview
• RDD lazy evaluation
• RDD transformations and actions
• Demo using a well-known example of parsing Apache logs on a standalone machine
Suggested reading prior to attending:
We will use the well-documented example at
https://databricks.gitbooks.io/databricks-spark-reference-applications/content/logs_analyzer/chapter1/spark.html
Speaker: Architect from ZeMoSo Labs
Session Sponsored by ZeMoSo Labs
