Introduction into Apache Spark


Details
This will be a hands-on workshop, so bring your own device! :)
Schedule
• 18:15 - 19:00: Doors open, food & drinks
• 19:00 - 22:00: Introduction into Apache Spark (workshop)
Introduction into Apache Spark
We will give a general introduction to Apache Spark, look at the ways of running it, and introduce the basic Spark API. We will briefly cover the Spark components: Spark Core, Spark SQL, Spark Streaming, GraphX and MLlib. After the short presentation you will work on practical examples.
About Apache Spark: Apache Spark is an open-source cluster computing framework originally developed in the AMPLab at UC Berkeley. In contrast to Hadoop's two-stage disk-based MapReduce paradigm, Spark's in-memory primitives provide performance up to 100 times faster for certain applications. By allowing user programs to load data into a cluster's memory and query it repeatedly, Spark is well suited to machine learning algorithms. Source: Wikipedia (http://en.wikipedia.org/wiki/Apache_Spark)
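To give a feel for the "load data into memory and query it repeatedly" idea before the workshop, here is a minimal sketch against the Spark 1.4 Scala API. It is not taken from the workshop material: the object name CacheSketch, the helper countMultiples and the toy data set are all hypothetical, and it assumes Spark runs in local mode on your own machine.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; not part of the workshop material.
object CacheSketch {

  // Builds one RDD holding the numbers 1..1000, marks it as cached,
  // and then runs one filter-and-count query per divisor against it.
  // After the first query the data can be served from memory.
  def countMultiples(sc: SparkContext, divisors: Seq[Int]): Seq[Long] = {
    val numbers = sc.parallelize(1 to 1000).cache()
    divisors.map(d => numbers.filter(_ % d == 0).count())
  }

  def main(args: Array[String]): Unit = {
    // "local[*]" runs Spark inside this JVM using all available cores (assumption:
    // no cluster is needed for the sketch).
    val conf = new SparkConf().setAppName("cache-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // prints "500, 333, 200"
      println(countMultiples(sc, Seq(2, 3, 5)).mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```

On a data set this small caching makes no measurable difference; the point is only the shape of the API: mark a data set with cache() once, then run several queries against it.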
Prerequisites:
• JDK 8
• Scala 2.10.5 (or higher)
• SBT 0.13.8
• Apache Spark 1.4.0
• your favourite IDE
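If you want to check your setup in advance, a build.sbt along the following lines should pull in Spark for the versions listed above. This is only a sketch: the project name is made up, and including spark-sql next to spark-core is an assumption about what the exercises will need.

```scala
// build.sbt (sketch; the project name is hypothetical)
name := "spark-workshop"

version := "0.1.0"

// Matches the Scala version from the prerequisites
scalaVersion := "2.10.5"

// %% appends the Scala binary version, so this resolves spark-core_2.10
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.4.0",
  "org.apache.spark" %% "spark-sql"  % "1.4.0" // assumption: only needed for the Spark SQL part
)
```

Running sbt update in the project directory downloads the dependencies, which is worth doing before the evening in case the venue's network is slow.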
Workshop led by: Robert van Rijn (https://nl.linkedin.com/in/rvanrijn) from NCIM Groep & Casper Koning (https://www.linkedin.com/profile/view?id=343288808&authType=NAME_SEARCH&authToken=CBdf&locale=en_US&trk=tyah&trkInfo=clickedVertical%3Amynetwork%2Cidx%3A2-1-2%2CtarId%3A1433404312666%2Ctas%3Acas) from Ordina
