Paweł Szulc: Apache Spark - Living the post-mapreduce world


IMPORTANT: Please register at Skills Matter:
https://skillsmatter.com/meetups/6987-apache-spark-living-the-post-mapreduce-world
Apache Spark - Living the post-mapreduce world
with
Paweł Szulc (http://www.rabbitonweb.com)
"Apache Spark™ is a fast and general engine for large-scale data processing." The above statement is taken from the Apache Spark welcome page. It's one of those definitions that, while describing the product in one sentence and being 100% true, still tells little to the wondering noob.
Why take an interest in Apache Spark? Apache Spark promises to be up to 100x faster than Hadoop MapReduce in certain scenarios. Being a superset of MapReduce, it provides a comprehensible programming model (familiar to everyone who is used to functional programming) and a vast ecosystem of tools. In my talk I will try to reveal the secrets of Apache Spark to complete beginners and, at the same time, introduce more advanced concepts.
We will first take a quick look at the set of problems commonly known as Big Data and at how Hadoop MapReduce has addressed them over the last ten years. I will show you what the challenges of that approach were and why the industry is now (after a decade) looking for new solutions.
We will then move on to Apache Spark. I will try to show you the main factor that drove its creators to introduce yet another large-scale processing engine. We will see how it works and what its main advantages are. We will also look briefly at its internals, focusing on the speed improvements that led to Apache Spark's fame.
The presentation will be a mix of slides and code examples.
Paul Szulc - Software engineer, programmer, developer. Experienced with the Java ecosystem. Currently having tons of fun working with Scala, Akka and Apache Spark. Humble apprentice of Functional Programming. Runs a blog at http://www.rabbitonweb.com
We will, as always, also be heading to the Slaughtered Lamb (http://www.theslaughteredlambpub.com/) pub afterwards.
**IMPORTANT: READ ME TO REGISTER**
Skills Matter are hosting this event and are handling attendance, so it is essential that you confirm your place at this link:
https://skillsmatter.com/meetups/6987-apache-spark-living-the-post-mapreduce-world
Failure to do so may result in not obtaining a seat. Please click "I'm going" on Meetup.com only to let the others in the group know you're going.
If this is your first time at Skills Matter, directions are here: http://skillsmatter.com/go/find-us
