Lightning Fast Big Data Analytics with Apache Spark


Details
Apache Spark is taking the big data world to the next level in terms of processing speed and ease of use.
It generalizes the Map-Reduce model using an in-memory distributed computing paradigm that leverages a lazy functional API. Spark provides developers and data scientists with a concise API that can be used programmatically to create data-processing jobs, or interactively in order to explore the data and speed up the analysis process.
Thanks to an ever-growing ecosystem, Spark can be used in batch or streaming mode in the languages we love, including Java, Scala, Python and even R. The new kid on the block, announced recently, is SQL.
In this talk we will introduce Spark, its core concepts and its deployment options, and share our experience with it, using real-world examples to illustrate the possibilities it opens up for analysing data in batch and in real time.
We will be using material prepared for Devoxx (BE) 2014, with an emphasis on the Scala roots of Spark, an improved demo and plenty of time for Q&A.
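For readers new to the "lazy functional API" idea mentioned above, here is a minimal sketch in plain Python. It is only an analogy and does not use Spark or its API: the names `tokenize`, `build_pipeline` and `run`, and the sample lines, are made up for illustration. Chained transformations merely describe the work, and nothing is computed until a final "action" consumes the pipeline.

```python
from collections import Counter

seen = []  # records which input lines have actually been processed


def tokenize(line):
    """Split a line into lowercase words, noting that the line was touched."""
    seen.append(line)
    return line.lower().split()


def build_pipeline(lines):
    """Chain lazy 'transformations'; generator expressions do no work yet."""
    tokens = (word for line in lines for word in tokenize(line))
    return (word for word in tokens if len(word) > 2)  # drop very short words


def run(pipeline):
    """The 'action': consuming the pipeline forces the whole computation."""
    return Counter(pipeline)


pipeline = build_pipeline(["Spark is fast", "Spark is lazy"])
processed_before = len(seen)  # still 0: the pipeline has only been described

counts = run(pipeline)
processed_after = len(seen)   # both lines have now been tokenized

print(processed_before, processed_after)
print(counts.most_common(1))
```

In this analogy, building the pipeline plays the role of declaring transformations and `run` plays the role of an action that triggers execution. The distributed, in-memory execution across a cluster is not modelled here.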
Speakers
Andy, aka @noootsab (http://twitter.com/noootsab), is a mathematician turned distributed computing engineer, working mainly in the geospatial world.
When the Big Data age arrived, he decided to make the most of it and created NextLab, a Big/Smart Data oriented company.
Since then, he has had fun working on IoT, genomics, automotive and smart-city projects: building Spark jobs, feeding Cassandra rings and shooting data with machine learning guns.
He's also a certified Scala/Spark trainer and wrote the Learning Play! Framework 2 book for Packt Publishing.
Gerard, aka @maasg (http://twitter.com/maasg), is the lead of the Data Processing Team at Virdata.com where he and his team work on building and extending the data processing pipeline for Virdata's IoT cloud platform.
He has a background in Computer Science and is a former Java geek now converted to Scala. Throughout his career at technology companies like Alcatel-Lucent, Bell Labs and Sony, he has mostly worked on the interaction between back-end services and devices, which has now converged in his IoT-focused work at Virdata (http://www.virdata.com/).
