Introduction to Spark: RDDs, DataFrames and streaming

Hosted by Trond Bjerkestrand

Details

On Tuesday, April 19, we're happy to welcome, for the first time, a true Lightbender. Luc Bourlier from Lightbend (formerly Typesafe) will give an introductory presentation on Spark.

Luc has been working on the JVM since 2002. He started at IBM on the Eclipse project, on the Debugger team, where he wrote the expression evaluation engine. After a few other Eclipse projects, he moved to TomTom to rebuild their data distribution platform for over-the-air services. He joined Typesafe in 2011 to work on the Eclipse plugin for Scala, then switched to the Spark team, with a focus on deployment and interaction with other frameworks.

This presentation will introduce the basic elements of Spark through a live coding session. Luc will first cover RDDs, the original abstraction in Spark, and then the new DataFrame / Dataset abstraction introduced in the latest version of the framework. He will then show how these abstractions can be used for fast data processing with streaming support.

----

An introduction to the basic elements of the Spark framework, through a presentation based on live coding. The presentation covers Spark's original abstraction, RDDs, then describes DataFrame/Dataset, the new abstraction added in recent versions of Spark. Finally, it shows how these abstractions are used for fast data processing with streaming support.

Scala Romandie
4 Rue de la Prairie · Genève