
From Excel via SQL to MapReduce and SparkSQL by Carsten Langer

Hosted By
Till W. and Thomas W.

Details

Please note the new location

Abstract: This presentation is for you if you are comfortable using Excel and basic SQL for BI-style data analysis and now wonder:

How does the hype around MapReduce/Hadoop/Spark help your case?

How would your analysis transform to parallel computation?

Using two simple BI applications as examples, I'll briefly show how Excel and SQL do their calculations under the hood, and the limitations that result. An overview of scaling up vertically vs. scaling out horizontally leads to parallel computation. I'll then show, at a simplified level, how the corresponding MapReduce or SparkSQL calculations work, not with the usual abstract "word count" example but with the same simple BI applications. This analogy should help you better understand the general concept.

The presentation closes with an outlook on what stays the same for you as an analyst and what will change.
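
To give a feel for the analogy the abstract describes, here is a minimal sketch of a BI-style aggregation written as map, shuffle, and reduce steps. It is not taken from the talk: the sales records, the column names, and the function names are all hypothetical, and everything runs in a single process, whereas a real MapReduce or Spark job would spread the map and reduce steps across many machines.

```python
from collections import defaultdict

# Hypothetical sales records (region, revenue); illustrative only, not from the talk.
# The SQL equivalent of the whole pipeline would be roughly:
#   SELECT region, SUM(revenue) FROM sales GROUP BY region
SALES = [
    ("North", 100),
    ("South", 250),
    ("North", 50),
    ("West", 75),
    ("South", 25),
]


def map_phase(records):
    """Emit one (key, value) pair per record: the grouping column and the measure."""
    for region, revenue in records:
        yield region, revenue


def shuffle_phase(pairs):
    """Collect all values that share a key, as the framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Aggregate each key's values independently; this is the part that parallelizes per key."""
    return {key: sum(values) for key, values in groups.items()}


def revenue_by_region(records):
    """Chain the three phases to compute total revenue per region."""
    return reduce_phase(shuffle_phase(map_phase(records)))


print(revenue_by_region(SALES))  # {'North': 150, 'South': 275, 'West': 75}
```

Because each key is reduced independently of the others, the groups could be handed to different workers, which is the property that makes scaling out horizontally possible.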

Bio:
Carsten Langer is a Solution Architect at Nokia, Düsseldorf, working on how to use Cloud and Big Data technologies for Mobile Network and End User Service optimization services.

Düsseldorf Data Science Meetup