Analyzing Trafi data with Apache Spark, Zeppelin and d3.js


Details
We will continue our big data journey with a hands-on workshop. The workshop covers a basic workflow from raw data to useful information, using the tools mentioned in the invitation. Our final goal is to estimate passenger car carbon emissions per town and to visualize the result as a choropleth map. The estimate is based on population size, odometer readings and car emissions.
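To make the goal concrete, here is a minimal sketch of one way such an estimate could be put together. It is not the workshop's actual method: the formula (yearly kilometres times CO2 per kilometre, summed per town and divided by population), the function name `emissions_per_capita`, the field layout and all numbers are assumptions invented for illustration, and plain Python stands in for Spark.

```python
from collections import defaultdict

# Hypothetical sample records: (town, kilometres driven per year, CO2 in g/km).
# These values are invented for illustration and are not Trafi data.
cars = [
    ("Town A", 12000, 150),
    ("Town A", 8000, 120),
    ("Town B", 15000, 180),
]

# Hypothetical population figures, also invented.
population = {"Town A": 4, "Town B": 2}


def emissions_per_capita(cars, population):
    """Return estimated yearly CO2 in kg per inhabitant for each town."""
    total_grams = defaultdict(float)
    for town, km_per_year, co2_g_per_km in cars:
        # Assumed model: yearly emissions of one car = distance * emission rate.
        total_grams[town] += km_per_year * co2_g_per_km
    # Convert grams to kilograms, then normalise by population.
    return {
        town: grams / 1000 / population[town]
        for town, grams in total_grams.items()
    }


result = emissions_per_capita(cars, population)
print(result)
```

In the workshop setting the yearly distance would presumably be derived from odometer readings, and the per-town values would feed the choropleth map.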
Topics
- Downloading Trafi data
- Loading data using the spark-csv library
- Data filtering
- Quality checks
- Basic statistics
- Queries using SQL syntax
- How to visualize data and make dynamic queries inside a Zeppelin notebook
- Parquet files: save and open
- Data joins using DataFrames
- Spark window functions
- Exporting data
- How to make basic charts with d3.js
- gdal, topojson
