Details

We continue our big data journey with a hands-on workshop. The workshop covers a basic workflow from raw data to valuable information, using the tools mentioned in the invitation. Our final goal is to estimate passenger car carbon emissions per town and to visualize the result as a choropleth map. The estimate is based on population size, odometer readings and car emissions.
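
To make the goal concrete, here is a minimal sketch of one way such an estimate could be computed. It is not the workshop's actual method: the record layout, the sample values, the town names and the assumption that yearly mileage equals the odometer reading divided by vehicle age are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical sample records: (town, odometer_km, age_years, co2_g_per_km).
# These values are invented for illustration and are not Trafi data.
cars = [
    ("TownA", 120000, 8, 150),
    ("TownA", 60000, 4, 120),
    ("TownB", 90000, 6, 140),
]

# Hypothetical population sizes per town.
population = {"TownA": 1000, "TownB": 500}


def annual_emissions_kg(odometer_km, age_years, co2_g_per_km):
    """Estimate yearly CO2 in kg, assuming mileage is spread evenly over the car's age."""
    km_per_year = odometer_km / age_years
    return km_per_year * co2_g_per_km / 1000.0


def emissions_per_town(records):
    """Sum the estimated yearly emissions of all cars registered in each town."""
    totals = defaultdict(float)
    for town, odometer_km, age_years, co2_g_per_km in records:
        totals[town] += annual_emissions_kg(odometer_km, age_years, co2_g_per_km)
    return dict(totals)


totals = emissions_per_town(cars)

# Normalize by population so towns of different sizes can share one map scale.
per_capita = {town: kg / population[town] for town, kg in totals.items()}

for town in sorted(totals):
    print(town, totals[town], per_capita[town])
```

In the workshop the same aggregation would more likely be expressed with Spark dataframes or SQL. Plain Python is used here only to keep the arithmetic visible.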

Topics

  1. Downloading Trafi data

  2. Loading data using the spark-csv library

  3. Data filtering

  4. Quality checks

  5. Basic statistics

  6. Queries using SQL syntax

  7. How to visualize data and make dynamic queries inside a Zeppelin notebook

  8. Parquet files: save and open

  9. Data joins using dataframes

  10. Spark Window functions

  11. Export data

  12. How to make basic charts with d3.js

  13. GDAL and TopoJSON