Analyzing Trafi data with Apache Spark, Zeppelin and d3.js

Hosted by Ossi Bister

Details

We continue our big data journey with a hands-on workshop. It covers a basic workflow from raw data to useful information, using the tools named in the invitation: Apache Spark, Zeppelin and d3.js. The final goal is to estimate passenger car carbon emissions per town and to visualize the result as a choropleth map. The estimate is based on population size, odometer readings and car emissions.
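
To give a feel for the kind of calculation involved, here is a minimal plain-Python sketch. It is not the workshop's actual method. It assumes that each car record carries a home town, a CO2 rating in g/km, an odometer reading and the car's age, that yearly mileage is approximated as odometer reading divided by age, and that population size is used to normalize the town total per inhabitant. The sample records, field order and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample records: (town, co2_g_per_km, odometer_km, age_years).
cars = [
    ("Espoo", 120.0, 150000.0, 10.0),
    ("Espoo", 180.0, 60000.0, 4.0),
    ("Helsinki", 100.0, 50000.0, 5.0),
]

# Hypothetical population figures, kept tiny for readability.
population = {"Espoo": 2, "Helsinki": 1}


def annual_emissions_kg(records):
    """Sum estimated yearly CO2 emissions (kg) per town."""
    totals = defaultdict(float)
    for town, co2_g_per_km, odometer_km, age_years in records:
        if age_years <= 0:
            # Skip records where yearly mileage cannot be estimated.
            continue
        km_per_year = odometer_km / age_years  # assumed mileage model
        totals[town] += km_per_year * co2_g_per_km / 1000.0  # g -> kg
    return dict(totals)


def emissions_per_capita(totals, pop):
    """Normalize town totals by population (one possible use of population size)."""
    return {town: kg / pop[town] for town, kg in totals.items() if pop.get(town)}


totals = annual_emissions_kg(cars)
per_capita = emissions_per_capita(totals, population)
print(totals)
print(per_capita)
```

In the workshop the same kind of aggregation would be expressed with Spark dataframes rather than Python loops; the per-town values are what a choropleth map would then color.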

Topics

  1. Downloading Trafi data

  2. Loading data using the spark-csv library

  3. Data filtering

  4. Quality checks

  5. Basic statistics

  6. Queries using SQL syntax

  7. How to visualize data and make dynamic queries inside a Zeppelin notebook

  8. Parquet files: save and open

  9. Data joins using dataframes

  10. Spark window functions

  11. Export data

  12. How to make basic charts with d3.js

  13. GDAL, TopoJSON

Helsinki Hadoop Meetup
Aalto Design Factory
Betonimiehenkuja 5, 02150 · Espoo