Google BigQuery for Data Warehousing. Raw to valuable data using Spark.


Details
Raw to valuable data using Spark, Parquet and Python - Barry Sheridan, Data Scientist, Tenable
This talk will cover Tenable's approach to converting big, messy datasets into manageable, flat datasets using Spark, Parquet and Python. It walks through the workflow from a compressed, messy dataset to a flat, clean one.
Along the way we will use Python to show:
- conversion of a raw dataset to Parquet files
- application of aggregations to Parquet files with Spark
- example analysis of aggregated output to find valuable information
BigQuery and the evolution of data services at Google - Kirill Evreinov, Solution Engineer, Google
In his presentation, Kirill will cover the evolution of big data services at Google, with a focus on BigQuery.
He will contrast BigQuery with traditional data warehouse solutions and finish with a BigQuery demo and a Q&A session.
Tenable are sponsoring the event, and food will be available.
Doors open: 18:00
Presentations start: 18:30
