Google BigQuery for Data Warehousing. Raw to valuable data using Spark.

Hosted By
Kristijan B. and Uli B.
Details

Raw to valuable data using Spark, Parquet and Python - Barry Sheridan, Data Scientist, Tenable

This talk will cover Tenable's approach to converting big, messy datasets into manageable, flat datasets using Spark, Parquet and Python. It walks through the workflow from a compressed, messy dataset to a flat, clean one.

Along the way we will use Python to show:

  • conversion of a raw dataset to Parquet files
  • application of aggregations to Parquet files with Spark
  • example analysis of aggregated output to find valuable information

BigQuery and the evolution of data services at Google - Kirill Evreinov, Solution Engineer, Google

In his presentation, Kirill will cover the evolution of Big Data Services at Google with a focus on BigQuery.

He will contrast BigQuery with traditional data warehouse solutions and finish with a BigQuery demo and a Q&A session.

Tenable are the sponsors of the event, and there will be food available.

Doors open: 18:00

Presentations start: 18:30

Data Engineering and Data Architecture Group (DEDAG)