Details

Raw to valuable data using Spark, Parquet and Python - Barry Sheridan, Data Scientist, Tenable

This talk will cover Tenable's approach to converting big, messy datasets into manageable, flat datasets using Spark, Parquet and Python. It will walk through the workflow from a compressed, messy dataset to a flat, clean one.

Along the way we will use Python to show:

  • conversion of a raw dataset to Parquet files
  • application of aggregations to Parquet files with Spark
  • example analysis of aggregated output to find valuable information

BigQuery and the evolution of data services at Google - Kirill Evreinov, Solution Engineer, Google

In his presentation, Kirill will cover the evolution of Big Data Services at Google with a focus on BigQuery.

He will contrast BigQuery with traditional data warehouse solutions and finish with a BigQuery demo and a Q&A session.

Tenable are the sponsors of the event, and there will be food available.

Doors open: 18:00

Presentations start: 18:30
