Developer Meet-up: The Spark SQL Optimizer and External Data Sources API

Hosted by Scott Walent and 2 others

Details

Live-stream Link: https://www.youtube.com/watch?v=GQSNJAzxOr8

This meet-up will be geared towards advanced users of Spark SQL, in particular those who are interested in contributing to the project. I will walk through the optimization workflow, explaining how Spark SQL automatically rewrites query plans to execute more efficiently. I'll also preview the new external data sources API that is being added for 1.2 and show how we can easily add support for reading new types of data.

Michael Armbrust is the lead developer of the Spark SQL project at Databricks. He received his PhD from UC Berkeley in 2013, and was advised by Michael Franklin, David Patterson and Armando Fox. His thesis focused on building systems that allow developers to rapidly build scalable interactive applications, and specifically defined the notion of scale independence. His interests broadly include distributed systems, large-scale structured storage and query optimization.

Bay Area Spark Meetup
Deloitte
555 Mission St. · San Francisco, CA