
Season premiere with Reynold Xin, Co-founder & Chief Architect at Databricks

Hosted by Ferran Galí i Reniu and 2 others

Details

It is a great pleasure to have Reynold Xin (http://rxin.github.io/) with us in the first talk of this new season. He is not only co-founder and Chief Architect at Databricks, but he is also the most active contributor (https://github.com/apache/spark/graphs/contributors) of Apache Spark. We look forward to meeting you all there!

The schedule is as follows:

• 18:30 to 19:00 - Reception at the entrance

• 19:00 to 20:00 - Reynold Xin’s talk

• 20:00 to 20:30 - Networking & Beers

Title:

A behind the scenes look into Spark's API and engine evolutions

Abstract:

Apache Spark is the most popular open source project in big data. While many users initially came to Spark for its performance, they stayed for the expressiveness of the APIs and the ease of use of the engine.

In this talk, I will look back at the history of data processing software, from file systems, hierarchical databases, relational databases, and big data systems (e.g. MapReduce) to "small data" systems (e.g. R, Python). I will examine the pros and cons of these different systems, the abstractions they provide, and the engines underneath. I will then discuss lessons we can learn from this evolution and how Spark is developed in this context, and offer a peek into the future of Spark.

Bio:

Reynold Xin is a co-founder and Chief Architect at Databricks, where he oversees the company's Spark development. He was the release manager for Spark's 2.0 release, and the driver behind most of the major recent changes in Spark, e.g. DataFrame API, Project Tungsten. Prior to Databricks, he was pursuing PhD research at the UC Berkeley AMPLab, where he worked on large-scale data processing.

Barcelona Spark Meetup
Casa Convalescència
c/Sant Antoni Maria Claret 171 · Barcelona