Profiling and Caching Spark applications with Qubole OSS

Details

We're super happy to announce that Tom and Russell, both from Qubole, will come all the way from London to show us a couple of applications that they have just open-sourced to make our lives with Spark easier.

The first one is Sparklens (https://github.com/qubole/sparklens), a profiling tool to detect job scalability issues.
The second one is RubiX (https://github.com/qubole/rubix), a caching framework to speed up cloud applications.

Don't miss it! Next Thursday, 19th of February, 19:00 @ Trovit offices (thanks for the venue, guys!)

Talk #1 Sparklens - Understanding the Scalability Limits of Spark Applications

Sparklens, an OSS profiling tool, provides insights into the scalability limits of a given Spark application. In this talk, we will describe the theory behind Sparklens and how it works. We will talk about how the structure of a Spark application puts important constraints on its scalability. We will also show how to find these structural constraints and how to use them as a guide in solving the performance and scalability problems of Spark applications.

Talk #2 RubiX - An OSS Cache Framework for Cloud Platforms

RubiX is a lightweight data caching framework that can be used by Spark and Presto. RubiX is designed to work with cloud storage systems like AWS S3 and Azure Blob Storage. In this talk, we go into the details behind RubiX and the performance gains seen across Spark and Presto.

Bios:

Russell is a big data solution architect at Qubole. He has over 20 years' experience with big data, starting out with big IBM mainframe systems and progressing to UNIX/Linux clusters and cloud computing. During those 20 years, Russell has held multiple roles, including programmer, DBA, manager, solution architect and commercial user group orchestrator. Technically, Russell is an expert in SQL technologies, data distribution and data integration.

Tom Mack is the RVP of EMEA at Qubole, the world’s leading Big Data Activation Platform. Qubole is revolutionising the way companies activate their data, that is, the process of putting data into active use across their organisations. Tom has been with Qubole since the very beginning, helping leading companies leverage their data for business outcomes. As part of Qubole’s growth, Tom made the move to London in 2017 to set up the EMEA business. Tom has a solid background working for disruptive Data and ML companies for the last 10 years, and holds a Bachelor of Science degree in Finance from Michigan State University.