Apache Spark Meetup @ Workday, SF

Hosted by Jules S. Damji

Details

NOTE: When you arrive, take the elevator to the 17th floor and check in at reception. Also, for security reasons and because alcohol will be served on the premises, you may be asked to show your ID.

Let’s welcome and kick off the New Year with our first Bay Area Apache Spark Meetup of 2017!

Workday (http://www.workday.com) and Databricks (http://databricks.com) present an evening of Apache Spark (http://spark.apache.org) tech talks at the Workday office in San Francisco. We will have two technical talks, swag, and refreshments.

Thanks to Workday Inc (https://www.workday.com/) for hosting and sponsoring this meetup.

Agenda:

6:30 - 7:00 pm Mingling & Refreshments

7:00 - 7:05 pm Introductions

7:05 - 7:45 pm Workday: Tech Talk - 1

7:45 - 7:50 pm Break

7:50 - 8:30 pm Databricks: Tech Talk - 2

8:30 - 8:45 pm Mingling

Workday: Tech Talk - 1: Building a modern data discovery and BI platform using Apache Spark and Catalyst

Abstract:

Traditionally, enterprise companies have had to buy and integrate several products, ranging from ETL tools to columnar databases to visualization tools, in order to support data analytics for business analysts. This integration work lengthens the time to insight. Workday's analytics group (previously Platfora) has developed a modern data discovery and BI platform that allows business users to work with raw data and drill down into it using an iterative, interactive approach.

In this talk, we will share how we enhanced Apache Spark by implementing several distributed data processing techniques. We will also share how we have used the enhanced version of Spark in different capacities. These include: 1) a modern interactive data preparation engine that provides quick examples over a complex transformation pipeline, 2) an engine that materializes the transformation pipelines into lenses (data marts) that are optimized for in-memory processing, and 3) an analytic query engine that uses the lenses for interactive and ad hoc data exploration.

Bio: Kevin Beyer is a VP of Engineering for Analytics at Workday and the former CTO of Platfora in San Mateo, CA. Prior to joining Platfora, Kevin was a Research Staff Member at the IBM Almaden Research Center, where he investigated systems and languages for databases and big data. Kevin received his Ph.D. from the University of Wisconsin-Madison, where he studied database and business intelligence systems.

Databricks: Tech Talk - 2: Spark SQL: A Compiler from Queries to RDDs

Abstract:

Spark SQL, a module for processing structured data in Spark, is one of the fastest SQL-on-Hadoop systems in the world. This talk will dive into the technical details of Spark SQL, spanning the entire lifecycle of a query execution. The audience will walk away with a deeper understanding of how Spark analyzes, optimizes, plans, and executes a user's query.

Bio: Sameer Agarwal is a Software Engineer at Databricks working on Spark core and Spark SQL. Previously, he received his Ph.D. in Databases from UC Berkeley's AMPLab, where he worked on BlinkDB, an approximate query engine for Spark.

Bay Area Spark Meetup
Workday Inc
160 Spear St Ste 1750 · San Francisco, CA