Technical workshop on Spark SQL

Hosted By
Elizabeth L.

Details

Learn the basics of Spark SQL, one of the most popular components of Apache Spark. The workshop combines briefings with hands-on training on the following topics: a brief overview of Spark, an introduction to the DataFrame API and extracting data using SQL, additional DataFrame functions, and reading data from different sources.

NOTE: Attendees need to bring their own laptops for the exercises.

Agenda
Coffee (15 min.)
Welcome – Women in Big Data, SAP Human Resources (15 min.)
Spark Overview (15 min.)
Spark SQL and the DataFrame API (20 min. lecture, 10 min. setup, 20 min. exercise)
Break (10 min.)
Additional DataFrame functions (10 min. lecture, 10 min. exercise)
Spark SQL with different data sources (15 min. lecture, 15 min. exercise)
Questions (15 min.)
Lunch (30+ min.)

Prerequisites

Basic familiarity with Spark and SQL syntax

Basic familiarity with Scala or Java
Sign up for Databricks Community Edition, whose notebooks will be used for the hands-on training (https://accounts.cloud.databricks.com/registration.html#signup/community)

Instructor bios

Xinh Huynh is a senior software engineer with over ten years of experience developing analytics and data pipelines at scale. She most recently worked on the analytics team at Samsung SDS America, applying Spark and Scala to data munging, exploration, and data pipelines.

Gayathri is a software engineer at Intel with years of experience in application development, technical consulting, developer advocacy, and performance tuning. She is an active contributor to the open source Apache Spark project, especially its Machine Learning Library.

Women in Big Data Meetup
SAP
3412 Hillview Ave · Palo Alto, CA