March Presentation Night


Details
Thank you to IBM for hosting our March event!
Agenda
- 6:00 - 6:30: Food + mingling
- 6:30 - 6:40: Opening remarks
- 6:40 - 7:25: Spark for Beginners
- 7:25 - 8:10: Spark Records for Bulletproof Jobs
Feature Talks
- Spark for Beginners - Joseph Kambourakis (IBM)
Intended audience: People new to Spark
Abstract: This talk introduces Spark to people who haven't used it before. It covers Spark's history and basic structures such as RDDs and DataFrames, walks through a few coding examples, and points to resources for learning more.
- Spark Records for Bulletproof Jobs - Simeon Simeonov (Swoop)
Intended audience: People prototyping with Spark or running it in production
Abstract: Big data never stops and neither should your Spark jobs. They should not stop when they see invalid input data. They should not stop when there are bugs in your code. They should not stop because of I/O-related problems. Bulletproof jobs not only keep working but also make it easy to identify and address the common problems encountered in large-scale production Spark processing, from data quality to code quality to operational issues. In this talk you will learn how to use Spark Records--a data pattern and associated open-source tooling--to bulletproof your Spark jobs and do root cause analysis 10-100x faster.
