Introduction to Data Processing with Apache Spark

Hosted By
George A. and Dan S.

Details
In this talk, Dan Swain, a Data Engineer at Pandora, will give an overview of Apache Spark. Spark is an open-source, distributed, general-purpose cluster-computing framework. One of Spark’s greatest strengths is that it provides a consistent programming model across many compute platforms and storage mechanisms. Spark code looks the same whether you’re running it on Hadoop, Kubernetes, AWS, or your laptop.
This talk will include examples in both Scala and Python (PySpark).
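As a rough illustration of the "same code everywhere" point (this is not taken from the talk itself), here is a minimal PySpark sketch. The function names `word_counts` and `main`, the app name, and the sample sentences are all made up for this example, and it assumes the `pyspark` package and a Java runtime are installed.

```python
def word_counts(spark, lines):
    """Count words with the DataFrame API.

    Nothing in here refers to where the job runs; the same function works
    against a local session or a cluster-backed one.
    """
    # Imported inside the function so this file can be loaded without Spark installed.
    from pyspark.sql import functions as F

    # One row per input line, in a single column named "line".
    df = spark.createDataFrame([(line,) for line in lines], ["line"])
    # Split each line on whitespace and emit one row per word.
    words = df.select(F.explode(F.split(F.col("line"), r"\s+")).alias("word"))
    # Aggregate on the executors, then bring the small result back to the driver.
    rows = words.groupBy("word").count().collect()
    return {row["word"]: row["count"] for row in rows}


def main(master="local[*]"):
    """Hypothetical entry point: only the master URL is platform-specific."""
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master(master)            # "local[*]" uses every core on this machine
        .appName("intro-sketch")   # illustrative name only
        .getOrCreate()
    )
    try:
        print(word_counts(spark, ["spark on a laptop", "spark on a cluster"]))
    finally:
        spark.stop()
```

Calling `main()` runs the job on a laptop. In a real deployment the `.master(...)` call is often left out and the target is chosen at launch time instead, for example with `spark-submit --master yarn`, while `word_counts` stays unchanged.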

RocDev
Enel X
400 Meridian Centre Blvd #220 · Rochester, NY