
Details

Please join us 🤝 to learn more about Apache Spark™, Spark Connect, and Spark ML at NVIDIA.

📅 Date: October 29, 2025
⏰ Time: 9:30 AM - 10:30 AM PST (45min talk, then Q&A)
📍 Location: online (live streaming to LinkedIn, X & YouTube)

Agenda:

  • Welcome and Introductions
  • Talk: GPU Accelerated Apache Spark™ Connect: NVIDIA Accelerator for Spark SQL and MLlib
  • Q&A

Talk: GPU Accelerated Apache Spark™ Connect: NVIDIA Accelerator for Spark SQL and MLlib

Abstract:
Spark Connect, first included in Apache Spark™ 3.4 and recently extended to MLlib in Spark 4.0+, introduced a new way to run Spark applications over a gRPC protocol. This has many benefits, including easier adoption for non-JVM clients, version independence from applications, and increased stability and security of the associated Spark clusters.

In this talk, we will demonstrate how the recent Spark Connect extension for ML, together with Spark SQL’s existing plugin interface, can be used with NVIDIA's open source GPU-accelerated plugins for ML and SQL to enable end-to-end GPU acceleration of Spark applications over Spark Connect with no code changes, with up to 9x performance at an 80% cost reduction.

We will introduce a working pattern for Spark Connect with accelerated ETL and ML for use in lakehouses. We will discuss how such an architecture can be used in practice and provide a few industry use cases.

Apache Spark
Open Source
Nvidia
