Apache Spark on Kubernetes: What Works, What Breaks, and How to Fix It

Name: Apache Spark on Kubernetes: What Works, What Breaks, and How to Fix It
Start: 2026-03-25T16:00:00+05:30
End: 2026-03-25T18:30:00+05:30
Location: CoWrks, RMZ EcoWorld

Hosted by Onehouse

Onehouse Bengaluru Data Meetup Group

Details

Apache Spark on Kubernetes: What Works, What Breaks, and How to Fix It
(Practical lessons from running and benchmarking real Spark lakehouse workloads)

-----------------------------------------------------------------------------------
Where is the event? COWRKS Ecoworld 4D, 10th Floor, Building 4D, ECOWORLD, Outer Ring Rd, Devarabisanahalli, Bellandur, Bengaluru, Karnataka 560103
Map: https://share.google/1za6kRsnu6SbStkha

How to register? https://docs.google.com/forms/d/1tQECsMnaWWFCYsznH611SkpOezTQHdEY-4KqMBKkJDA/
-----------------------------------------------------------------------------------

Overview:
Apache Spark and Kubernetes are becoming the foundation of modern cloud-native data platforms. Kubernetes makes it easier to deploy and scale Spark clusters, but running Spark workloads efficiently in this environment still requires careful tuning, observability, and deliberate architectural decisions.

In this open learning session, engineers from Onehouse will share practical lessons from several years of working with some of the largest data lake deployments built on Apache Spark and open table formats, spanning large-scale lakehouse workloads and Spark pipelines across a variety of production environments.

We’ll explore topics such as:

How Kubernetes changes the way Spark clusters are deployed, scaled, and managed
Techniques to improve Spark SQL performance and query execution
Optimizing Spark reads and writes across open table formats like Apache Hudi, Apache Iceberg, and Delta Lake
Identifying compute waste and storage bottlenecks in Spark workloads
Lessons learned from benchmarking and analyzing large-scale Spark workloads

We’ll also walk through examples of how Spark job analysis using tools like the Spark History Server can help surface performance issues and generate actionable optimization insights.

Expect architecture discussions, real-world performance benchmarks, and practical demos, along with an open discussion on how to run Spark workloads more effectively in modern cloud environments.
