Details

Agenda

18:30 - Welcome and introduction

18:45 - Using execution plans to write efficient Spark Code by Michael Johnson

When you first begin tuning Spark performance, it can feel like a mix of trial and error, or a matter of simply scaling up resources until the job completes in the expected timeframe. This often leads to frustration and wasted resources.
This session begins with a brief review of the Spark job execution lifecycle, followed by an exploration of how to use execution plans in Apache Spark to systematically optimise performance. Topics include:
* Inefficient join strategies
* Unnecessary shuffles
* Partitioning and data skew issues
* The impact of the Catalyst optimizer and Adaptive Query Execution
You will leave this session with a practical approach to identifying and fixing performance issues in your Spark workloads.

20:00 - Closing

Related topics

Events in Sandton
Microsoft Azure
SQL
Computer Programming
Open Source
Software Development

Sponsors

Cobalt Analytics

Cobalt Analytics has donated prizes to be given to a lucky attendee.

Microsoft

Microsoft has sponsored the venue and snacks.

CyberLogic

CyberLogic has come on board as a sponsor of this event.
