Advanced Data Pipeline Optimization

Details
Description:
In the dynamic world of data engineering, the ability to optimize and orchestrate complex data workflows is paramount. This session offers a deep dive into advanced Apache Spark capabilities and sophisticated methods for optimizing data pipelines.
Join us for an engaging session that explores the latest advancements in data engineering, providing practical insights to help you tackle complex data challenges with confidence.
Agenda:
* Introduction
* Advanced Apache Spark Techniques
* Data Modelling and Transformation Concepts
* Optimizing Data Pipelines
* Discussion
About the Speaker:
Vignesh is a seasoned engineer with extensive experience across diverse business domains such as investment management, credit and enterprise blockchain applications. With a solid background in building large-scale data platforms, Vignesh aims to inspire and educate fellow data enthusiasts on the complexities and innovations within the data engineering field.
Pre-requisites for the talk:
- Basic understanding of data engineering concepts, including data pipelines, ETL (Extract, Transform, Load) processes, and data warehousing.
- Basic experience with any data processing framework, such as Apache Spark or Hadoop, and an understanding of core concepts like distributed computing, data shuffling, and partitioning.
- Familiarity with relational database concepts like schema design, indexing, and normalization.
