(Online) From Raw to Refined: Building Production Data Pipelines That Scale
Details
This is an online event. The Teams link will be published on the right of this page for those who have registered.
18:30: From Raw to Refined: Building Production Data Pipelines That Scale - Pradeep Kalluri
19:55: Prize Draw - Packt eBooks
Session details:
From Raw to Refined: Building Production Data Pipelines That Scale - Pradeep Kalluri
Every organization needs to move data from source systems to analytics platforms, but most teams struggle with reliability at scale. In this talk, I'll share the three-zone architecture pattern I use to build production data pipelines that process terabytes daily while maintaining data quality and operational simplicity.
You'll learn:
- Why the traditional "single pipeline" approach breaks at scale
- How to structure pipelines using Raw, Curated, and Refined zones
- Practical patterns for handling batch and streaming data with Kafka and Spark
- Real incidents and lessons learned from production systems
- Tools and technologies that work (PySpark, Airflow, Snowflake)
This isn't theory: these are battle-tested patterns from years of building data platforms. Whether you're designing your first data pipeline or scaling an existing platform, you'll walk away with actionable techniques you can apply immediately.
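For anyone new to the idea of Raw, Curated, and Refined zones, here is a minimal sketch of how such a layering might look in plain Python. It is not the speaker's implementation. The zone responsibilities shown (raw keeps records untouched, curated validates and de-duplicates, refined aggregates) are one common interpretation assumed for illustration, and the records, field names, and function names are all hypothetical. A production version would more likely use PySpark or a warehouse than in-memory lists.

```python
# Hypothetical sample records; field names are illustrative only.
RAW_EVENTS = [
    {"order_id": "1", "amount": "10.50", "country": "gb"},
    {"order_id": "2", "amount": "oops", "country": "GB"},   # unparseable amount
    {"order_id": "3", "amount": "4.50", "country": "de"},
    {"order_id": "1", "amount": "10.50", "country": "gb"},  # duplicate of order 1
]


def to_raw(events):
    """Raw zone (assumed role): store source records exactly as received."""
    return [dict(e) for e in events]


def to_curated(raw):
    """Curated zone (assumed role): cast types, validate, de-duplicate.

    Returns (clean, rejected) so bad records are kept for inspection
    instead of being silently dropped.
    """
    clean, rejected, seen = [], [], set()
    for rec in raw:
        try:
            order_id = int(rec["order_id"])
            amount = float(rec["amount"])
            country = rec["country"].upper()
        except (KeyError, ValueError):
            rejected.append(rec)
            continue
        if order_id in seen:
            continue  # skip duplicates, keeping the first occurrence
        seen.add(order_id)
        clean.append({"order_id": order_id, "amount": amount, "country": country})
    return clean, rejected


def to_refined(curated):
    """Refined zone (assumed role): aggregate into an analytics-ready shape."""
    totals = {}
    for rec in curated:
        totals[rec["country"]] = totals.get(rec["country"], 0.0) + rec["amount"]
    return totals


raw = to_raw(RAW_EVENTS)
curated, rejected = to_curated(raw)
refined = to_refined(curated)
print(refined)
```

Because each zone is a separate step with its own output, a failure in curation can be replayed from the untouched raw copy without going back to the source system.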
Speaker:
Pradeep Kalluri
Data Engineer | NatWest | Building Scalable Data Platforms
Data Engineer with 3+ years of experience building production data platforms at NatWest, Accenture, and Capgemini. Specialized in cloud-native architectures, real-time processing with Kafka and Spark, and data quality frameworks. Published technical writer on Medium, sharing practical lessons from production systems. Passionate about making data platforms reliable and trustworthy.


