
Details

Main Presentation:
Organizations must handle large, varied data sources while still delivering timely insights. This session explores how to design and implement scalable, resilient data pipelines on AWS, from raw data ingestion to business-ready insights. We’ll walk through real-world architecture patterns using AWS-native services such as Kinesis, Glue, EMR, Athena, and Redshift, with a focus on modular design, data governance, cost optimization, and performance.

Attendees will learn how to handle both batch and streaming ingestion, orchestrate complex workflows, and manage data across different lifecycle stages (raw, refined, curated) using scalable storage solutions like Amazon S3. We'll also dive into transformation strategies using PySpark and SQL, techniques for metadata management and schema evolution, and tools for observability and access control.
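As a rough illustration of the raw, refined, and curated lifecycle stages mentioned above, here is a minimal sketch of one way object keys could be laid out per zone. The function names (`zone_key`, `promote`), the Hive-style `year=/month=/day=` partitioning, and the example dataset name are assumptions for illustration only, not the layout the speakers will present.

```python
from datetime import date

# Lifecycle stages named in the abstract, in promotion order.
ZONES = ("raw", "refined", "curated")


def zone_key(zone: str, dataset: str, day: date, filename: str) -> str:
    """Build a date-partitioned object key for one lifecycle zone (hypothetical layout)."""
    if zone not in ZONES:
        raise ValueError(f"unknown zone: {zone!r}")
    return (
        f"{zone}/{dataset}/"
        f"year={day.year:04d}/month={day.month:02d}/day={day.day:02d}/"
        f"{filename}"
    )


def promote(key: str) -> str:
    """Return the key the same object would have in the next zone."""
    zone, rest = key.split("/", 1)
    next_index = ZONES.index(zone) + 1  # raises ValueError for an unknown zone
    if next_index >= len(ZONES):
        raise ValueError("already in the final zone")
    return f"{ZONES[next_index]}/{rest}"


if __name__ == "__main__":
    key = zone_key("raw", "orders", date(2024, 3, 5), "part-0000.json")
    print(key)
    print(promote(key))
```

Keeping the partition path identical across zones means a job that reads one day of raw data can write its output to the matching refined prefix without extra bookkeeping. Real pipelines often change file format between zones (for example JSON to Parquet), which this sketch does not model.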

Whether you're building your first pipeline or optimizing existing systems, this talk offers practical strategies, architecture blueprints, and lessons learned from real-world implementations. By the end of the session, you’ll understand how to build data pipelines on AWS that are robust, cost-effective, and ready to scale, so your teams can move faster from data to decisions.

Lightning Talks:
J Coleman presents "LabCOAT Begins"


