Toronto Apache Spark 2.0 (TAS 2.0)


Details
RESCHEDULED
Toronto Apache Spark 2.0 (TAS 2.0) is holding its first Meetup of 2019!
Topic: Lessons Learned: Building a high-volume, reliable datalake based on Apache Spark.
Summary: Paytm’s business generates a multitude of raw data, stored in a variety of sources such as RDBMS (MySQL), messaging queues (Kafka), SaaS apps, NoSQL stores, and object storage. Ingesting many millions of records daily into the datalake (for business reporting, ad hoc queries, analytics, and ML apps) with full confidence in SLO guarantees on freshness (timely data), completeness (data quality), and schema evolution (structure changes) presents unique challenges at scale. This talk will explore these challenges and the lessons learned from using Apache Spark as the data processing engine in Paytm’s datalake.
____________
Schedule:
6:00pm - Check-in, Socialize & Eat Pizza
6:15pm - Talk #1
6:45pm - Q&A
7:00pm - Break + More Pizza/Socialization
7:15pm - Talk #2
7:45pm - Q&A
8:00pm - Meetup Conclusion
____________
