Toronto Apache Spark #9


Details
Building Spark Data Pipelines
Story time! Real-life war stories from Toronto companies about their experiences building Spark pipelines.
Hangout on Air link for streaming/recording
https://plus.google.com/u/3/events/ci6aimeuift2rh39u4i7iseujc4?hl=en
Agenda:
6:30PM to 7:00PM - Opening and networking (refreshments provided)
7:00PM to 7:30PM - Building complex, mission-critical workflows without pulling your hair out
7:35PM to 8:20PM - Building a Recommender System Pipeline with Spark
8:30PM to 9:00PM - Networking
------------------------------------------------------------
Title: Building complex, mission-critical workflows without pulling your hair out
Target audience: Data Scientist, Data Engineer, Data Analyst, DevOps
Level: Intermediate
Speaker: David Wright (https://www.linkedin.com/in/david-wright-8952036) is a former Java and Rails web developer. For the past two and a half years he's worked as a Data Platform Engineer for Shopify.
Shopify uses PySpark to build massive, complex data pipelines that calculate its critical financial numbers. This talk explores the lessons learned and the methodologies developed while building its most difficult and important workflow.
------------------------------------------------------------
Title: Building a Recommender System Pipeline with Spark
Target audience: Data Scientist, Data Engineer, Data Analyst
Level: Intermediate
Speaker: Jorge Escobedo (https://www.linkedin.com/in/jescob) is a co-founder and CTO at Canopy Labs.
I will walk you through how we've leveraged Spark at Canopy Labs to build our recommender system pipeline. The talk will cover all steps of the process: ETL, model building and evaluation.
------------------------------------------------------------
Sponsors: Paytm Labs
http://photos2.meetupstatic.com/photos/event/4/9/8/c/600_441018828.jpeg
Organized by: Sean Glover, Mehrdad Pazooki.
