Toronto Apache Spark #14


Details
** RSVP is Closed!! (If you haven't received a confirmation email by now, please be aware that we won't be able to accommodate you at this event.)
** There will be no live broadcast of this event
** We have two talks at this event
-----------------------------------------------------------
Agenda:
6:30PM to 7:00PM - Opening and networking (Refreshments provided)
7:00PM to 7:45PM - Spark in Production Pipelines
by Zeev Lieber, Principal Engineer at Amazon
7:50PM to 8:10PM - Presentation by Sebastian Kun, Sr. Software Development Engineer at Amazon
8:15PM to 8:30PM - Meetup Survey results announcement
8:00PM to 9:00PM - Networking
------------------------------------------------------------
Title: Spark in Production Pipelines by Amazon
Description:
This talk covers high-level lessons learned from adopting Spark in Amazon Supply Chain Forecasting pipelines. We will talk about the technical and architectural challenges of making a Spark-based pipeline work consistently and reliably. We will touch on topics such as store / compute separation, logical vs. physical datasets, materialization, and big metadata.
Target audience: Data Scientist, Data Engineer, Data Analyst, Dev Ops
Level: Intermediate to Advanced
Speaker: Zeev Lieber (https://www.linkedin.com/in/zlieber) is a Principal Engineer at Amazon, working in the Supply Chain Optimization Technologies organization. He is helping build the next generation of big data tools to solve some of the most difficult machine learning problems for Amazon Forecasting. Zeev joins Amazon from Google, where he worked on the Google Fiber software stack as well as Chromium.
-------------------------------------------
Title: Shoehorning Spark: Dragging a legacy workflow system into the 21st century
Description: Not everyone has the luxury of a greenfield project to make use of Spark. This presentation covers how the Amazon SCOT team took a large legacy workflow system with thousands of single-threaded jobs and gradually incorporated Spark into it without requiring a big rewrite or disrupting any existing customers. The result not only sped up job execution but also increased developer productivity and made analysts happier.
Target audience: Software Development Engineer, Data Scientist, Data Engineer, Data Analyst, Dev Ops
Level: Intermediate to Advanced
Speaker: Sebastian Kun (https://ca.linkedin.com/in/sebastiankun)
----------------------------------------------------------
Special Thanks to Vitalii and Amandeep at Amazon for coordination.
Sponsors
http://photos2.meetupstatic.com/photos/event/9/0/c/7/600_455077063.jpeg
