"To and Fro from Redshift: Extending our workflow service for use cases beyond ETL"
Coursera is an online education startup with over 19 million learners across the globe. At Coursera we use Redshift as our primary data warehouse because it provides a standard SQL interface and fast, reliable performance. We use our open-source framework, Dataduct, to move data to and from Redshift. In this talk we'll cover the workflow service at Coursera and how it is now being used for use cases beyond ETL, such as machine learning, predictions, and bulk loading into Cassandra.
Sourabh Bajaj is a software engineer on the Analytics team at Coursera. There he spends most of his time working on analytics infrastructure, ranging from warehousing and notification systems to recommendations. His primary interests are machine learning and distributed systems. Prior to Coursera he was a graduate student at Georgia Tech.
Birds of a Feather Discussions:
We'll break up into groups to discuss topics of interest.
Location and Logistics:
The San Francisco AWS Pop-up Loft is located at 925 Market Street, near Union Square, and is accessible by all major bus and trolley lines and BART. Parking can be difficult, so we recommend taking public transit or parking at the Fifth & Mission / Yerba Buena Parking Garage (http://www.fifthandmission.com/home.htm). Several nearby hotels also offer paid valet parking.