Pig with Lipstick (from Netflix)

Details

Jeff Magnusson from Netflix will talk about Lipstick, Netflix's latest open source project.

The event is generously hosted by Samsung R&D (http://www.sisa.samsung.com/).
(this is a FREE event)

Agenda:
6:00pm - 6:30pm : networking, food
6:30pm - 7:30pm : talk
7:30pm - 8:00pm : Q&A and networking

Abstract:
Netflix uses Apache Pig to express many complex data manipulation and analytics workflows. While Pig provides a great level of abstraction between MapReduce and dataflow logic, once scripts reach a sufficient level of complexity, it becomes very difficult to understand how data is being transformed and manipulated across MapReduce jobs. To solve this problem, we created (and recently open sourced) a tool named Lipstick that visualizes and monitors the progress and performance of Pig scripts. We'll discuss the architecture, implementation, and future of Lipstick, as well as various use cases around Netflix (e.g. examples of using Lipstick to improve speed of development and efficiency of resulting scripts).

For further reading about Lipstick, see the Netflix tech blog article: http://techblog.netflix.com/2013/06/introducing-lipstick-on-apache-pig.html or clone it from GitHub and try it yourself: https://github.com/Netflix/Lipstick (the quick start guide in the GitHub wiki takes about 5 minutes to set up and lets you visualize scripts running in Pig's local mode).

About Jeff:
Jeff manages the Data Platform Architecture group at Netflix, where he is helping to build a service-oriented architecture that gives the whole organization easy access to large-scale, cloud-based analytical processing. Prior to Netflix, he received his PhD from the University of Florida, focusing on database system implementation.

ODSC Santa Clara Data Science