June 13, 2013 · 6:45 PM
In this double header we present a practitioner's close-up view of the science and an engineer's close-up view of the design and implementation of a distributed algorithm.
Day in the Life of a Data Scientist - Chris Pouliot
In this session, Netflix analytics leader Chris Pouliot shares his experience building a large team of data scientists at Netflix and what a typical day in the life of a data scientist looks like. From extracting and exploring data to posing good questions about it and matching those questions with the right algorithms, Chris goes through the lifecycle of data science in practice.
Chris built a central, horizontal team for the company that spans all business verticals. He shares insights and stories, covering the pitfalls, the successes, and the impact the team has at Netflix.
Distributed Generalized Linear Modeling (GLM)
In this session, 0xdata engineers Tomas Nykodym & Cliff Click explain how to build a Distributed GLM (logistic and Poisson regression).
Generalized linear modeling (http://en.wikipedia.org/wiki/Generalized_linear_model) is the most popular tool in the hands of a good data scientist. Two powerful mathematical approaches, Stephen Boyd's ADMM and Generalized Gradients, are analyzed along with implementation choices. A live demo and performance comparisons between the two approaches, including applications to Big Data, will be presented. https://twitter.com/hexadata