Hulu viewers generate a tremendous amount of data: our users watch over 400 million videos and 2 billion advertisements a month. Processing and analyzing that data is critical to our business, whether for deciding what content to invest in or for convincing external partners of Hulu's superiority as an advertising platform. In this presentation we will provide an overview of our entire data platform, from collecting and storing the raw event data to transforming it into a relational structure and performing analysis. We will describe how, and for what purpose, we use various technologies in the Hadoop ecosystem, such as MapReduce, HBase, and Hive. The key focus of the talk will be to describe how data flows through our pipeline and how we have built a powerful toolchain, both on top of and around Hadoop, to suit our business needs. We will also compare and contrast our methods with those we have seen adopted by other companies seeking to perform similar tasks.
Prasan Samtani is a software developer at Hulu working on the data platform team, which focuses on building components on top of Hadoop to enable data ingestion, processing, job scheduling, and preparation for analysis. Previously, he was at Alelo, a company designing virtual humans for language and culture training. His interests are in distributed systems, high-level languages, and artificial intelligence, and he is a computer history aficionado.
Tristan Reid is a senior software developer at Hulu leading the metrics and reporting tools (MART) team, which focuses on building a toolchain around our data platform to enable easy reporting, ingestion and monitoring. Prior to Hulu, Tristan was VP of Solution Design at Ares Mgmt, leading a team building research tools for investment professionals. He has taught software development courses for IBM, BEA Systems and others in the US, Europe and Asia. Previously, Tristan built risk management and data analysis tools at Capital Research as a Quant Research Associate. Before his career in finance, he participated in a number of start-ups, both as a resource and as principal.