Detailed agenda and summaries to follow. General agenda:
- 6:00 - 6:30 - Socialize over food and beer(s)
- 6:30 - 7:00 - The Changing Big Data Landscape - empowering the business user with analytics-driven insight
- 7:00 - 7:30 - Oozie: Towards a Scalable Workflow Scheduling System for Hadoop
The Changing Big Data Landscape - empowering the business user with analytics-driven insight
The exponential growth of structured and unstructured data has overwhelmed traditional BI solutions. Data analysts, managers and executives want to easily correlate the new unstructured data with legacy data sitting on tape or on disk platters, gaining complete insight into customer behavior, business and IT operations without having to worry about the economics.
This session will discuss:
- the evolution of Big Data and the challenges it presents to business users
- the role Hadoop and NoSQL technologies play today
- the challenges that result from the growth of this data and possible solutions.
Presenter: Matthew Schumpert, Datameer
Oozie: Towards a Scalable Workflow Scheduling System for Hadoop
During the past three years Oozie has become the de facto workflow scheduling system for Hadoop. Oozie has proven itself as a scalable, secure and multi-tenant service. Oozie stably processes more than 45% of the jobs run across more than 25 Hadoop clusters at Yahoo. At the same time, adoption in other enterprises has increased substantially since Oozie was contributed to the Apache community. We attribute these achievements to design decisions described in a paper that was selected for presentation at a workshop during the ACM/SIGMOD conference. This presentation covers the key architectural design choices described in the paper. Operational metrics will be used to illustrate production experience at Yahoo, and we will also include a quick tutorial.
Presenters: Mohammad Islam and Virag Kothari, Yahoo!
Yahoo Campus Map:
Location on Wikimapia: