- 6:00 - 6:30 - Socialize over food and beer(s)
- 6:30 - 7:00 - Apache HBase 0.96: An Overview of What's New
- 7:00 - 7:30 - Apache Oozie 4.x: An Overview of What's New
- 7:30 - 8:00 - Achieve Real Time Hadoop Performance with In-Memory Acceleration
Session I (6:30 - 7:00 PM) - Apache HBase 0.96: An Overview of What's New
Apache HBase 0.96 is the next major version of Apache HBase and brings several new features. It is nicknamed the "Singularity" because you will have to stop and start your cluster to upgrade to 0.96. 0.96 requires at least Apache Hadoop 1.0.0 and is also supported on Hadoop 2.0.0. 0.96 uses protobufs throughout: all of its serializations to ZooKeeper, to the filesystem, and over RPC are protobufs. It runs on JDK7. Metrics have been edited and converted to use Hadoop Metrics2. New features include HBase Snapshots and PrefixTreeCompression, among others. Stack will provide a high-level overview of what's new in HBase 0.96.
Speaker: Michael Stack, Apache HBase PMC Chair, Apache Hadoop PMC, and Software Engineer, Cloudera
Michael is Chair of the Apache HBase Project Management Committee and a member of the Apache Hadoop Project Management Committee. His first exposure to big data happened over ten years ago while working on web crawlers and large-scale search at the Internet Archive. Michael is a software engineer on the storage team at Cloudera in San Francisco where he spends most of his time working on Apache HBase.
Session II (7:00 - 7:30 PM) - Apache Oozie 4.x: An Overview of What's New
Apache Oozie has come a long way and now accounts for over 2.8 million jobs per month on Yahoo's grid infrastructure. If you are running Hadoop jobs repeatedly and thinking of a smarter way of doing it, Apache Oozie is the answer. Whether you are running complex data transformation jobs chained one after another or a simple daily data copy, Oozie workflows will help you manage these tasks efficiently. Mona will cover the new features introduced in Apache Oozie 4.x, in particular Apache HCatalog Integration, Job Notifications, and SLA Monitoring, for building large-scale and efficient data processing pipelines.
Speaker: Mona Chitnis, Apache Oozie PMC and Committer, and Software Engineer, Yahoo
Mona is an Oozie Committer and PMC member at the Apache Software Foundation, and a distributed systems engineer on the Hadoop team at Yahoo, where she focuses on enabling various Yahoo businesses to build complex workflows on top of Apache Oozie. Mona holds an MS in Computer Science from Georgia Institute of Technology.
Session III (7:30 - 8:00 PM) - Achieve Real Time Hadoop Performance with In-Memory Acceleration
As Apache Hadoop adoption continues to advance, customers are depending more and more on Hadoop for critical tasks, and deploying Hadoop for use cases with more real-time requirements. In this session, we will discuss the desired performance characteristics of such a deployment and the corresponding challenges. Leave with an understanding of how performance-sensitive deployments can be accelerated using In-Memory technologies that merge the Big Data capabilities of Hadoop with the unmatched performance of In-Memory data management.
Speaker: Nikita Ivanov, Founder & CEO, GridGain Systems
Nikita Ivanov is founder and CEO of GridGain Systems, started in 2007 and funded by RTP Ventures and Almaz Capital. Nikita has led GridGain to develop advanced and distributed in-memory data processing technologies: the top Java in-memory computing platform, which is started every 10 seconds around the world today.
Nikita has over 20 years of experience in software application development, building HPC and middleware platforms, and contributing to the efforts of other startups and notable companies including Adaptec, Visa and BEA Systems. Nikita was one of the pioneers in using Java technology for server-side middleware development while working for one of Europe’s largest system integrators in 1996.
He is an active member of the Java middleware community, a contributor to the Java specification, and holds a Master’s degree in Electro Mechanical.