
40th Bay Area Hadoop User Group (HUG) Monthly Meetup


  • 6:00 - 6:30 - Socialize over food and beer(s)
  • 6:30 - 7:00 - Apache HBase 0.96: An Overview of What's New
  • 7:00 - 7:30 - Apache Oozie 4.x: An Overview of What's New
  • 7:30 - 8:00 - Achieve Real Time Hadoop Performance with In-Memory Acceleration


Session I (6:30 - 7:00 PM) - Apache HBase 0.96: An Overview of What's New

Apache HBase 0.96 is the next major version of Apache HBase and brings several new features. It is nicknamed the "Singularity" because you will have to stop and restart your cluster to upgrade to 0.96. 0.96 requires at least Apache Hadoop 1.0.0 and is supported on Hadoop 2.0.0 as well. 0.96 uses protobufs throughout: all of its serializations to ZooKeeper, to the filesystem, and over RPC are protobufs. It runs on JDK7. Metrics have been revised and converted to use Hadoop Metrics2. It also adds HBase Snapshots, PrefixTreeCompression, and more. Stack will provide a high-level overview of what's new in HBase 0.96.

Speaker: Michael Stack, Apache HBase PMC Chair, Apache Hadoop PMC, and Software Engineer, Cloudera


Michael is Chair of the Apache HBase Project Management Committee and a member of the Apache Hadoop Project Management Committee. His first exposure to big data happened over ten years ago while working on web crawlers and large-scale search at the Internet Archive. Michael is a software engineer on the storage team at Cloudera in San Francisco where he spends most of his time working on Apache HBase.

Session II (7:00 - 7:30 PM) - Apache Oozie 4.x: An Overview of What's New

Apache Oozie has come a long way and now accounts for over 2.8 million jobs per month on Yahoo's grid infrastructure. If you run Hadoop jobs repeatedly and are looking for a smarter way of doing it, Apache Oozie is the answer. Whether you are running complex data transformation jobs chained one after another or a simple daily data copy, Oozie workflows will help you manage these tasks efficiently. Mona will cover the new features introduced in Apache Oozie 4.x, in particular Apache HCatalog Integration, Job Notifications, and SLA Monitoring, for building large-scale and efficient data processing pipelines.

Speaker: Mona Chitnis, Apache Oozie PMC and Committer, and Software Engineer, Yahoo


Mona is an Oozie Committer and PMC member at the Apache Software Foundation, and a distributed systems engineer on the Hadoop team at Yahoo, where she focuses on enabling various Yahoo businesses to build complex workflows on top of Apache Oozie. Mona holds an MS in Computer Science from Georgia Institute of Technology.


Session III (7:30 - 8:00 PM) - Achieve Real Time Hadoop Performance with In-Memory Acceleration

As Apache Hadoop adoption continues to advance, customers are depending more and more on Hadoop for critical tasks, and deploying Hadoop for use cases with more real-time requirements. In this session, we will discuss the desired performance characteristics of such a deployment and the corresponding challenges. Leave with an understanding of how performance-sensitive deployments can be accelerated using In-Memory technologies that merge the Big Data capabilities of Hadoop with the unmatched performance of In-Memory data management.

Speaker: Nikita Ivanov, Founder & CEO, GridGain Systems


Nikita Ivanov is founder and CEO of GridGain Systems, started in 2007 and funded by RTP Ventures and Almaz Capital. Nikita has led GridGain to develop advanced distributed in-memory data processing technologies, the top Java in-memory computing platform, which is started every 10 seconds around the world today.

Nikita has over 20 years of experience in software application development, building HPC and middleware platforms, and contributing to the efforts of other startups and notable companies including Adaptec, Visa and BEA Systems. Nikita was one of the pioneers in using Java technology for server-side middleware development while working for one of Europe’s largest system integrators in 1996.

He is an active member of the Java middleware community, a contributor to the Java specification, and holds a Master’s degree in Electro Mechanical.


Yahoo Campus Map:

Detail map


Location on Wikimapia:[masked]&lon=[masked]&z=18&l=0&m=b&search=yahoo



  • Robert K.

    Was this recorded? If so, where can I find the video?

    October 29, 2013

  • Jacky L.

    I feel like the presenters are presenting "Release Notes"

    October 16, 2013

  • Alina G.

    Hey guys! We are looking for a Data Scientist for a client in San Francisco. Preferably a PhD with Hadoop and big data experience. Must have advanced SQL skills. It's a permanent position for a client in Downtown SF with a salary range of $160K-$180K and a sign-on bonus. Email me at [masked] if interested.

    October 15, 2013

  • A former member


    October 13, 2013

  • Deepan C.

    When will this happen in Bangalore, India? I attended one in 2011 at IISc Bangalore and am interested in attending again.

    September 15, 2013

    • Linnsey N.

      You should start one up.

      1 · September 18, 2013

  • Gerald W. facilitates working with "big data"

    April 11, 2013

  • A former member

    10/16/2013 ???

    November 15, 2012

Our Sponsors

  • Yahoo

    Free admission, Space, Pizza and Beer
