
Bay Area Hadoop User Group (HUG) October Meetup

October 2011 HUG Agenda:

  • 6:00 - 6:30 - Socialize over food and beer(s)
  • 6:30 - 7:00 - Fail-Proofing Hadoop Clusters with Automatic Service Failover
  • 7:00 - 7:30 - Incremental Processing in Hadoop
  • 7:30 - 8:15 - Dodd-Frank Financial Regulations and Hadoop

Fail-Proofing Hadoop Clusters with Automatic Service Failover: With the increased use of Hadoop comes an increased need for critical safeguards, especially in the financial industry, where any data loss could mean millions in penalties.

What happens when part of a Hadoop cluster goes down? How do Hadoop-based solutions for the financial industry cope with NameNode failures? We will share the failover issues we've encountered and best practices for continuous health monitoring, with a focus on the needs of the financial industry.

In addition, we will cover ZooKeeper-based failover for NameNode and other related SPOF services (e.g., JobTracker, Oozie, Kerberos).
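ZooKeeper-based failover for a SPOF service like the NameNode is commonly built on leader election with ephemeral sequential znodes: each candidate creates a numbered ephemeral node, the lowest number is the leader, and when the leader's session dies its node disappears and the next candidate takes over. A minimal sketch of that election logic, simulated in plain Python (the `InMemoryZk` class is a stand-in for a real ZooKeeper ensemble, not its actual API):

```python
# Simulation of ZooKeeper-style leader election via ephemeral
# sequential znodes. InMemoryZk is an illustrative stand-in; it is
# not the real ZooKeeper client API.

class InMemoryZk:
    def __init__(self):
        self._seq = 0
        self._nodes = {}  # znode path -> owning session id

    def create_ephemeral_sequential(self, prefix, session_id):
        # ZooKeeper appends a monotonically increasing sequence number.
        path = f"{prefix}{self._seq:010d}"
        self._seq += 1
        self._nodes[path] = session_id
        return path

    def children(self, prefix):
        return sorted(p for p in self._nodes if p.startswith(prefix))

    def expire_session(self, session_id):
        # ZooKeeper deletes ephemeral znodes when their session dies.
        self._nodes = {p: s for p, s in self._nodes.items()
                       if s != session_id}

def leader(zk, prefix="/election/node-"):
    # The candidate holding the lowest sequence number is the leader.
    nodes = zk.children(prefix)
    return zk._nodes[nodes[0]] if nodes else None

zk = InMemoryZk()
zk.create_ephemeral_sequential("/election/node-", "namenode-1")
zk.create_ephemeral_sequential("/election/node-", "namenode-2")
assert leader(zk) == "namenode-1"   # lowest sequence number wins

zk.expire_session("namenode-1")     # primary NameNode crashes
assert leader(zk) == "namenode-2"   # standby takes over automatically
```

The same pattern generalizes to the other SPOF services mentioned (JobTracker, Oozie): each service instance races to create the election znode, and watchers on the preceding node trigger failover without a thundering herd.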

Presenter: Michael Dalton, Zettaset Inc.

Incremental Processing in Hadoop: A Hadoop cluster offers limited resources in terms of CPU, disk, and network bandwidth. Further, a Hadoop cluster is typically shared among users, with multiple concurrent jobs competing for resources. In such an environment, a Map-Reduce job whose input data scales to petabytes is bound to consume excessive resources, incur long delays, negatively impact other jobs, and reduce overall cluster throughput.

I present an extension of the Map-Reduce execution model (as implemented in Hadoop) that allows incremental processing, wherein a job can add input on the fly as required. The job may begin small, choosing to process a limited subset of the data. As data flows through the system, useful statistics become available that help decide what additional input (if any) needs to be processed. Job expansion is governed by user-defined policies that dictate the job's growth according to the resources available on the cluster. I share encouraging results from experimental evaluation under single- and multi-user workloads.
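The policy-driven expansion described above can be pictured as a loop: process a small input subset, gather statistics, and let a user-defined policy decide whether to add more input. A toy sketch of that control flow (all names here, `process_split`, `run_incremental`, the policy signature, are illustrative assumptions, not the author's actual implementation):

```python
# Toy sketch of incremental job expansion: start with a few input
# splits, then add more only while a user-defined policy allows it.
# Names are illustrative, not taken from the presented system.

def process_split(split):
    # Stand-in for a Map-Reduce pass over one input split:
    # here we just count records matching a predicate (even numbers).
    return sum(1 for record in split if record % 2 == 0)

def run_incremental(splits, policy):
    processed, hits = 0, 0
    for split in splits:
        hits += process_split(split)
        processed += 1
        # Consult the policy after each split; stop expanding the job
        # once the collected statistics look sufficient.
        if not policy(processed, hits):
            break
    return processed, hits

# Policy: keep growing while we have fewer than 10 matching records
# and have processed fewer than 3 splits.
policy = lambda n_splits, hits: hits < 10 and n_splits < 3

splits = [list(range(10)), list(range(10, 20)), list(range(20, 30))]
n, hits = run_incremental(splits, policy)
assert (n, hits) == (2, 10)  # stopped early: enough evidence after 2 splits
```

The point of the sketch is the shape of the decision, not the statistic itself: in a real cluster the policy would weigh observed selectivity and current resource availability before scheduling further input splits.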

Presenter: Raman Grover, PhD Student UC Irvine

Dodd-Frank Financial Regulations and Hadoop: The Dodd-Frank Act signifies the biggest US regulatory change in several decades. According to experts, Dodd-Frank will have a substantial influence over an estimated 8,500 investment managers and all 10–12 US exchanges and alternative execution networks.

This presentation describes an implementation perspective on the new regulations that specifically relate to the central clearing of OTC derivatives, and the repercussions for confirmation/settlement flows, real-time reporting, and risk management. A trading platform and repository will need direct access to exchanges. This eliminates layers of risk by removing redundant data keying and duplication. Straight-through processing facilitates integration of front-to-back office systems and has the additional benefit of helping prevent illegal market-manipulation practices by providing the necessary audit trail. Big-data analytics with cloud-based Hadoop, HBase, and Hive, along with BI tools, will be necessary for straight-through processing and real-time reporting.

Presenters: Shyam Sarkar and Suvradeep Rudra, AyushNet

Yahoo Campus Map:



  • RD C.

    Great Hosts @ Yahoo. Speakers were great and turnout was amazing.

    October 20, 2011

  • Vijay B.

    It was my first meetup and I am sure that I am not going to miss one now.
    I am a student at SJSU and my master's project targets solving NameNode failover in Hadoop. The first presentation, by Zettaset, gave us the necessary impetus to decide on our approach. Thanks for arranging such a nice event. Thanks!

    October 20, 2011

  • KC L.

    Excellent topics! Will there be videos of these great speakers taking their precious time to expand our knowledge on Hadoop? If so, please send me the links. Thank you.

    October 20, 2011

  • Rafael

    Great topics, all well presented. Pizza, beer, and three topics on the same night are a bit of a stretch though.

    October 20, 2011

  • A former member

    It was my first time and it was really worth it.

    Thanks to Yahoooooooooooooooo!!!!!!!

    October 20, 2011

  • A former member

    Great show!

    October 20, 2011

Our Sponsors

  • Yahoo

    Free admission, Space, Pizza and Beer
