This HUG will be about the upcoming versions, how to insert 1 million rows per second, HBase benchmarking, and how HBase is used in a hot new startup. Each talk will be 15-20 minutes long.
1) HBase[masked] and[masked] by Jean-Daniel Cryans, StumbleUpon
Better response time to admin commands.
New EC2 scripts.
New contribution for indexed HBase.
Better overall reliability.
Integration with the fully functional HDFS appends (no data loss).
Updated Thrift API.
Multi-datacenter master/slave replication.
Master re-architecture (depending on Hadoop's date of release).
2) How to get 1M/s by Ryan Rawson, StumbleUpon
How do you import 12 billion rows in under 48 hours, with a peak of 1 million rows inserted per second?
Learn about the newest code additions pioneered at SU, along with some practical tips for high-volume imports using the HTable API (no MapReduce jobs or Ruby scripts!).
3) Experiences Benchmarking HBase for Serving Workloads by Adam Silberstein, Yahoo!
I will discuss my group's experience building YCSB, the Yahoo! Cloud Serving Benchmark. Our goal is to provide a tool for evaluating different systems with the same serving workloads, enabling apples-to-apples comparisons. I will discuss my experiences tuning and running experiments on HBase, as well as show results comparing HBase to other systems.
4) HBase inside the Search Ecosystem by Bradford Stephens, Drawn to Scale
My talk will be about the importance of HBase in the real-time Search ecosystem that Drawn to Scale built. HBase is an essential component of our architecture, serving as the central data repository that powers Search. The talk will cover how HBase is used in this context and the benefits we get from it.
We're also interested in hearing from the community. If you have a project that uses HBase and would like to present it, please contact the organizers.
This meetup is sponsored by StumbleUpon. Snacks and soft drinks will be served.