
Doug Cutting on Avro and Todd Lipcon on HBase and Accumulo

Hosted by Ed Kohlwey

Details

Avro: a data format for Big Data

Doug Cutting, Cloudera

The Hadoop-based big data ecosystem is composed of many independent open source projects. Most folks use Hadoop as their big data kernel plus several other components on top, such as HBase, Pig, Hive, Flume or Sqoop.

Avro provides a common data format to maximize interoperability among these components. It is efficient and supports both versioning and complex, dynamic data types.

A Comparison of HBase and Accumulo

Todd Lipcon, Cloudera

On the surface, Apache HBase and Accumulo are both mature implementations of Google's BigTable distributed storage system; however, the two systems have been implemented by independent developer communities with very different use cases in mind. This has resulted in differences both in the underlying implementations and in the feature sets provided to users. This presentation will explore some of the similarities and differences, and summarize key areas in which each project could learn from and benefit the other.

Agenda:

6:00-6:30 - Networking, snacks, refreshments
6:30-7:30 - First presentation
7:30-7:40 - Break and next presentation setup
7:40-8:10 - Second presentation
8:10-8:40 - Moderated discussion on big data technologies

Hadoop-DC
National Press Club
529 14th St NW · Washington, DC