Hive is a scalable data warehouse infrastructure built on top of Hadoop. It provides tools for easy data ETL, a mechanism for imposing structure on the data, and the ability to query and analyze large data sets stored in Hadoop files. Hive defines a simple SQL-like query language, called HiveQL, that lets users familiar with SQL query the data. The language also lets programmers who are familiar with the MapReduce framework plug in their own custom mappers and reducers to perform more sophisticated analysis that the built-in capabilities of the language may not support. The biggest Hive deployment to date is the silver cluster at Facebook Inc, which consists of 1100 nodes, each with 8 CPU cores and twelve 1 TB disks. This amounts to a cluster of 8800 CPU cores and about 13 PB of raw storage. Hive does not require that data be read or written in a "Hive format"; there is no such thing. Hive works equally well on Thrift, control-delimited, or your own specialized data formats. Please see File Format and SerDe in the Developer Guide for details. http://wiki.apache.org/hadoop/Hive
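
As a rough illustration of the custom-mapper idea described above, here is a minimal Python sketch of a script that Hive could stream rows through. It assumes rows arrive on standard input as tab-separated lines, one row per line. The column names (`user_id`, `page_url`), the host-extraction logic, and the function names are hypothetical and chosen only for this example.

```python
import sys


def map_row(line):
    """Turn one tab-separated input row into one tab-separated output row."""
    # Assumption: each row is a single line with exactly two
    # tab-separated columns, user_id and page_url (hypothetical names).
    user_id, page_url = line.rstrip("\n").split("\t")
    # Hypothetical logic: keep only the host part of a full URL,
    # and pass the value through unchanged if it has no scheme.
    host = page_url.split("/")[2] if "://" in page_url else page_url
    return f"{user_id}\t{host}"


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read rows from stdin and write one mapped row per non-empty line."""
    for line in stdin:
        if line.strip():
            stdout.write(map_row(line) + "\n")
```

A real script would end with a call to `main()`. Saved as a hypothetical `host_mapper.py`, it could then be invoked from HiveQL along the lines of `ADD FILE host_mapper.py;` followed by `SELECT TRANSFORM(user_id, page_url) USING 'python host_mapper.py' AS (user_id, host) FROM page_views;`, where the table `page_views` and its columns are likewise assumptions.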

Founded Feb 28, 2010
