Jan 16, 2013 · 6:30 PM
In this session we'll discuss real-time queries in Hadoop using Cloudera Impala.
Cloudera Impala is an open-source, distributed query execution engine that runs against data stored natively in Apache HDFS and Apache HBase.
You can download the Cloudera Impala open-source code from GitHub
You can read Cloudera Impala documentation here
Some of the topics to be discussed:
How Impala can query data stored in HDFS or Apache HBase in real time, using common SQL statements
How Impala use cases compare with Apache Hive, traditional data warehouse approaches, and MapReduce options
6.30pm Networking + Free Pizza + Free Beer
7pm talks start
"Dive into Cloudera Impala" by Henry Robinson. Henry is a senior engineer at Cloudera where he works on a variety of distributed systems, including most recently Cloudera Impala. Prior to joining Cloudera in 2009, he was a graduate student at Cambridge University, and before that a strategist at Goldman Sachs.
Announcements and beer break
Q&A and more networking
9.30-ish Session ends