Cloudera Impala: An Open Source Real-Time Query Engine for Apache Hadoop


Details
We are very excited to announce that Greg Rahn from Cloudera is coming out to talk about Impala, a real-time query system for Hadoop (http://blog.cloudera.com/blog/2012/10/cloudera-impala-real-time-queries-in-apache-hadoop-for-real/). Impala is similar to Drill, which Ted Dunning from MapR presented in February (https://www.meetup.com/Boulder-Denver-Big-Data/events/97573442/). Come learn about what Cloudera is doing to integrate real-time queries into their Hadoop distribution!
Agenda
6:00 – 6:30 - Socialize over food and drink
6:30 – 6:45 - Announcements, Upcoming Events
6:45 – 8:30 - Cloudera Impala: An Open Source Real-Time Query Engine for Apache Hadoop - Greg Rahn
8:30 – ??? - Continued socializing
About the presenter
Greg Rahn - Solutions Architect, Cloudera
Greg Rahn is a Solutions Architect in the Partner Engineering group at Cloudera. He focuses on helping Cloudera's hardware partners optimize their platforms for Hadoop. In addition, he works on performance engineering and benchmarking for the Impala project. Before joining Cloudera, Greg worked for eight years as a database performance engineer at Oracle in the esteemed Real-World Performance Group in the Server Technologies organization.
About the presentation
Cloudera Impala: An Open Source Real-Time Query Engine for Apache Hadoop
For the first time, the Cloudera Impala project is making scalable parallel database technology, the underpinning of Google's Dremel and of commercial analytic DBMSs, available to the Hadoop community. With Impala, the Hadoop community now has an open source codebase that allows users to issue low-latency queries to data stored in HDFS and Apache HBase using familiar SQL operators.
This talk will start out with an overview of Impala from the user's perspective, followed by a presentation of Impala's architecture and implementation, and will conclude with a comparison of Impala with Apache Hive, commercial MapReduce alternatives, and traditional data warehouse infrastructure.
Cloudera Impala download, documentation and VM: https://ccp.cloudera.com/display/SUPPORT/Downloads
Cloudera Impala source code: https://github.com/cloudera/impala
