Dec 3, 2012 · 5:30 PM
Interested in learning more about the Impala project and how it enables low-latency analytics on Hadoop? We'll be hosting Marcel Kornacker, Cloudera tech lead on the project, who will join us to explain what Impala is and how it works.
Note that this meetup is being held in conjunction with the Chicago Hadoop User Group – please make sure you RSVP only once. Pizza and beverages will be provided by Cloudera. We look forward to seeing you there!
The Cloudera Impala project is, for the first time, making scalable parallel database technology available to the Hadoop community. This technology is the underpinning of Google's Dremel as well as of commercial analytic DBMSs. With Impala, the Hadoop community now has an open-source codebase that allows users to issue low-latency queries against data stored in HDFS and Apache HBase using familiar SQL operators.
This talk will start out with an overview of Impala from the user's perspective, followed by a presentation of Impala's architecture and implementation, and will conclude with a comparison of Impala with Apache Hive, commercial MapReduce alternatives, and traditional data warehouse infrastructure.
Marcel Kornacker is the tech lead at Cloudera for new products and the creator of the Cloudera Impala project. He graduated in 2000 with a PhD in databases from UC Berkeley and then held engineering jobs at a few database-related startup companies. Marcel joined Google in 2003, where he worked on several ads-serving and storage infrastructure projects. His last engagement there was as the tech lead for the distributed query engine component of Google's F1 project.