BDNSHH April - Cloudera Impala
Details
Marcel Kornacker, architect of Cloudera Impala, will be in Hamburg and is joining us to talk about his project. I anticipate that demand for this talk will be fairly high, so if someone with a larger room wants to volunteer to host, that would be excellent.
Building a Hadoop Data Warehouse with Impala
Impala (http://impala.io) raises the bar for SQL query performance on Apache Hadoop. With Impala, you can query Hadoop data – including SELECT, JOIN, and aggregate functions – in real time to do BI-style analysis. As a result, Impala makes a Hadoop-based enterprise data hub function like an enterprise data warehouse for native Big Data.
In this talk from Impala architect Marcel Kornacker, we will explore:
• How Impala's architecture delivers query speed over Hadoop data that convincingly exceeds not only that of Hive, but also that of a proprietary analytic DBMS running over its own native columnar format.
• The current state of, and roadmap for, Impala's analytic SQL functionality.
• An example configuration and benchmark suite that demonstrate how Impala offers a high level of performance, functionality, and ability to handle a multi-user workload, while retaining Hadoop’s traditional strengths of flexibility and ease of scaling.
About Marcel:
Marcel Kornacker is a tech lead at Cloudera for new product development and the creator of the Cloudera Impala project. Following his graduation in 2000 with a PhD in databases from UC Berkeley, he held engineering positions at several database-related start-up companies. Marcel joined Google in 2003, where he worked on several ad-serving and storage infrastructure projects, then became tech lead for the distributed query engine component of Google’s F1 project.
