Apache Drill - a Schema-free SQL Query Engine for Hadoop and NoSQL


Details
SQL is one of the most widely used languages to access, analyze, and manipulate structured data. As Hadoop gains traction within enterprise data architectures across industries, the need for SQL for both structured and loosely-structured data on Hadoop is growing rapidly.
Apache Drill (http://drill.apache.org/) started off with the audacious goal of delivering consistent, millisecond ANSI SQL query capability across a wide range of data formats. At a high level, this translates to two key requirements: schema flexibility and performance.
Apache Drill lets users interact with big data on Hadoop much faster and far more easily using the familiar SQL language. Users no longer depend on central IT teams and DBAs to produce schemas and then maintain them when the structure changes for a few records. Drill alleviates the pain of structuring unstructured data before gaining any insight by providing a simple mechanism to query any dataset on Hadoop, be it flat files, Parquet or JSON files, or HBase tables. This session will give you an overview of Drill and how we use powerful Java libraries such as Janino, ASM, and Netty to achieve great performance and scalability.
Steven Phillips, Senior Software Engineer
Steven Phillips is a Senior Software Engineer for MapR. Steven played an instrumental role in developing the MapR Enterprise Database Edition for Hadoop. Currently, Steven is focused on Apache Drill, where he is a committer. In addition to Hadoop and Apache Drill, he has comprehensive knowledge of and experience with distributed systems, Java, HBase, Linux, and Bash. Steven has degrees in Physics from both Brigham Young University and the University of California, Berkeley.
