Note: Building closes at 7pm; lobby doors will lock at that time!
• 6pm-630pm: Food & beverages
• 630pm-7pm: Query Planning in Impala (Alex Behm)
Alex will present an overview of query compilation in Impala with a focus on query planning. Using example queries and plans, he will explain the most important query-optimization techniques implemented in Impala. Attendees will leave with a reasonable understanding of the importance of query planning, how statistics affect planning outcomes, and how Impala's planning process works.
• 7pm-730pm: Overview of Admission Control in Impala (Matt Jacobs)
Impala 1.3 introduced a new admission control mechanism to avoid oversubscription of cluster resources in times of heavy load. The admission controller was designed to be fast and resilient to failures, so it is decentralized: each Impala daemon makes local decisions and disseminates load information via the Impala statestore. This talk will describe how the admission controller works and what to consider when tuning it for your workload.
• 730pm-8pm: Python UDFs in Impala (Uri Laserson)
Impala provides the ability to easily analyze large, distributed data sets. This talk will cover the impyla package, which aims to make data science easier with Impala by integrating with Python. The impyla package currently supports programmatically interacting with Impala, running distributed machine learning in Impala, and compiling Python UDFs into assembly instructions via LLVM.