Apache Kudu 0.10 and Spark SQL

Hosted By
user 1.
Details

Join the Boston CUG for an update on Apache Kudu and more at our next meetup.

Agenda:

• 6:00-6:30pm Food and Drinks

• 6:30-7:15pm Presentation with Q&A

• 7:15-8:00pm Networking

Apache Kudu is a fast new columnar data store for the Hadoop ecosystem, designed to enable high-performance, flexible analytic pipelines. Optimized for lightning-fast scans, Kudu is well-suited for time-series data, machine learning model-building workloads, and data warehousing applications, while also supporting real-time insert, update, and delete operations. Kudu can be queried by multiple SQL engines, including Apache Spark SQL and Apache Impala (incubating). This talk will give an overview of Kudu's design and capabilities, introduce new features from Kudu 0.10 and the upcoming 1.0 release, and highlight the integration between Kudu and Spark SQL.

Speaker:

William Berkeley is a Solutions Architect at Cloudera. He is an active contributor to Apache Kudu, having worked on Kudu's web UI, metrics subsystem, tools, columnar storage format, Apache Flume integration, and Apache Spark integration.

Thank you to Rakuten for hosting!

Boston Cloudera User Group
Rakuten
2 South Station · Boston, MA