May 2016 Meetup

Hosted By
user 1.

Details

Topic - Apache Kudu: New Apache Hadoop Storage for Fast Analytics on Fast Data

Please join us for a meetup to discuss Apache Kudu.

Agenda:
6:00-6:30pm Food, Drinks and Networking
6:30-7:15pm Tech Talk with Q&A
7:15-8:00pm Networking

Tech Talk Description:
If you're building relational, time-series, IoT, or real-time architectures on Hadoop, you will find Apache Kudu an attractive choice. With Kudu, you can build your applications more simply and with fewer moving parts.

Hadoop has become faster and more capable, and it continues to narrow the gap with traditional database technologies. However, for developers who need up-to-the-second analytics on fast-moving data, some important gaps remain that keep many applications from moving to Hadoop-based architectures. Users are often caught between a rock and a hard place: columnar formats such as Apache Parquet offer extremely fast scan rates for analytics, but little to no ability for real-time modification or row-by-row indexed access. Online systems such as HBase offer very fast random access, but their scan rates are too slow for large-scale data warehousing and analytical workloads.

This talk will describe Kudu, a new addition to the open source Hadoop ecosystem with out-of-the-box integration with Apache Spark and Apache Impala. Kudu fills the gap described above, providing a new option for achieving both fast scans and fast random access from a single API.

Speaker:
Ryan Bosshart is a Systems Engineer who leads the field storage specialization team.

DFW Cloudera User Group
Research Now
5800 Tennyson Pkwy # 600 · Plano, TX