Introduction to Kudu


Details
6:00 PM - 6:30 PM: Drinks, mingling
6:30 PM - 8:30 PM: Introduction to Kudu
Over the past several years, the Hadoop ecosystem has made great strides in its real-time access capabilities, narrowing the gap with traditional database technologies. With systems such as Impala and Spark, analysts can now run complex queries or jobs over large datasets within a matter of seconds. With systems such as Apache HBase and Apache Phoenix, applications can achieve millisecond-scale random access to arbitrarily sized datasets.
Despite these advances, some important gaps remain that prevent many applications from transitioning to Hadoop-based architectures. Users are often caught between a rock and a hard place: columnar formats such as Apache Parquet offer extremely fast scan rates for analytics, but little to no ability for real-time modification or row-by-row indexed access. Online systems such as HBase offer very fast random access, but their scan rates are too slow for large-scale data warehousing workloads.
This talk will investigate the trade-offs between real-time transactional access and fast analytic performance from the perspective of storage engine internals. It will also describe Kudu, the new addition to the open source Hadoop ecosystem that fills the gap described above, complementing HDFS and HBase to provide a new option to achieve fast scans and fast random access from a single API.
Speakers:
Shravan (Sean) Pabba is a Systems Engineer at Cloudera. He works with Cloudera customers and prospects, helping them architect and build applications using the Cloudera Hadoop distribution. Before Cloudera, Sean worked as a Solutions Architect at various companies including GigaSpaces and IBM, where he was involved in the architecture, design and development of distributed and mainframe applications.
Alan Sanie is a System Engineer at Cloudera, where he works with customers in the Mid-Atlantic region on successfully adopting and implementing Big Data and Hadoop-based technologies. Prior to joining Cloudera, Alan worked in various technical management roles at IBM Software Group in different groups including Rational, Middleware and Watson IoT.
This meetup will be at WEWORK MARKET ST.
1601 Market Street, Philadelphia, PA 19103 (19th floor)
About our sponsor:
WeWork is a community for creators. We transform buildings into
beautiful, collaborative workspaces and provide the infrastructure, services,
events and technology so our members can focus on doing what they love.
WeWork currently has 111 locations in 29 cities across the world with over
70,000 members. Book a tour at wework.com now!
