Past Meetup

Riak TS + Kudu

137 people went

Details

Location/Parking info: For those driving, there is street parking and a parking garage across from our building. Please enter through our main entrance (under the skybridge) or via the skybridge from the parking garage. Go down to the first floor and follow the Scalability signs (our team will direct you) to our glass door around the back at Clay and Alaskan Way. Our doors will remain open until 7:45pm to accommodate latecomers.

This meetup focuses on Scalability and technologies to enable handling large amounts of data: Hadoop, HBase, distributed NoSQL databases, and more!

The focus is not only on technology but also on everything surrounding it, including operations, management, business use cases, and more.

We've had great success in the past, and are growing quickly! Previous guests were from Twitter, LinkedIn, Amazon, Cloudant, Microsoft, 10gen/MongoDB, and more.

This month's guests:

Rob Genova is a Sr Architect at Basho, on Riak.

Riak TS

Time series data is any data that has a timestamp, like IoT device data, stocks, commodity prices, tide measurements, solar flare tracking, and health information.

This talk will investigate Riak TS, which has the same distributed systems functionality as Riak KV plus optimizations for high-performance reads and writes of time series data.

+

Dan Burkert is an engineer at Cloudera, on Kudu.

Resolving Transactional Access/Analytic Performance Trade-offs in Hadoop with Kudu

Over the past several years, the Hadoop ecosystem has made great strides in its real-time access capabilities, narrowing the gap compared to traditional database technologies.

Despite these advances, some important gaps remain that prevent many applications from transitioning to Hadoop-based architectures. Users are often caught between a rock and a hard place: columnar formats such as Apache Parquet offer extremely fast scan rates for analytics, but little to no ability for real-time modification or row-by-row indexed access. Online systems such as HBase offer very fast random access, but scan rates that are too slow for large-scale data warehousing workloads.

This talk will investigate the trade-offs between real-time transactional access and fast analytic performance from the perspective of storage engine internals. It will also describe Kudu, the new addition to the open source Hadoop ecosystem with out-of-the-box integration with Apache Spark, which fills the gap described above by providing a new option to achieve fast scans and fast random access from a single API.

Our format is flexible: We usually have 2 speakers who talk for ~30 minutes each and then do Q+A plus discussion (about 45 minutes per talk), finishing by 8:45.

There'll be beer afterwards, of course!

Meetup Location:

zulily (https://www.google.com/maps/place/2601+Elliott+Ave,+Seattle,+WA+98121/@47.6146404,-122.3552158,17z/data=!3m1!4b1!4m2!3m1!1s0x54901551fab1e35b:0xe45d1952bca3b6c2?hl=en) 2601 Elliott Ave, Seattle, WA 98121

After-beer Location: Paddy Coyne’s Irish Pub, 2801 Alaskan Way #103, Seattle, WA 98121. http://www.paddycoynes.com/

Drankin' map: http://bit.ly/ZO5Fxs

Doors open 30 minutes ahead of show-time. Please show up at least 15 minutes early out of respect for our first speaker.