Cassandra London Meetup - on Hadoop Integration

Hosted By
David G.

Details

This month we are focussing on Hadoop Integration. We have one confirmed speaker and one more in the pipeline.

Schedule:

7.00pm Meet and chat with other Cassandra users
7.30pm Talk: Jairam Chandar explains how to integrate Cassandra and Hadoop
8.00pm Talk: Richard Low from Acunu (http://www.acunu.com/) talks about what we've learned from Cassandra performance testing
8.30pm Finish up; more discussions then off to the pub

Please come along!

Jairam Chandar explains how to integrate Cassandra and Hadoop

Summary

Jairam will talk about Hadoop-Cassandra integration and how VisualDNA (http://www.visualdna.com/) uses Hadoop to analyse data stored in a Cassandra cluster, including a real-world example and some statistics.

Synopsis

VisualDNA is a behavior-based audience discovery and targeting network. We use a patented visual quiz system to profile audiences at scale and anonymously aggregate this information to help publishers better understand their audience. VisualDNA also runs high-performance ad campaigns that maximise revenue for e-commerce sites, and optimises branding campaigns.

We use Cassandra as our primary data store. As data volumes grew, the simple serial PHP scripts we used for analysis started to take far too long to run. Enter Hadoop! One process that took over 48 hours as a PHP script finished in just over 4 hours with Hadoop!

Cassandra has supported Hadoop since version 0.6, with more features added in newer releases. We will discuss some of these features through one real-world example (and it's not a word-count example!) of how to use Hadoop to analyse data stored in Cassandra.

Richard Low from Acunu talks about what we've learned from Cassandra performance testing

Summary

We'll show the effect of heavy write loads on Cassandra, in particular on range queries, and explain how Acunu improves on vanilla Cassandra performance and predictability.

Synopsis

Acunu gives “Big Data” applications high and predictable performance, robustness and simple management. By using the Acunu Storage Platform (http://www.acunu.com/solutions/) to power NoSQL stores such as Cassandra, we enable developers (1) to take full advantage of low-cost, high-performance commodity hardware, (2) to speed up the dev/test cycle with Acunu’s unique instant thin clones, which let each developer work with the whole dataset, and (3) to simplify the management and monitoring of their deployment so they can focus on what matters: their applications.

In the first release of the Acunu Storage Platform, we are focussing on Cassandra and have gone through a significant performance benchmarking exercise. In this session, we will present some of the findings and lessons learned.

Cassandra London
Skills Matter
The Skills Matter eXchange, 116-120 Goswell Road · London EC1V 7DP