February Hadoop Meetup: Hadoop-as-a-Service & ZooKeeper

120 people went

Greater London House

Hampstead Road, NW1 7QX · London

How to find us

Just opposite Mornington Crescent tube stop


Dear HUG UK members,

I am pleased to announce our February meetup.

As mentioned in our January email, we are planning to have more lightning talks at our evening events.

Therefore, if you would like to give a short lightning talk this time, please get in touch with us ([masked]).

This meetup is sponsored by ASOS.

Details below.

Best wishes,



Tuesday, February 18, 2014 (previously planned for the 11th). Doors open at 6:30pm.

Presentations from 7:00pm to 8:30pm.


Greater London House, Hampstead Road, London, NW1 7QX

(Just opposite Mornington Crescent tube stop)


Intro: ASOS is a global online fashion and beauty retailer selling over 65,000 branded and own-label products to fashion-forward twenty-somethings through our website. We ship, for free, to 237 countries and territories from our 1.1 million square foot global distribution centre in the UK. By way of introduction to ASOS, we’ll spend a few minutes talking about our business, the huge amount of data we collect and some of the exciting things we’re doing with it.

Session 1: Xplenty’s cloud-based Hadoop-as-a-service platform

Speaker: Alex Grach

Abstract: Xplenty’s cloud-based Hadoop-as-a-service platform features a simple GUI, allowing anyone in an organization to benefit from big data processing without writing a single line of code. The platform provides three key simplification features: 1) Infrastructure - single-click cluster provisioning, cluster optimization and maintenance. 2) Data Processes Development - a code-free design environment in which users build data flows in an intuitive drag-and-drop GUI instead of writing code. Data transformation components such as select, sort, filter and join are all included, and custom components can be created. 3) Job Management - Xplenty manages scheduling, monitoring and error logging services.

* Xplenty was recently named among the Top 10 Big Data start-ups to watch.
** Xplenty has also issued a joint press release with Hortonworks.

Session 2: Is your distributed Zoo under control?

Speaker: Flavio Junqueira

Abstract: Implementing distributed systems is hard. Servers crash, become slow, get partitioned away... These are all events that can happen in a real setting, and your distributed system needs to be ready to deal with them. Apache ZooKeeper has been developed to deal with such problems. It is a replicated in-memory system that stores small files called znodes in a hierarchical manner. The ordering guarantees of operations over znodes and the notification scheme it provides enable the implementation of a number of commonly used recipes, like master election, group membership, locks, barriers, etc. Without a component like ZooKeeper, implementing such recipes can be a significant burden because they require sophisticated algorithms with many corner cases that are easy to overlook. ZooKeeper, however, does not promise to hide all the problems of a distributed system completely; it aims instead to simplify the task. In this presentation, we cover some basic concepts of ZooKeeper, design choices, and caveats.
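
To give a feel for how ordered znodes and notifications combine into a master election recipe, here is a minimal sketch in Python. It is not taken from the talk and does not use any ZooKeeper client library: ToyZnodeTree and Candidate are hypothetical names for a single-process, in-memory model. The assumptions are that each candidate creates a sequentially numbered ephemeral node under /election, that the lowest-numbered node is the master, and that every other candidate watches only the node just ahead of its own.

```python
import itertools


class ToyZnodeTree:
    """In-memory stand-in for a znode namespace (hypothetical, not a real client)."""

    def __init__(self):
        self._nodes = {}                  # path -> id of the owning session
        self._counter = itertools.count()
        self._watches = {}                # path -> callbacks fired on deletion

    def create_ephemeral_sequential(self, prefix, session):
        # Append a zero-padded counter so lexical order equals creation order.
        path = f"{prefix}{next(self._counter):010d}"
        self._nodes[path] = session
        return path

    def children(self, parent):
        return sorted(p for p in self._nodes if p.startswith(parent + "/"))

    def watch_delete(self, path, callback):
        # One-shot notification: removed from the table when it fires.
        self._watches.setdefault(path, []).append(callback)

    def close_session(self, session):
        # Ephemeral nodes disappear together with the session that made them.
        for path in [p for p, s in self._nodes.items() if s == session]:
            del self._nodes[path]
            for callback in self._watches.pop(path, []):
                callback(path)


class Candidate:
    """One participant in the election (assumed design, for illustration only)."""

    def __init__(self, tree, name):
        self.tree = tree
        self.name = name
        self.is_leader = False
        self.path = tree.create_ephemeral_sequential("/election/n_", name)
        self.check()

    def check(self, _deleted_path=None):
        nodes = self.tree.children("/election")
        if self.path not in nodes:
            return  # our own session is gone, so there is nothing left to do
        position = nodes.index(self.path)
        if position == 0:
            self.is_leader = True
        else:
            # Watch only the predecessor so one crash wakes one candidate.
            self.tree.watch_delete(nodes[position - 1], self.check)
```

With three candidates a, b and c created in that order, a starts as master. Closing b's session leaves a in charge and moves c's watch to a's node; closing a's session then promotes c. A real deployment would also have to handle session expiry, connection loss and retries, which this toy model leaves out.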

Short bio: Flavio Junqueira is a member of the research staff at Microsoft Research in Cambridge, UK. He holds a PhD in Computer Science from the University of California, San Diego. He is interested in various aspects of distributed systems, including distributed algorithms, concurrency, and scalability. He is an active contributor to Apache projects, such as Apache ZooKeeper (PMC chair and committer) and Apache BookKeeper (committer).