
Details

I am very happy to have Jim Baker (a member of our meetup!) come and talk to us about some of the core technologies used in many of the big data tools and products. Much of the core behavior that you have come to rely on and expect comes from services provided by ZooKeeper, and Juju makes it easy to bring order to all the hardware needed to scale big data systems. Please come and hear Jim talk about these very important technologies. Hopefully you will come away from the talk with a better understanding of the technologies that make big data systems possible!

Agenda

  • 6:00 - 6:30 - Socialize over food and drink

  • 6:30 - 6:45 - Announcements, Upcoming Events

  • 6:45 - 8:00 - Presentation: ZooKeeper for Distributed Coordination and Juju by Jim Baker

  • 8:00 - ?:?? - Continued Socializing

ZooKeeper for Distributed Coordination

Released by Yahoo and now a mature Apache project, ZooKeeper supports distributed coordination. ZooKeeper is a core piece of the Hadoop ecosystem. It is also increasingly used outside Hadoop. One example is the project I work on, Juju, now part of Ubuntu Server, and I will use this example to illustrate what can be done with ZK.

At a low level, ZooKeeper supports working with a tree of nodes - think of a hierarchical filesystem. On top of this, ZooKeeper ensures that all subscribed watchers see a consistent, reliable total ordering of events corresponding to changes in this tree. Lastly, an ensemble of ZooKeeper servers is kept synchronized through the ZooKeeper atomic broadcast protocol so as to ensure availability. So we have consistency and availability, and consequently not partition tolerance. But this CAP selection works well for building distributed systems, and even for kickstarting systems that choose, say, eventual consistency instead.
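To make the tree-plus-watchers idea concrete, here is a minimal in-memory sketch. It is a toy model, not ZooKeeper's client API: the class `ToyZNodeTree`, its methods, and the `zxid` counter are illustrative assumptions. It shows two properties described above: changes are stamped with a single increasing counter, so every watcher sees them in the same order, and a watch fires once and must then be set again.

```python
class ToyZNodeTree:
    """In-memory stand-in for a ZooKeeper-style node tree (hypothetical, not a real client)."""

    def __init__(self):
        self.nodes = {"/": b""}   # path -> data
        self.watches = {}         # path -> list of one-shot callbacks
        self.zxid = 0             # single counter that totally orders all changes

    def create(self, path, data=b""):
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.nodes:
            raise KeyError(f"parent {parent} does not exist")
        if path in self.nodes:
            raise KeyError(f"{path} already exists")
        self.nodes[path] = data
        self._fire(path, "created")

    def set(self, path, data):
        if path not in self.nodes:
            raise KeyError(f"{path} does not exist")
        self.nodes[path] = data
        self._fire(path, "changed")

    def watch(self, path, callback):
        # Register a one-shot watch; it is dropped after the next change to `path`.
        self.watches.setdefault(path, []).append(callback)

    def _fire(self, path, kind):
        self.zxid += 1
        for callback in self.watches.pop(path, []):
            callback(self.zxid, kind, path)


tree = ToyZNodeTree()
seen_a, seen_b = [], []
tree.watch("/config", lambda zxid, kind, path: seen_a.append((zxid, kind, path)))
tree.watch("/config", lambda zxid, kind, path: seen_b.append((zxid, kind, path)))
tree.create("/config", b"v1")   # both watchers are notified with the same event
tree.set("/config", b"v2")      # nobody is notified: the watches were one-shot
```

A real ensemble replicates this state across servers; the sketch only illustrates the ordering and watch semantics seen by a client.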

Using these primitives, higher-level distributed protocols like leader election, distributed barriers, and two-phase commit can be readily built. High-level APIs like Curator (Java) from Netflix and, more recently, Kazoo (Python) from Mozilla make it easy to write these protocols. ZooKeeper likewise underpins service orchestration in Juju.
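As an illustration of how such a protocol falls out of the primitives, the following sketch models leader election with sequence numbers: each candidate registers under the next number, and whoever holds the lowest surviving number leads. This is a single-process toy under stated assumptions; `ToyElection` and its methods are hypothetical names, and `leave` stands in for a candidate's node disappearing when its session ends.

```python
import itertools


class ToyElection:
    """Toy model of election by lowest sequence number (hypothetical, not a library API)."""

    def __init__(self):
        self._counter = itertools.count()   # hands out ever-increasing sequence numbers
        self.members = {}                   # sequence number -> candidate name

    def join(self, name):
        seq = next(self._counter)
        self.members[seq] = name
        return seq

    def leave(self, seq):
        # Models a candidate's node vanishing, for example after a crash.
        self.members.pop(seq, None)

    def leader(self):
        if not self.members:
            return None
        return self.members[min(self.members)]


election = ToyElection()
a = election.join("worker-a")
b = election.join("worker-b")
c = election.join("worker-c")
first_leader = election.leader()     # worker-a holds the lowest number
election.leave(a)                    # worker-a fails
second_leader = election.leader()    # leadership passes to worker-b
election.join("worker-d")            # a late joiner queues behind the others
third_leader = election.leader()     # still worker-b
```

Because numbers are never reused, a candidate that rejoins goes to the back of the queue, which keeps leadership changes predictable.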

Juju simplifies the deployment of user-defined service stacks over their entire lifecycle to both cloud and "bare metal" providers. To accomplish this, Juju implements a service orchestration model, with services described by what we call "charms". Juju manages both the changing relationships between services - whether this is a scalable LAMP stack like MediaWiki or a Hadoop cluster - and the scaling up or down of a given service to one or more machines over the lifetime of the deployed environment.

Users can use Juju to configure the desired service stack in ZooKeeper, and Juju then uses its agents to make it so. By executing idempotent hooks in a charm with the right sequencing, as supported by ZK's distributed coordination semantics, the service orchestration is able to gracefully cope with the eventual consistency and outright failures seen in a distributed environment. But this is done without requiring deep expertise in distributed computing by charm authors. With Juju, simple scripts for hooks can be correct scripts, even in the context of distributed scaling and distributed failure.
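The value of idempotent hooks can be shown with a small sketch. This is not Juju's hook interface: `install_hook`, the `machine` dictionary, and the `desired` settings are all hypothetical, chosen only to show a hook that converges on a desired state and can therefore be re-run safely after a failure or a repeated event.

```python
def install_hook(machine, desired):
    """Bring `machine` to the `desired` state and report what changed; safe to re-run."""
    changed = []
    for package in desired["packages"]:
        # Only act when the machine differs from the desired state.
        if package not in machine["installed"]:
            machine["installed"].add(package)
            changed.append(package)
    if machine.get("port") != desired["port"]:
        machine["port"] = desired["port"]
        changed.append("port")
    return changed


machine = {"installed": {"apache2"}}
desired = {"packages": ["apache2", "mediawiki"], "port": 80}

first_run = install_hook(machine, desired)    # installs mediawiki and sets the port
second_run = install_hook(machine, desired)   # nothing left to do
```

Since the second run is a no-op, the orchestrator can simply retry a hook whenever it is unsure whether an earlier attempt completed.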

This talk will introduce both ZooKeeper and Juju. For ZooKeeper, we will look at the primitives it supports as well as the high-level APIs. I will also describe how Juju leverages ZooKeeper, and show you how to write charms and build your own service stacks.

Speaker Biography

Jim Baker works at Canonical as a software engineer on the Ubuntu Server team to support cloud computing, specifically through Juju and its ecosystem. Prior to joining the server team, he was part of the Emerging Technology Group at Canonical that created Juju. Jim is also a lead developer of the Jython implementation. Jim is a graduate of Harvard College and Brown University and is a nominated member of the Python Software Foundation.
