Tuesday, October 31, 2006

Paper: "The Chubby Lock Service for Loosely-Coupled Distributed Systems"

The Chubby Lock Service for Loosely-Coupled Distributed Systems
Mike Burrows, Google Inc.


We describe our experiences with the Chubby lock service, which is intended to provide coarse-grained locking as well as reliable (though low-volume) storage for a loosely-coupled distributed system. Chubby provides an interface much like a distributed file system with advisory locks, but the design emphasis is on availability and reliability, as opposed to high performance. Many instances of the service have been used for over a year, with several of them each handling a few tens of thousands of clients concurrently. The paper describes the initial design and expected use, compares it with actual use, and explains how the design had to be modified to accommodate the differences.


Blogger Leonid Ryzhyk said...

Talk summary

Speaker: Mike Burrows

Chubby is a large-scale distributed lock service used in several Google products, including GFS and Bigtable. In his talk, Mike focused mainly on introducing and motivating the Chubby API and the ways in which Chubby has been used, rather than on how it was implemented.

The main purpose of Chubby is to provide distributed systems developers with a reliable and scalable implementation of the distributed consensus protocol. However, experience has shown that, even when implemented as a library, the consensus protocol is still difficult for developers to use. Therefore, Chubby encapsulates it inside a familiar lock service API.

In addition to providing lock and unlock operations, Chubby allows small data records to be associated with locks and adopts a UNIX-like naming scheme for them, which makes it look and feel like a file system. However, it is not well suited to storing large amounts of data and lacks a number of file system features, such as file renaming, atomic multi-file operations, and partial-file reads and writes. This lack of features helps simplify the Chubby design and prevents developers from misusing it as a distributed file system.
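
To make the "lock plus small file" idea concrete, here is a minimal in-memory sketch in Python. It is not Chubby's API: the class name ToyLockService, its methods, the path /demo/primary, and the size cap are all hypothetical, chosen only to illustrate UNIX-style names, whole-file reads and writes, and advisory locks.

```python
class ToyLockService:
    """Hypothetical in-memory stand-in for a lock service with file-like nodes."""

    MAX_BYTES = 1024  # arbitrary cap for this sketch: nodes hold small records only

    def __init__(self):
        self._data = {}   # path -> bytes stored in the node
        self._owner = {}  # path -> id of the client currently holding the lock

    def create(self, path):
        if not path.startswith("/"):
            raise ValueError("paths are absolute, UNIX-style")
        self._data.setdefault(path, b"")

    def write(self, path, data):
        # Whole-file write only: there is no offset, rename, or multi-file update.
        if path not in self._data:
            raise KeyError(path)
        if len(data) > self.MAX_BYTES:
            raise ValueError("record too large for a lock-service node")
        self._data[path] = bytes(data)

    def read(self, path):
        # Whole-file read only.
        return self._data[path]

    def try_acquire(self, path, client):
        if path not in self._data:
            raise KeyError(path)
        holder = self._owner.get(path)
        if holder is None or holder == client:
            self._owner[path] = client
            return True
        return False

    def release(self, path, client):
        if self._owner.get(path) != client:
            raise PermissionError("lock is not held by this client")
        del self._owner[path]


def elect_primary(service, path, candidates):
    """The first candidate to get the lock records its name in the node."""
    service.create(path)
    for name in candidates:
        if service.try_acquire(path, name):
            service.write(path, name.encode())
    return service.read(path).decode()
```

Calling elect_primary(ToyLockService(), "/demo/primary", ["host-a", "host-b"]) lets the first candidate take the lock and publish its name, so every other process can find the primary by reading the same node. Because the locks in this sketch are advisory, write() does not check who holds the lock; cooperating clients are expected to acquire it first.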

Lock clients can be notified of certain types of events, including file content changes, file creation/deletion, lock acquisition, etc. Chubby is designed to support large numbers of clients per lock. While changes of lock ownership are typically infrequent, clients tend to poll the lock periodically, creating a lot of read traffic. To reduce this traffic, a consistent write-through client-side cache is used.
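
The effect of such a cache on polling traffic can be sketched with a toy model. This is an illustration under stated assumptions, not Chubby's implementation: ToyServer, ToyClient, and every method name are hypothetical, and the sketch assumes that consistency is kept by having the server invalidate every cached copy before a write takes effect.

```python
class ToyServer:
    """Hypothetical server that tracks which clients cache each file."""

    def __init__(self):
        self._files = {}      # path -> current contents
        self._cachers = {}    # path -> set of clients holding a cached copy
        self._watchers = {}   # path -> callbacks to run when contents change
        self.reads_served = 0

    def read(self, path, client):
        self.reads_served += 1
        self._cachers.setdefault(path, set()).add(client)
        return self._files.get(path)

    def subscribe(self, path, callback):
        self._watchers.setdefault(path, []).append(callback)

    def write(self, path, data):
        # Assumption of this sketch: drop every cached copy first, so no
        # client can return the old value once the write has been applied.
        for client in self._cachers.pop(path, set()):
            client.invalidate(path)
        self._files[path] = data
        for callback in self._watchers.get(path, []):
            callback(path)  # "file contents changed" event


class ToyClient:
    """Hypothetical client with a write-through cache."""

    def __init__(self, server):
        self._server = server
        self._cache = {}

    def read(self, path):
        # Repeated polls are answered locally until the server invalidates.
        if path not in self._cache:
            self._cache[path] = self._server.read(path, self)
        return self._cache[path]

    def write(self, path, data):
        # Write-through: the update goes straight to the server and is
        # never buffered in the local cache.
        self._server.write(path, data)

    def invalidate(self, path):
        self._cache.pop(path, None)
```

In this model a client that polls the same file a thousand times costs the server one read, and the next poll after a write costs one more. The subscribe callback stands in for the event notifications mentioned above; a client that relies on events would not need to poll at all.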


Q: Are there any examples of interactions that you had with the user community that led to the file-system abstraction?
A: No, the design decision happened before the user base. I would put the main reason down to sharing an office with Rob Pike and Sean Quinlan (Plan 9 people), so everything looked like a FS.

Q: Chubby allows developers to easily get reliability guarantees by using a lock server instead of a state machine. Were there other projects that had to go off and implement a state machine?
A: Well, we did. There is a state machine library that we use; at present there are no other users of it.

Q: It seems like large-scale tools are becoming increasingly integrated. Have you thought about bad interactions where one misbehaving application of a tool can cause cascading failures in other applications?
A: Yes, it happens all the time. My system has managed to bring down many others. What you do is analyse exactly what happened, fix your programming steps and fix your code so that every single problem can't happen again, and of course something else happens next time.

12:16 AM  
