The Chubby Lock Service for Loosely-Coupled Distributed Systems

Google's coordination service that underpins GFS, Bigtable, and MapReduce—and inspired ZooKeeper

Mike Burrows | Google | 2006 | 30 min read
Tags: distributed-systems, coordination, consensus, locking, google, zookeeper, paxos, advanced

Summary

Chubby is a distributed lock service that provides coarse-grained locking and reliable small-file storage for loosely-coupled distributed systems. Rather than exposing raw Paxos consensus to developers, Chubby wraps it in a familiar file-system interface with locks. It serves as the coordination backbone for Google's infrastructure—GFS uses it for master election, Bigtable for tablet server coordination, and MapReduce for task assignment. The design prioritizes availability and reliability over raw performance, using aggressive caching and sessions to handle thousands of clients.

Key Takeaways

Lock Service Over Consensus Library

Google chose a centralized lock service over a Paxos library because: (1) developers are more familiar with locks than consensus protocols, (2) a service reduces the number of servers needed for consensus, and (3) a lock service naturally provides a place to store small amounts of metadata.

Coarse-Grained Locks

Chubby is designed for locks held for hours or days, not milliseconds. This enables aggressive client-side caching and session-based semantics. Fine-grained locks would require too much server load and wouldn't benefit from caching.

File-System Interface

Chubby presents a simple file system with directories and files (nodes). Each node can act as a lock and store small data. This familiar interface hides Paxos complexity while enabling both locking and configuration storage.
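The paper's client interface is a C++ library; the sketch below is only a rough, hypothetical Python analogue of the core idea that a single node both stores a small blob of data and acts as an advisory lock. The class and method names (ChubbyHandle, set_contents, acquire) are illustrative assumptions, not the actual API.

```python
# Illustrative sketch of a Chubby-style node: one object in a file-system
# namespace that can hold a small payload and also serve as a coarse lock.
# Names are hypothetical, not the real interface from the paper.

class ChubbyHandle:
    """Handle to a node such as /ls/cell/service/master."""

    def __init__(self, path: str):
        self.path = path
        self._data = b""
        self._locked = False

    def set_contents(self, data: bytes) -> None:
        # Store a small blob, e.g. the current master's address.
        self._data = data

    def get_contents(self) -> bytes:
        return self._data

    def acquire(self) -> bool:
        # Advisory, coarse-grained lock on this node.
        if self._locked:
            return False
        self._locked = True
        return True

    def release(self) -> None:
        self._locked = False


# Usage: the same node provides the lock and stores the configuration data.
master_node = ChubbyHandle("/ls/cell/service/master")
if master_node.acquire():
    master_node.set_contents(b"10.0.0.1:4242")
```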

Sessions and KeepAlives

Clients maintain sessions with the Chubby cell via periodic KeepAlives. Sessions bundle lease extensions, cache invalidations, and event delivery. If a session expires, all locks and ephemeral files are released—critical for failure detection.
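As a rough illustration of how session leases behave, here is a minimal single-process sketch. The lease and grace-period durations and all names are illustrative assumptions; the real protocol also piggybacks cache invalidations and event notifications on KeepAlive replies.

```python
import time

# Minimal sketch of session lease handling. Durations are illustrative
# assumptions, and the names are not Chubby's API.
LEASE_SECONDS = 12
GRACE_SECONDS = 45

class Session:
    def __init__(self):
        self.lease_expiry = time.monotonic() + LEASE_SECONDS
        self.locks_held = ["/ls/cell/service/master"]
        self.ephemeral_files = ["/ls/cell/service/worker-17"]

    def on_keepalive_reply(self) -> None:
        # Each successful KeepAlive extends the lease; in real Chubby the
        # reply also carries cache invalidations and events.
        self.lease_expiry = time.monotonic() + LEASE_SECONDS

    def maybe_expire(self) -> bool:
        # Once the lease and grace period have both run out, the session is
        # dead: every lock is released and every ephemeral file is deleted,
        # which is how other clients detect this client's failure.
        if time.monotonic() > self.lease_expiry + GRACE_SECONDS:
            self.locks_held.clear()
            self.ephemeral_files.clear()
            return True
        return False
```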

Sequencers for Lock Safety

To prevent issues with delayed messages, Chubby provides sequencers—opaque tokens encoding lock identity and generation. Servers can verify a client still holds the lock by checking the sequencer, preventing race conditions.
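The sketch below shows the sequencer idea in miniature: the lock service stamps each acquisition with a generation number, and a downstream server refuses any request whose sequencer is no longer current. Class and field names are hypothetical, not the paper's interface.

```python
from dataclasses import dataclass

# Sketch of a sequencer: an opaque token naming the lock and the generation
# at which it was acquired. Names and types are illustrative.

@dataclass(frozen=True)
class Sequencer:
    lock_path: str
    generation: int  # incremented each time the lock is newly acquired

class LockService:
    def __init__(self):
        self._generations = {}  # lock_path -> current generation

    def acquire(self, lock_path: str) -> Sequencer:
        gen = self._generations.get(lock_path, 0) + 1
        self._generations[lock_path] = gen
        return Sequencer(lock_path, gen)

    def is_current(self, seq: Sequencer) -> bool:
        return self._generations.get(seq.lock_path) == seq.generation

class FileServer:
    """Downstream server that only honours the current lock holder."""

    def __init__(self, lock_service: LockService):
        self.lock_service = lock_service

    def write(self, seq: Sequencer, data: bytes) -> bool:
        # A delayed request carrying a stale sequencer is rejected, so a
        # client whose lock has since been granted to someone else cannot
        # clobber state.
        if not self.lock_service.is_current(seq):
            return False
        # ... apply the write ...
        return True
```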

Aggressive Caching

Clients cache file data and metadata indefinitely, with write-through cache coherence. This transforms Chubby from a bottleneck into a highly scalable read-heavy system—critical since most clients only need to read configuration data.
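A minimal sketch of this invalidate-before-write ordering follows, assuming a toy single-process model; in real Chubby the invalidations are delivered through KeepAlive replies and the wait is bounded by the client lease.

```python
# Sketch of Chubby-style cache consistency: cached copies are invalidated
# (not updated) before a write commits, so clients never observe stale data.
# This toy model only shows the ordering of steps.

class Cell:
    def __init__(self):
        self._files = {}
        self._clients = []

    def register(self, client) -> None:
        self._clients.append(client)

    def read(self, client, path):
        value = self._files.get(path)
        client.cache[path] = value  # clients cache reads indefinitely
        return value

    def write(self, path, value) -> None:
        # 1. Invalidate every cached copy first.
        for c in self._clients:
            c.cache.pop(path, None)
        # 2. Only then apply the modification.
        self._files[path] = value

class Client:
    def __init__(self, cell):
        self.cache = {}
        cell.register(self)

    def get(self, cell, path):
        if path in self.cache:
            return self.cache[path]  # served locally, no load on the master
        return cell.read(self, path)
```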

Deep Dive

By the mid-2000s, Google's infrastructure consisted of thousands of machines running distributed systems like GFS, Bigtable, and MapReduce. These systems faced common coordination challenges:

Leader Election: GFS needs a single master at any time. What happens when the master crashes? How do we elect a new one without split-brain?

Configuration Management: Bigtable tablet servers need to know the current master's address. How do we distribute this reliably?

Service Discovery: MapReduce workers need to find the job coordinator. How do we maintain an up-to-date registry?

Distributed Locking: Multiple processes need exclusive access to resources. How do we implement locks that survive failures?

The naive solution, in which each team implements Paxos on its own, had problems that motivated a shared service:

Why a Lock Service Instead of Paxos Libraries

Key insight: A lock service is a better abstraction than a consensus library for most developers.

  1. Fewer servers: Instead of 5 servers per service, one Chubby cell serves an entire datacenter
  2. Familiar interface: Developers understand locks and files; few understand Paxos
  3. Dual purpose: The same service provides locks AND stores small metadata (master address, configuration)
  4. Battle-tested: One well-engineered service is better than many ad-hoc implementations
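
To make points (2) and (3) concrete, here is a hypothetical sketch of master election built on such a service: each replica races to acquire one lock, and the winner advertises its address in the same node so other processes can find it. The in-memory LockNode stands in for a real Chubby cell; all names are illustrative.

```python
# Hypothetical sketch of master election on top of a Chubby-like lock
# service. The LockNode below is a stand-in for a real cell.

class LockNode:
    def __init__(self):
        self.holder = None
        self.contents = b""

    def try_acquire(self, owner: str) -> bool:
        if self.holder is None:
            self.holder = owner
            return True
        return False

def run_replica(node: LockNode, my_address: str) -> str:
    if node.try_acquire(my_address):
        node.contents = my_address.encode()  # advertise the new master
        return "master"
    # Standbys read the same node to discover the current master.
    return f"standby (master is {node.contents.decode()})"

# Three replicas race for the master lock; exactly one wins.
node = LockNode()
print(run_replica(node, "10.0.0.1:4242"))  # -> master
print(run_replica(node, "10.0.0.2:4242"))  # -> standby (master is 10.0.0.1:4242)
print(run_replica(node, "10.0.0.3:4242"))  # -> standby (master is 10.0.0.1:4242)
```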

Trade-offs

Centralized Service vs Library
  Advantage: Fewer servers needed (one cell per DC); consistent, battle-tested implementation; natural metadata storage
  Disadvantage: Single point of dependency; all services must trust the same Chubby cell; cross-DC coordination requires separate cells

Coarse-Grained Locks
  Advantage: Enables aggressive caching; session semantics provide a clean failure model; low overhead per lock
  Disadvantage: Not suitable for fine-grained coordination; lock acquisition latency (seconds) too slow for some use cases

Strict Consistency (Master-Only Reads)
  Advantage: Linearizable reads guarantee correctness; simpler reasoning about system behavior
  Disadvantage: Master is a read bottleneck (mitigated by caching); higher read latency than follower-read systems

File-System Interface
  Advantage: Familiar abstraction for developers; naturally combines locking with data storage
  Disadvantage: Tempts developers to use it as general storage; file metaphor may not fit all coordination needs

Session-Based Failure Detection
  Advantage: Clean failure model—crash releases all locks and ephemeral files; clients know their state
  Disadvantage: Session timeout adds latency to failure detection; network partition may cause false positives
