Tags: distributed-systems, storage, database, nosql, google, lsm-tree, wide-column, advanced

Bigtable: A Distributed Storage System for Structured Data

The sparse, distributed, persistent multi-dimensional sorted map that powers Google Search, Maps, YouTube, and Gmail

Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber | Google | 2006 | 35 min read
View Original Paper

Summary

Bigtable is a distributed storage system for managing structured data, designed to scale to petabytes of data across thousands of commodity servers. It provides a simple data model: a sparse, distributed, persistent, multi-dimensional sorted map indexed by row key, column key, and timestamp. Bigtable doesn't support a full relational model, but it gives clients dynamic control over data layout and format, allowing them to reason about locality. Built on GFS for storage and Chubby for coordination, Bigtable became the foundation for Google's most critical services and inspired an entire generation of NoSQL databases, including HBase, Cassandra, and Cloud Bigtable.

Key Takeaways

Sparse Multi-Dimensional Sorted Map

Data is indexed by (row, column, timestamp) and stored in lexicographic row order. Rows are the unit of transactional consistency. This simple abstraction supports wildly different data—from web pages to satellite imagery—with the same API.
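
To make the abstraction concrete, here is a minimal single-node sketch in Python (Bigtable's real API is C++; the class and method names below are illustrative, not from the paper): cells are addressed by (row key, column key, timestamp), absent cells are simply not stored, and range scans return rows in lexicographic order.

    import bisect
    import time

    class SparseSortedMap:
        """Toy in-memory model of the Bigtable data model:
        (row key, column key, timestamp) -> uninterpreted bytes."""

        def __init__(self):
            self._rows = {}        # row key -> {column key -> {timestamp -> value}}
            self._row_index = []   # sorted row keys, for lexicographic range scans

        def put(self, row, column, value, timestamp=None):
            ts = timestamp if timestamp is not None else int(time.time() * 1_000_000)
            if row not in self._rows:
                bisect.insort(self._row_index, row)
                self._rows[row] = {}
            self._rows[row].setdefault(column, {})[ts] = value

        def get(self, row, column):
            # Sparse: an absent cell was never stored, so just return None.
            versions = self._rows.get(row, {}).get(column)
            if not versions:
                return None
            return versions[max(versions)]   # newest timestamp wins

        def scan(self, start_row, end_row):
            # Yield rows in lexicographic order within [start_row, end_row).
            lo = bisect.bisect_left(self._row_index, start_row)
            hi = bisect.bisect_left(self._row_index, end_row)
            for row in self._row_index[lo:hi]:
                yield row, self._rows[row]

    # Webtable-style usage: row keyed by reversed URL, page contents plus one anchor.
    table = SparseSortedMap()
    table.put("com.cnn.www", "contents:", b"<html>...")
    table.put("com.cnn.www", "anchor:cnnsi.com", b"CNN")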

Tablets: The Unit of Distribution

Tables are split into tablets (100-200 MB), each holding a contiguous range of rows. Tablets are the unit of distribution and load balancing. As tables grow, tablets split automatically. This enables horizontal scaling without application changes.
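
As a rough sketch of that mechanism (real Bigtable records tablet locations in a METADATA table rooted in Chubby; the single in-process directory and all names below are illustrative assumptions), tablets are kept sorted by start row, a row key routes to the tablet whose range contains it, and a tablet splits once it crosses a size threshold:

    import bisect
    from dataclasses import dataclass

    TABLET_SPLIT_BYTES = 200 * 1024 * 1024   # tablets grow to roughly 100-200 MB before splitting

    @dataclass
    class Tablet:
        start_row: str        # inclusive lower bound of this tablet's contiguous row range
        end_row: str          # exclusive upper bound; "" is used here to mean "end of table"
        size_bytes: int = 0

    class TabletDirectory:
        """Toy directory of one table's tablets, kept sorted by start row."""

        def __init__(self):
            # A new table starts as a single tablet covering the whole row space.
            self.tablets = [Tablet(start_row="", end_row="")]

        def locate(self, row_key):
            # The responsible tablet is the last one whose start_row <= row_key.
            starts = [t.start_row for t in self.tablets]
            return self.tablets[bisect.bisect_right(starts, row_key) - 1]

        def maybe_split(self, tablet, split_row):
            # Once a tablet crosses the size threshold, split it into two contiguous halves.
            if tablet.size_bytes < TABLET_SPLIT_BYTES:
                return
            i = self.tablets.index(tablet)
            right = Tablet(start_row=split_row, end_row=tablet.end_row,
                           size_bytes=tablet.size_bytes // 2)
            tablet.end_row = split_row
            tablet.size_bytes //= 2
            self.tablets.insert(i + 1, right)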

Column Families for Locality

Columns are grouped into column families, which are the unit of access control and storage. Data in the same column family is stored together on disk. This allows applications to co-locate related data for efficient access patterns.
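
A short sketch of the convention: column keys are family:qualifier strings, and grouping a row's cells by family mirrors how each family's data is kept together on disk. The contents: and anchor: columns below come from the paper's webtable example; the helper functions themselves are illustrative.

    def column_key(family, qualifier=""):
        # Bigtable column keys have the form "family:qualifier"; families are
        # declared when the table is created, qualifiers can be arbitrary strings.
        return f"{family}:{qualifier}"

    def group_by_family(cells):
        """Group one row's cells by column family, the unit of access
        control and on-disk locality."""
        grouped = {}
        for col, value in cells.items():
            family, _, qualifier = col.partition(":")
            grouped.setdefault(family, {})[qualifier] = value
        return grouped

    # Webtable-style row: page contents plus anchor text from two referring sites.
    row_cells = {
        column_key("contents"): b"<html>...",
        column_key("anchor", "cnnsi.com"): b"CNN",
        column_key("anchor", "my.look.ca"): b"CNN.com",
    }
    print(group_by_family(row_cells))
    # {'contents': {'': b'<html>...'},
    #  'anchor': {'cnnsi.com': b'CNN', 'my.look.ca': b'CNN.com'}}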

LSM-Tree Storage Engine

Writes go to an in-memory buffer (the memtable), then flush to immutable sorted files (SSTables) on GFS. Reads merge data from the memtable and the SSTables. Compaction merges SSTables to reclaim space and reduce read amplification.
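
Below is a minimal sketch of that write/read path, with plain Python dicts standing in for the memtable and for SSTables on GFS (the flush threshold and all names are assumptions for illustration): writes buffer in memory, a minor compaction freezes the buffer into an immutable sorted run, reads consult the memtable and then the runs newest-to-oldest, and a major compaction merges everything into a single run.

    class ToyLSMStore:
        """Minimal model of a tablet's storage: an in-memory memtable plus
        a stack of immutable, sorted runs standing in for SSTables on GFS."""

        def __init__(self, memtable_limit=4):
            self.memtable = {}        # buffered recent writes: key -> value
            self.sstables = []        # immutable sorted runs, newest last
            self.memtable_limit = memtable_limit

        def write(self, key, value):
            self.memtable[key] = value
            if len(self.memtable) >= self.memtable_limit:
                self._minor_compaction()

        def _minor_compaction(self):
            # Freeze the memtable into an immutable sorted run and start a fresh buffer.
            self.sstables.append(dict(sorted(self.memtable.items())))
            self.memtable = {}

        def read(self, key):
            # Newest data wins: check the memtable first, then runs newest-to-oldest.
            if key in self.memtable:
                return self.memtable[key]
            for run in reversed(self.sstables):
                if key in run:
                    return run[key]
            return None

        def major_compaction(self):
            # Merge every run (and the memtable) into one, bounding read
            # amplification and reclaiming space from overwritten entries.
            merged = {}
            for run in self.sstables:
                merged.update(run)
            merged.update(self.memtable)
            self.sstables = [dict(sorted(merged.items()))]
            self.memtable = {}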

Single-Row Transactions

Bigtable provides atomic read-modify-write operations on single rows. No cross-row transactions—this constraint enables horizontal scaling. Applications requiring multi-row atomicity must implement it at the application layer.
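
The sketch below shows what that guarantee amounts to, using a per-row lock as an illustrative stand-in for however a tablet server serializes mutations (the class, the meta:crawl_count column, and the method names are not Bigtable's API): a read-modify-write touches exactly one row, and there is deliberately no way to lock two rows together.

    import threading
    from collections import defaultdict

    class SingleRowStore:
        """Toy store in which the row is the unit of atomicity."""

        def __init__(self):
            self._rows = defaultdict(dict)                  # row key -> {column -> value}
            self._row_locks = defaultdict(threading.Lock)   # one lock per row, nothing coarser

        def read_modify_write(self, row_key, mutate):
            # Atomically apply `mutate` to one row's cells; updating two rows
            # takes two separate calls and gets no combined atomicity.
            with self._row_locks[row_key]:
                cells = dict(self._rows[row_key])   # read a snapshot of the row
                mutate(cells)                       # modify it
                self._rows[row_key] = cells         # write every change back at once

    # Example: atomically bump a per-page crawl counter.
    store = SingleRowStore()
    store.read_modify_write(
        "com.cnn.www",
        lambda cells: cells.update({"meta:crawl_count": cells.get("meta:crawl_count", 0) + 1}),
    )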

Client-Controlled Locality

Row key design determines data distribution and access patterns. Applications encode locality into keys (e.g., reversed URLs: com.google.www). This gives clients direct control over performance characteristics.
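
A small sketch of the technique: the paper's webtable stores the page maps.google.com/index.html under the row key com.google.maps/index.html, so pages from the same domain land in a contiguous range of rows. The helper below (name invented for illustration) builds keys in that style.

    from urllib.parse import urlsplit

    def reversed_domain_row_key(url):
        """Build a webtable-style row key: reverse the hostname components
        so pages from the same domain sort next to each other."""
        parts = urlsplit(url)
        reversed_host = ".".join(reversed(parts.hostname.split(".")))
        return reversed_host + parts.path

    print(reversed_domain_row_key("http://maps.google.com/index.html"))   # com.google.maps/index.html
    print(reversed_domain_row_key("http://www.google.com/"))              # com.google.www/

With keys in this form, a scan over the prefix com.google. reads every page from Google-owned hosts as one contiguous row range, which is exactly the locality control described above.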

Deep Dive

By 2006, Google needed to store and query structured data across dozens of services:

  • Web Search: Billions of URLs, each with content, links, anchors, PageRank, crawl history
  • Google Earth: Petabytes of satellite imagery with spatial indexes
  • Google Finance: Time series of stock prices, news, financial data
  • Personalization: Per-user preferences, history, recommendations

Each project faced similar challenges but had different data patterns.

Google's Data Storage Requirements

Why not use a relational database?

  1. Scale: No RDBMS could handle Google's data volume on commodity hardware
  2. Cost: Commercial databases at this scale would cost billions
  3. Flexibility: Fixed schemas don't fit rapidly evolving products
  4. Control: Google needed to tune storage for specific access patterns

Why not use raw files (GFS)?

  1. No structure: Files don't support efficient point queries
  2. No indexing: Full scans for every query
  3. No locality: Related data scattered across files
  4. Application complexity: Every team reimplements storage logic

Bigtable sits between these extremes: more structure than raw files, more flexibility than an RDBMS.

Trade-offs

Single-Row Transactions Only
  • Advantage: Enables horizontal scaling without distributed coordination; simple programming model for most cases
  • Disadvantage: Applications requiring multi-row atomicity must implement it themselves or use a different database

LSM-Tree Storage
  • Advantage: Excellent write throughput; sequential I/O patterns; compression-friendly
  • Disadvantage: Read amplification (check multiple SSTables); write amplification from compaction; space amplification

Sparse Column Model
  • Advantage: Schema flexibility; efficient storage of sparse data; dynamic columns without DDL
  • Disadvantage: No schema enforcement; application must handle missing columns; harder to reason about data model

Row Key Determines Distribution
  • Advantage: Client controls data locality and access patterns; predictable performance with good key design
  • Disadvantage: Poor key design causes hot spots; changing access patterns may require key redesign

Built on GFS + Chubby
  • Advantage: Leverages battle-tested infrastructure; separation of concerns; fault tolerance built in
  • Disadvantage: Requires running complex distributed systems (or using a managed service); operational overhead
