Tags: distributed-systems, storage, database, nosql, google, lsm-tree, wide-column, advanced

Bigtable: A Distributed Storage System for Structured Data

The sparse, distributed, persistent multi-dimensional sorted map that powers Google Search, Maps, YouTube, and Gmail

Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber | Google | 2006 | 35 min read
View Original Paper

Summary

Bigtable is a distributed storage system for managing structured data, designed to scale to petabytes of data across thousands of commodity servers. It provides a simple data model: a sparse, distributed, persistent, multi-dimensional sorted map indexed by row key, column key, and timestamp. Bigtable doesn't support a full relational model, but it gives clients dynamic control over data layout and format, allowing them to reason about locality. Built on GFS for storage and Chubby for coordination, Bigtable became the foundation for Google's most critical services and inspired an entire generation of NoSQL databases, including HBase, Cassandra, and Cloud Bigtable.

Key Takeaways

Sparse Multi-Dimensional Sorted Map

Data is indexed by (row, column, timestamp) and stored in lexicographic row order. Rows are the unit of transactional consistency. This simple abstraction supports wildly different data—from web pages to satellite imagery—with the same API.
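
To make the abstraction concrete, here is a minimal single-node sketch in Python (Bigtable's real API is C++; the class and method names below are illustrative, not from the paper): cells are addressed by (row key, column key, timestamp), absent cells are simply not stored, and range scans return rows in lexicographic order.

    import bisect
    import time

    class SparseSortedMap:
        """Toy in-memory model of the Bigtable data model:
        (row key, column key, timestamp) -> uninterpreted bytes."""

        def __init__(self):
            self._rows = {}        # row key -> {column key -> {timestamp -> value}}
            self._row_index = []   # sorted row keys, for lexicographic range scans

        def put(self, row, column, value, timestamp=None):
            ts = timestamp if timestamp is not None else int(time.time() * 1_000_000)
            if row not in self._rows:
                bisect.insort(self._row_index, row)
                self._rows[row] = {}
            self._rows[row].setdefault(column, {})[ts] = value

        def get(self, row, column):
            # Sparse: an absent cell was never stored, so just return None.
            versions = self._rows.get(row, {}).get(column)
            if not versions:
                return None
            return versions[max(versions)]   # newest timestamp wins

        def scan(self, start_row, end_row):
            # Yield rows in lexicographic order within [start_row, end_row).
            lo = bisect.bisect_left(self._row_index, start_row)
            hi = bisect.bisect_left(self._row_index, end_row)
            for row in self._row_index[lo:hi]:
                yield row, self._rows[row]

    # Webtable-style usage: row keyed by reversed URL, page contents plus one anchor.
    table = SparseSortedMap()
    table.put("com.cnn.www", "contents:", b"<html>...")
    table.put("com.cnn.www", "anchor:cnnsi.com", b"CNN")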

Tablets: The Unit of Distribution

Tables are split into tablets (100-200 MB), each holding a contiguous range of rows. Tablets are the unit of distribution and load balancing. As tables grow, tablets split automatically. This enables horizontal scaling without application changes.
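
As a rough sketch of that mechanism (real Bigtable records tablet locations in a METADATA table rooted in Chubby; the single in-process directory and all names below are illustrative assumptions), tablets are kept sorted by start row, a row key routes to the tablet whose range contains it, and a tablet splits once it crosses a size threshold:

    import bisect
    from dataclasses import dataclass

    TABLET_SPLIT_BYTES = 200 * 1024 * 1024   # tablets grow to roughly 100-200 MB before splitting

    @dataclass
    class Tablet:
        start_row: str        # inclusive lower bound of this tablet's contiguous row range
        end_row: str          # exclusive upper bound; "" is used here to mean "end of table"
        size_bytes: int = 0

    class TabletDirectory:
        """Toy directory of one table's tablets, kept sorted by start row."""

        def __init__(self):
            # A new table starts as a single tablet covering the whole row space.
            self.tablets = [Tablet(start_row="", end_row="")]

        def locate(self, row_key):
            # The responsible tablet is the last one whose start_row <= row_key.
            starts = [t.start_row for t in self.tablets]
            return self.tablets[bisect.bisect_right(starts, row_key) - 1]

        def maybe_split(self, tablet, split_row):
            # Once a tablet crosses the size threshold, split it into two contiguous halves.
            if tablet.size_bytes < TABLET_SPLIT_BYTES:
                return
            i = self.tablets.index(tablet)
            right = Tablet(start_row=split_row, end_row=tablet.end_row,
                           size_bytes=tablet.size_bytes // 2)
            tablet.end_row = split_row
            tablet.size_bytes //= 2
            self.tablets.insert(i + 1, right)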

Column Families for Locality

Columns are grouped into column families, which are the unit of access control and storage. Data in the same column family is stored together on disk. This allows applications to co-locate related data for efficient access patterns.
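
A short sketch of the convention: column keys are family:qualifier strings, and grouping a row's cells by family mirrors how each family's data is kept together on disk. The contents: and anchor: columns below come from the paper's webtable example; the helper functions themselves are illustrative.

    def column_key(family, qualifier=""):
        # Bigtable column keys have the form "family:qualifier"; families are
        # declared when the table is created, qualifiers can be arbitrary strings.
        return f"{family}:{qualifier}"

    def group_by_family(cells):
        """Group one row's cells by column family, the unit of access
        control and on-disk locality."""
        grouped = {}
        for col, value in cells.items():
            family, _, qualifier = col.partition(":")
            grouped.setdefault(family, {})[qualifier] = value
        return grouped

    # Webtable-style row: page contents plus anchor text from two referring sites.
    row_cells = {
        column_key("contents"): b"<html>...",
        column_key("anchor", "cnnsi.com"): b"CNN",
        column_key("anchor", "my.look.ca"): b"CNN.com",
    }
    print(group_by_family(row_cells))
    # {'contents': {'': b'<html>...'},
    #  'anchor': {'cnnsi.com': b'CNN', 'my.look.ca': b'CNN.com'}}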

LSM-Tree Storage Engine

Writes go to an in-memory buffer (the memtable), then flush to immutable sorted files (SSTables) on GFS. Reads merge data from the memtable and the SSTables. Compaction merges SSTables to reclaim space and reduce read amplification.
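
Below is a minimal sketch of that write/read path, with plain Python dicts standing in for the memtable and for SSTables on GFS (the flush threshold and all names are assumptions for illustration): writes buffer in memory, a minor compaction freezes the buffer into an immutable sorted run, reads consult the memtable and then the runs newest-to-oldest, and a major compaction merges everything into a single run.

    class ToyLSMStore:
        """Minimal model of a tablet's storage: an in-memory memtable plus
        a stack of immutable, sorted runs standing in for SSTables on GFS."""

        def __init__(self, memtable_limit=4):
            self.memtable = {}        # buffered recent writes: key -> value
            self.sstables = []        # immutable sorted runs, newest last
            self.memtable_limit = memtable_limit

        def write(self, key, value):
            self.memtable[key] = value
            if len(self.memtable) >= self.memtable_limit:
                self._minor_compaction()

        def _minor_compaction(self):
            # Freeze the memtable into an immutable sorted run and start a fresh buffer.
            self.sstables.append(dict(sorted(self.memtable.items())))
            self.memtable = {}

        def read(self, key):
            # Newest data wins: check the memtable first, then runs newest-to-oldest.
            if key in self.memtable:
                return self.memtable[key]
            for run in reversed(self.sstables):
                if key in run:
                    return run[key]
            return None

        def major_compaction(self):
            # Merge every run (and the memtable) into one, bounding read
            # amplification and reclaiming space from overwritten entries.
            merged = {}
            for run in self.sstables:
                merged.update(run)
            merged.update(self.memtable)
            self.sstables = [dict(sorted(merged.items()))]
            self.memtable = {}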

Single-Row Transactions

Bigtable provides atomic read-modify-write operations on single rows. No cross-row transactions—this constraint enables horizontal scaling. Applications requiring multi-row atomicity must implement it at the application layer.
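
The sketch below shows what that guarantee amounts to, using a per-row lock as an illustrative stand-in for however a tablet server serializes mutations (the class, the meta:crawl_count column, and the method names are not Bigtable's API): a read-modify-write touches exactly one row, and there is deliberately no way to lock two rows together.

    import threading
    from collections import defaultdict

    class SingleRowStore:
        """Toy store in which the row is the unit of atomicity."""

        def __init__(self):
            self._rows = defaultdict(dict)                  # row key -> {column -> value}
            self._row_locks = defaultdict(threading.Lock)   # one lock per row, nothing coarser

        def read_modify_write(self, row_key, mutate):
            # Atomically apply `mutate` to one row's cells; updating two rows
            # takes two separate calls and gets no combined atomicity.
            with self._row_locks[row_key]:
                cells = dict(self._rows[row_key])   # read a snapshot of the row
                mutate(cells)                       # modify it
                self._rows[row_key] = cells         # write every change back at once

    # Example: atomically bump a per-page crawl counter.
    store = SingleRowStore()
    store.read_modify_write(
        "com.cnn.www",
        lambda cells: cells.update({"meta:crawl_count": cells.get("meta:crawl_count", 0) + 1}),
    )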

Client-Controlled Locality

Row key design determines data distribution and access patterns. Applications encode locality into keys (e.g., reversed URLs: com.google.www). This gives clients direct control over performance characteristics.
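
A small sketch of the technique: the paper's webtable stores the page maps.google.com/index.html under the row key com.google.maps/index.html, so pages from the same domain land in a contiguous range of rows. The helper below (name invented for illustration) builds keys in that style.

    from urllib.parse import urlsplit

    def reversed_domain_row_key(url):
        """Build a webtable-style row key: reverse the hostname components
        so pages from the same domain sort next to each other."""
        parts = urlsplit(url)
        reversed_host = ".".join(reversed(parts.hostname.split(".")))
        return reversed_host + parts.path

    print(reversed_domain_row_key("http://maps.google.com/index.html"))   # com.google.maps/index.html
    print(reversed_domain_row_key("http://www.google.com/"))              # com.google.www/

With keys in this form, a scan over the prefix com.google. reads every page from Google-owned hosts as one contiguous row range, which is exactly the locality control described above.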

Deep Dive

By 2006, Google needed to store and query structured data across dozens of services:

  • Web Search: Billions of URLs, each with content, links, anchors, PageRank, crawl history
  • Google Earth: Petabytes of satellite imagery with spatial indexes
  • Google Finance: Time series of stock prices, news, financial data
  • Personalization: Per-user preferences, history, recommendations

Each project faced similar challenges but had different data patterns.

Google's Data Storage Requirements

Why not use a relational database?

  1. Scale: No RDBMS could handle Google's data volume on commodity hardware
  2. Cost: Commercial databases at this scale would cost billions
  3. Flexibility: Fixed schemas don't fit rapidly evolving products
  4. Control: Google needed to tune storage for specific access patterns

Why not use raw files (GFS)?

  1. No structure: Files don't support efficient point queries
  2. No indexing: Full scans for every query
  3. No locality: Related data scattered across files
  4. Application complexity: Every team reimplements storage logic

Bigtable sits between these extremes: more structure than raw files, more flexibility than an RDBMS.

Trade-offs

Single-Row Transactions Only
  • Advantage: Enables horizontal scaling without distributed coordination; simple programming model for most cases
  • Disadvantage: Applications requiring multi-row atomicity must implement it themselves or use a different database

LSM-Tree Storage
  • Advantage: Excellent write throughput; sequential I/O patterns; compression-friendly
  • Disadvantage: Read amplification (check multiple SSTables); write amplification from compaction; space amplification

Sparse Column Model
  • Advantage: Schema flexibility; efficient storage of sparse data; dynamic columns without DDL
  • Disadvantage: No schema enforcement; application must handle missing columns; harder to reason about data model

Row Key Determines Distribution
  • Advantage: Client controls data locality and access patterns; predictable performance with good key design
  • Disadvantage: Poor key design causes hot spots; changing access patterns may require key redesign

Built on GFS + Chubby
  • Advantage: Leverages battle-tested infrastructure; separation of concerns; fault tolerance built in
  • Disadvantage: Requires running complex distributed systems (or using a managed service); operational overhead
