SystemExpertsSystemExperts
Pricing

Patterns

35 items

Horizontal Scaling Pattern

15mbeginner

Retry with Backoff Pattern

15mbeginner

Replication Pattern

25mintermediate

Caching Strategies Pattern

25mintermediate

Persistent Connections Pattern

20mintermediate

Load Balancing Pattern

20mintermediate

Fan-out Pattern

20mintermediate

Fan-in Pattern

20mintermediate

Circuit Breaker Pattern

20mintermediate

Eventual Consistency Pattern

25mintermediate

Queue-based Load Leveling Pattern

20mintermediate

Bloom Filters Pattern

20mintermediate

Time-Series Storage Pattern

20mintermediate

Bulkhead Pattern

20mintermediate

Batch Processing Pattern

20mintermediate

Write-Ahead Log Pattern

20mintermediate

API Gateway Pattern

20mintermediate

Backend for Frontend Pattern

20mintermediate

Sidecar Pattern

20mintermediate

Idempotency Pattern

20mintermediate

Rate Limiting Pattern

20mintermediate

Backpressure Pattern

20mintermediate

Pub/Sub Pattern

25mintermediate

Strong Consistency Pattern

30madvanced

Conflict Resolution Pattern

25madvanced

Leader Election Pattern

25madvanced

Consensus Protocols Pattern

30madvanced

CQRS Pattern

28madvanced

LSM Trees Pattern

25madvanced

Sharding Pattern

25madvanced

Event Sourcing Pattern

30madvanced

Stream Processing Pattern

25madvanced

Change Data Capture Pattern

25madvanced

Distributed Locking Pattern

25madvanced

Two-Phase Commit Pattern

25madvanced
System Design Pattern
Storagelsm-treewrite-optimizedcompactionmemtablesstableadvanced

LSM Trees Pattern

Write-optimized storage with log-structured merge trees

Used in: RocksDB, Cassandra, HBase|25 min read

Summary

Log-Structured Merge Trees (LSM Trees) are write-optimized data structures used by databases like Cassandra, RocksDB, LevelDB, and HBase. Writes go to an in-memory buffer (memtable), which is periodically flushed to immutable sorted files (SSTables) on disk. Background compaction merges SSTables to maintain read performance. LSM trees achieve 10-100x better write throughput than B-trees by converting random writes to sequential I/O. The trade-off is read amplification (checking multiple levels) and write amplification (data written multiple times during compaction).

Key Takeaways

Writes Are Always Fast

Writes go to memory (memtable) first. No disk seek required. Periodically flush to disk as sequential write. This is why LSM is write-optimized.

Immutable SSTables

Once written, SSTable files never change. Simplifies concurrency (no locks needed for reads). Enables efficient caching. Updates/deletes create new entries, not modifications.

Compaction is Essential

Without compaction, reads must check all SSTables. Compaction merges SSTables, removes obsolete entries, maintains sorted order. Different strategies: leveled, tiered, FIFO.

LSM Tree Architecture

Write Path: 1. Write to WAL (durability) 2. Insert into memtable (sorted structure, e.g., red-black tree) 3. When memtable full, flush to SSTable 4. Background compaction merges SSTables

Read Path: 1. Check memtable 2. Check each level's SSTables (newest first) 3. Use bloom filters to skip SSTables 4. Binary search within SSTable

Summary

Log-Structured Merge Trees (LSM Trees) are write-optimized data structures used by databases like Cassandra, RocksDB, LevelDB, and HBase. Writes go to an in-memory buffer (memtable), which is periodically flushed to immutable sorted files (SSTables) on disk. Background compaction merges SSTables to maintain read performance. LSM trees achieve 10-100x better write throughput than B-trees by converting random writes to sequential I/O. The trade-off is read amplification (checking multiple levels) and write amplification (data written multiple times during compaction).

Key Takeaways

Writes Are Always Fast

Writes go to memory (memtable) first. No disk seek required. Periodically flush to disk as sequential write. This is why LSM is write-optimized.

Immutable SSTables

Once written, SSTable files never change. Simplifies concurrency (no locks needed for reads). Enables efficient caching. Updates/deletes create new entries, not modifications.

Compaction is Essential

Without compaction, reads must check all SSTables. Compaction merges SSTables, removes obsolete entries, maintains sorted order. Different strategies: leveled, tiered, FIFO.

Read Amplification

Point lookup may check memtable + multiple SSTables. Bloom filters help skip SSTables. Still slower than B-tree point lookups. Range scans can be efficient.

Write Amplification

Data written to memtable, then SSTable, then compacted multiple times. Total writes = 10-30x logical writes. Trade-off for write throughput.

Space Amplification

Obsolete entries exist until compacted. Multiple versions of same key in different levels. Need more space than logical data size.

Pattern Details

LSM Tree Architecture

Write Path: 1. Write to WAL (durability) 2. Insert into memtable (sorted structure, e.g., red-black tree) 3. When memtable full, flush to SSTable 4. Background compaction merges SSTables

Read Path: 1. Check memtable 2. Check each level's SSTables (newest first) 3. Use bloom filters to skip SSTables 4. Binary search within SSTable

Trade-offs

AspectAdvantageDisadvantage

Premium Content

Sign in to access this content or upgrade for full access.