
Caching Strategies

From browser to database - a complete guide to caching at every layer

Foundation knowledge | 35 min read

Summary

Caching stores copies of frequently accessed data in faster storage to reduce latency and load on origin systems. Caches exist at every layer: browser, CDN, reverse proxy, application, and database. The key challenge is cache invalidation—ensuring cached data stays fresh. Different caching patterns (cache-aside, read-through, write-through, write-behind) offer different consistency-performance trade-offs. Understanding cache eviction policies (LRU, LFU, TTL) and problems like thundering herd and cache stampede is essential for building reliable cached systems.

Key Takeaways

The Two Hard Problems: Naming and Cache Invalidation

Knowing when cached data is stale is genuinely difficult. Invalidate too aggressively and you defeat the purpose of caching; invalidate too lazily and you serve stale data. There's no universal solution: each use case needs its own strategy.

Cache-Aside is the Most Common Pattern

The application checks the cache first, falls back to the database on a miss, then populates the cache with the result. It's explicit, flexible, and lets you cache exactly what you need. Most web applications use this pattern with Redis or Memcached.
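
A rough cache-aside sketch in Python, assuming redis-py, a local Redis instance, and a hypothetical fetch_user_from_db helper; the key format and 5-minute TTL are illustrative:

```python
import json

import redis

r = redis.Redis(host="localhost", port=6379)  # assumes a local Redis instance
CACHE_TTL_SECONDS = 300  # illustrative TTL; tune per use case


def fetch_user_from_db(user_id: int) -> dict:
    # Stand-in for a real database query (hypothetical helper).
    return {"id": user_id, "name": "example"}


def get_user(user_id: int) -> dict:
    """Cache-aside read: check the cache, fall back to the database, then populate the cache."""
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)  # cache hit
    user = fetch_user_from_db(user_id)  # cache miss: fetch from the origin
    r.setex(key, CACHE_TTL_SECONDS, json.dumps(user))  # populate the cache with a TTL
    return user
```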

Write-Through Prevents Stale Reads at the Cost of Write Latency

Writing to the cache and the database together ensures the cache is always fresh. But every write now carries the extra cache latency. Use it when read consistency matters more than write speed.
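
A write-through sketch under the same assumptions (redis-py plus a hypothetical save_user_to_db helper):

```python
import json

import redis

r = redis.Redis(host="localhost", port=6379)


def save_user_to_db(user_id: int, user: dict) -> None:
    # Stand-in for a real database write (hypothetical helper).
    pass


def update_user(user_id: int, user: dict) -> None:
    """Write-through: update the database and the cache in the same operation."""
    save_user_to_db(user_id, user)              # 1. write to the source of truth
    r.set(f"user:{user_id}", json.dumps(user))  # 2. refresh the cache immediately
    # Every write now pays the extra cache round trip, but reads never see stale data.
```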

The Thundering Herd Can Take Down Your Database

When a popular cache key expires, hundreds of requests simultaneously hit the database. Solutions: staggered TTLs, probabilistic early expiration, request coalescing, or cache locks. This is a real production incident pattern.
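
One way to tame the herd is a short-lived cache lock (a form of request coalescing). The sketch below is only an illustration using redis-py: SET NX acts as the lock, and a hypothetical expensive_db_query stands in for the origin fetch.

```python
import json
import time

import redis

r = redis.Redis(host="localhost", port=6379)


def expensive_db_query(key: str) -> dict:
    # Stand-in for the slow origin query that the herd would otherwise hammer.
    return {"key": key}


def get_with_lock(key: str, ttl: int = 300, lock_ttl: int = 10) -> dict:
    for _ in range(20):  # bounded wait instead of looping forever
        cached = r.get(key)
        if cached is not None:
            return json.loads(cached)
        # SET NX EX acts as a short-lived lock: only one caller rebuilds the expired entry.
        if r.set(f"lock:{key}", "1", nx=True, ex=lock_ttl):
            try:
                value = expensive_db_query(key)
                r.setex(key, ttl, json.dumps(value))
                return value
            finally:
                r.delete(f"lock:{key}")
        time.sleep(0.05)  # lost the race: back off briefly, then re-check the cache
    return expensive_db_query(key)  # last resort if the cache never fills
```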

CDNs Are Geographically Distributed Caches

CDNs cache static content at edge locations worldwide. A user in Tokyo gets content from a Tokyo edge server, not your US origin. This reduces latency by 100-200ms for global users. CDN invalidation is its own challenge.
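
As an illustration (not specific to any CDN), static responses are typically made edge-cacheable with Cache-Control headers, and invalidation is often sidestepped by versioning asset URLs; the values below are assumptions to show the shape:

```python
# Illustrative Cache-Control header for a static asset served through a CDN.
static_asset_headers = {
    # Browsers may keep the asset for 5 minutes; CDN edges (s-maxage) for a day.
    "Cache-Control": "public, max-age=300, s-maxage=86400",
}

# Invalidation is often avoided entirely by putting a content hash in the filename,
# so a deploy publishes a new URL instead of purging the old one at every edge:
asset_url = "/static/app.3f9c2a1b.js"  # hypothetical hashed filename
```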

Not Everything Should Be Cached

Caching adds complexity. If data changes frequently, is rarely accessed, or must always be fresh, caching may hurt more than help. Cache the 20% of data that serves 80% of requests.

Deep Dive

Caching stores copies of data in faster storage to reduce latency and origin load.

The speed hierarchy (typical latency, fastest to slowest):

  • L1 CPU cache: 1 ns
  • L2 CPU cache: 4 ns
  • RAM: 100 ns
  • SSD: 0.1 ms
  • Redis (local): 0.5 ms
  • Redis (remote): 1-5 ms
  • Database query: 1-100 ms
  • Network API call: 50-500 ms

Caching exploits this hierarchy—store frequently accessed data in faster tiers.

When to cache:

  • Data is expensive to compute or fetch
  • Data is accessed frequently
  • Data doesn't change too often
  • Stale data is acceptable (briefly)

Caching Reduces Load and Latency

Cache metrics:

  • Hit rate (hit ratio): hits / (hits + misses), the fraction of requests served from the cache
  • Miss rate: 1 - hit rate, the fraction of requests requiring an origin fetch
  • Average latency saved per request: (origin_latency - cache_latency) × hit_rate

Example calculation:

Origin latency: 100ms
Cache latency: 2ms
Hit rate: 90%

Avg latency = (0.9 × 2ms) + (0.1 × 100ms) = 11.8ms
88% latency reduction!
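
The same arithmetic as a tiny helper, handy for sanity-checking hit-rate targets:

```python
def effective_latency(hit_rate: float, cache_ms: float, origin_ms: float) -> float:
    """Average latency seen by clients, weighted by the cache hit rate."""
    return hit_rate * cache_ms + (1 - hit_rate) * origin_ms


avg = effective_latency(0.9, 2, 100)   # 11.8 ms, matching the example above
reduction = 1 - avg / 100              # ~0.88, i.e. an 88% latency reduction
print(f"{avg:.1f} ms average, {reduction:.0%} reduction")
```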

Trade-offs

| Aspect | Advantage | Disadvantage |
|---|---|---|
| TTL-based vs event-based invalidation | TTL is simple and requires no extra infrastructure; event-based ensures immediate freshness | TTL serves stale data until expiry; event-based requires pub/sub infrastructure and adds complexity |
| Write-through vs write-behind | Write-through keeps the cache consistent; write-behind minimizes write latency | Write-through adds latency to every write; write-behind risks data loss if the cache fails |
| Local vs distributed cache | A local cache has zero network latency; a distributed cache is shared across instances | A local cache requires cross-instance invalidation; a distributed cache adds a network round trip |
| Aggressive vs conservative caching | Aggressive caching maximizes performance; conservative caching ensures freshness | Aggressive caching risks stale data; conservative caching reduces the benefit of the cache |
