The Ultimate System Design Interview Cheat Sheet (2026)

You're about to walk into a system design interview. You've studied for weeks. But can you recall the exact numbers for latency, throughput, and storage when put on the spot?

This cheat sheet distills everything you need to know into a quick reference. Bookmark it. Print it. Review it the night before your interview.

The Interview Framework

Use this structure for every system design interview:

┌─────────────────────────────────────────────────────────────┐
│ STEP 1: REQUIREMENTS (5 min)                                │
│ • Functional: What features?                                │
│ • Non-functional: Scale, latency, availability?             │
│ • Constraints: Budget, timeline, existing systems?          │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│ STEP 2: ESTIMATION (3 min)                                  │
│ • Users: DAU, MAU                                           │
│ • Traffic: QPS, peak QPS                                    │
│ • Storage: Data size, growth rate                           │
│ • Bandwidth: Read/write throughput                          │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│ STEP 3: HIGH-LEVEL DESIGN (10 min)                          │
│ • Core components and their responsibilities                │
│ • Data flow: How does a request travel?                     │
│ • Data stores: What goes where?                             │
│ • APIs: Key endpoints                                       │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│ STEP 4: DEEP DIVE (20 min)                                  │
│ • Pick 2-3 critical components                              │
│ • Data models and schemas                                   │
│ • Algorithms and data structures                            │
│ • Handle failures and edge cases                            │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│ STEP 5: WRAP-UP (7 min)                                     │
│ • Bottlenecks and scaling strategies                        │
│ • Trade-offs and alternatives considered                    │
│ • Future improvements                                       │
│ • Monitoring and observability                              │
└─────────────────────────────────────────────────────────────┘

Capacity Estimation Numbers

Memorize these. Interviewers expect you to do quick math.

Time Conversions

Unit	Seconds
1 minute	60
1 hour	3,600
1 day	86,400 (~100K)
1 month	2.6M (~2.5M)
1 year	31.5M (~30M)

Quick QPS Estimation

Daily Active Users → QPS

Formula: QPS = (DAU × actions_per_user) / 86,400

Quick rule: 1M DAU ≈ 12 QPS (assuming 1 action/user/day)

Examples:
• 10M DAU, 10 actions/user = 1,200 QPS
• 100M DAU, 1 action/user = 1,200 QPS
• 1B DAU, 1 action/user = 12,000 QPS

Peak QPS: 2-3x average (plan for peaks)
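
The formula above can be sketched in a few lines of Python. The function names and the default peak factor of 3 are illustrative choices, not part of any standard.

```python
SECONDS_PER_DAY = 86_400


def average_qps(dau: int, actions_per_user: float) -> float:
    """Average queries per second: (DAU x actions per user) / seconds per day."""
    return dau * actions_per_user / SECONDS_PER_DAY


def peak_qps(avg_qps: float, peak_factor: float = 3.0) -> float:
    """Peak QPS using the 2-3x rule of thumb (assumed default: 3x)."""
    return avg_qps * peak_factor


avg = average_qps(10_000_000, 10)  # 10M DAU, 10 actions per user
print(f"average: {avg:,.0f} QPS, peak: {peak_qps(avg):,.0f} QPS")
```

In an interview you would round the result (here to roughly 1,200 average QPS) rather than quote exact digits.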

Storage Estimation

Data Type	Size
Character (UTF-8)	1-4 bytes
Integer (32-bit)	4 bytes
Long (64-bit)	8 bytes
UUID	16 bytes
Timestamp	8 bytes
URL (average)	100 bytes
Email (average)	50 bytes
Tweet (280 chars)	~500 bytes (with metadata)
Image (compressed)	100KB - 500KB
Video (1 min, compressed)	10MB - 50MB

Storage Units

Unit	Bytes	Approximation
KB	1,000	10³
MB	1,000,000	10⁶
GB	1,000,000,000	10⁹
TB	1,000,000,000,000	10¹²
PB	10¹⁵	1,000 TB
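
As a worked example of combining the two tables, here is a small Python sketch. The workload (100M tweets per day at ~500 bytes each) is a made-up assumption for illustration, and the helper names are hypothetical.

```python
def storage_bytes(items_per_day: int, bytes_per_item: int, days: int = 1) -> int:
    """Raw storage needed, ignoring replication, indexes, and compression."""
    return items_per_day * bytes_per_item * days


def humanize(n_bytes: float) -> str:
    """Format a byte count using the decimal units from the table above."""
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if n_bytes < 1000:
            return f"{n_bytes:.1f} {unit}"
        n_bytes /= 1000
    return f"{n_bytes:.1f} PB"


daily = storage_bytes(100_000_000, 500)            # assumed workload
yearly = storage_bytes(100_000_000, 500, days=365)
print("per day:", humanize(daily))
print("per year:", humanize(yearly))
```

That works out to 50 GB per day, or about 18 TB per year before replication. Multiply by your replication factor (often 3) to get a more realistic figure.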

Latency Numbers Every Developer Should Know

L1 cache reference                    0.5 ns
L2 cache reference                      7 ns
Main memory reference                 100 ns
SSD random read                    16,000 ns (16 µs)
HDD disk seek                  10,000,000 ns (10 ms)
Send 1 KB over 1 Gbps network     10,000 ns (10 µs)
Read 1 MB from memory            250,000 ns (250 µs)
Read 1 MB from SSD             1,000,000 ns (1 ms)
Read 1 MB from HDD            20,000,000 ns (20 ms)
Send packet CA → Netherlands → CA  150,000,000 ns (150 ms)

Key takeaways:

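Memory is roughly 100x faster than SSD for random access (100 ns vs 16 µs), but only ~4x faster for sequential 1 MB reads
SSD is ~20x faster than HDD for sequential reads (1 ms vs 20 ms per MB), and far more for random access
Network latency dominates in distributed systems
Cross-continental round trip: ~150 ms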

Server Capacity Rules of Thumb

Resource	Typical Capacity
Single server QPS (web)	1,000 - 10,000
Single server QPS (API)	10,000 - 50,000
Single server QPS (static)	100,000+
Redis QPS	100,000+
PostgreSQL QPS (read)	10,000 - 50,000
PostgreSQL QPS (write)	1,000 - 10,000
WebSocket connections	100,000 per server

Core Building Blocks

Load Balancer

Purpose: Distribute traffic across multiple servers

Types:

L4 (Transport): Routes based on IP/port, faster
L7 (Application): Routes based on HTTP content, smarter

Algorithms:

Algorithm	Use Case
Round Robin	Equal servers, stateless
Least Connections	Variable request times
IP Hash	Session affinity needed
Weighted	Different server capacities
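
A minimal Python sketch of three of these algorithms follows. The class and function names are hypothetical, and real load balancers such as Nginx or HAProxy add health checks, weights, and concurrency handling that this omits.

```python
import hashlib
import itertools


class RoundRobin:
    """Hand out servers in a fixed rotation."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Pick the server with the fewest in-flight requests."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1


def ip_hash(client_ip: str, servers: list) -> str:
    """Map a client IP to the same server every time (session affinity)."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

Note that this simple IP hash remaps most clients when the server list changes, which is the problem consistent hashing addresses.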

Cache

Purpose: Store frequently accessed data in fast storage (memory)

Patterns:

Cache-Aside (Lazy Loading)
1. Check cache
2. If miss → read from DB → write to cache
3. Return data

Read-Through
1. App always reads from cache
2. Cache fetches from DB on miss

Write-Through
1. Write to cache
2. Cache writes to DB synchronously

Write-Behind (Write-Back)
1. Write to cache
2. Cache writes to DB asynchronously
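
To make the read and write paths concrete, here is a hedged Python sketch of cache-aside and write-through. Plain dicts stand in for the cache and the database, and the class names and `db_reads` counter are illustrative assumptions.

```python
class CacheAsideStore:
    """Cache-aside: the application talks to both the cache and the DB."""

    def __init__(self, db: dict):
        self.db = db          # stand-in for the database
        self.cache = {}       # stand-in for Redis/Memcached
        self.db_reads = 0     # counter that makes cache hits visible

    def get(self, key):
        if key in self.cache:             # 1. check cache
            return self.cache[key]
        self.db_reads += 1
        value = self.db.get(key)          # 2. miss -> read from DB
        if value is not None:
            self.cache[key] = value       #    ... then write to cache
        return value                      # 3. return data

    def update(self, key, value):
        self.db[key] = value              # write to the DB first
        self.cache.pop(key, None)         # then invalidate the cached copy


class WriteThroughStore(CacheAsideStore):
    """Write-through: here the wrapper plays the role of the cache layer."""

    def update(self, key, value):
        self.cache[key] = value           # 1. write to cache
        self.db[key] = value              # 2. synchronously write to DB
```

Write-behind would look like `WriteThroughStore` with the DB write pushed onto a queue and flushed later, trading durability for lower write latency.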

Eviction Policies:

LRU (Least Recently Used): Evict the entry that has gone longest without being accessed
LFU (Least Frequently Used): Evict the entry with the lowest access count
TTL (Time to Live): Expire after duration
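
LRU is the policy interviewers most often ask you to implement. One common way to sketch it in Python uses `collections.OrderedDict`; the class name and `capacity` parameter are illustrative.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-size cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data = OrderedDict()  # oldest access first, newest last

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least recently used


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")        # "a" is now the most recently used
cache.put("c", 3)     # evicts "b"
```

Both `get` and `put` run in O(1), which is the property to call out if asked.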

Common Issues:

Cache stampede: Many requests hit DB when cache expires
Stale data: Cache out of sync with DB
Cold start: Empty cache after restart

Database

SQL vs NoSQL Decision:

Choose SQL When	Choose NoSQL When
ACID required	Eventual consistency OK
Complex queries	Simple key-value lookups
Schema stability	Schema flexibility needed
Strong relationships	Denormalized data
< 10TB data	Massive scale (PB+)

Database Types:

Type	Examples	Use Case
Relational	PostgreSQL, MySQL	Transactions, complex queries
Document	MongoDB, DynamoDB	Flexible schemas, JSON data
Wide-Column	Cassandra, HBase	Time-series, write-heavy
Key-Value	Redis, Memcached	Caching, sessions
Graph	Neo4j, Dgraph	Relationships, social networks
Time-Series	InfluxDB, TimescaleDB	Metrics, IoT data

Message Queue

Purpose: Decouple producers from consumers, handle async processing

Types:

Point-to-Point: One consumer per message
Pub/Sub: Multiple consumers per message

When to Use:

Async processing (video encoding, email)
Load leveling (smooth traffic spikes)
Decoupling services (microservices)
Event sourcing (audit logs, replays)

Delivery Guarantees:

Guarantee	Description	Use Case
At-most-once	May lose messages	Metrics (loss OK)
At-least-once	May duplicate	Most applications
Exactly-once	No loss, no duplicates; hardest to achieve	Financial transactions
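
In practice, exactly-once behavior is often approximated by combining at-least-once delivery with an idempotent consumer. A minimal sketch, assuming every message carries a unique ID (the class name and in-memory `set` are illustrative; a real system would persist seen IDs):

```python
class IdempotentConsumer:
    """Process each message ID at most once, even if it is redelivered."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()  # assumption: in production this is durable storage

    def handle(self, message_id: str, payload) -> bool:
        if message_id in self._seen:
            return False              # duplicate delivery, skip it
        self._handler(payload)        # apply the side effect
        self._seen.add(message_id)    # record only after success
        return True


balance = []
consumer = IdempotentConsumer(balance.append)
consumer.handle("msg-1", 50)
consumer.handle("msg-1", 50)  # redelivery is ignored
```

If the process crashes between the handler call and recording the ID, the message is processed twice on retry, so the two steps need to share a transaction for a true guarantee.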

Scaling Patterns

Horizontal vs Vertical

Vertical Scaling (Scale Up)
┌─────────────────┐        ┌─────────────────┐
│  Small Server   │   →    │  Bigger Server  │
│  (4 CPU, 16GB)  │        │  (64 CPU, 256GB)│
└─────────────────┘        └─────────────────┘

✅ Simple
❌ Has limits, single point of failure


Horizontal Scaling (Scale Out)
┌─────────────────┐        ┌───┐ ┌───┐ ┌───┐ ┌───┐
│  Small Server   │   →    │ S │ │ S │ │ S │ │ S │
└─────────────────┘        └───┘ └───┘ └───┘ └───┘

✅ Scales far beyond a single machine, fault tolerant
❌ More complex, requires stateless design

Database Scaling

Read Replicas:

┌────────────┐
│   Primary  │──────────────┐
│  (Writes)  │              │
└────────────┘              │
      │                     │
      ▼                     ▼
┌────────────┐        ┌────────────┐
│  Replica 1 │        │  Replica 2 │
│  (Reads)   │        │  (Reads)   │
└────────────┘        └────────────┘

Use for read-heavy workloads
Async replication (replicas can lag, so reads may be briefly stale)

Sharding:

┌──────────────────────────────────────┐
│           Application                 │
└──────────────────────────────────────┘
                 │
    ┌────────────┼────────────┐
    ▼            ▼            ▼
┌───────┐   ┌───────┐   ┌───────┐
│Shard 1│   │Shard 2│   │Shard 3│
│ A-H   │   │ I-P   │   │ Q-Z   │
└───────┘   └───────┘   └───────┘

Sharding Strategies:

Strategy	Pros	Cons
Hash-based	Even distribution	Hard to range query
Range-based	Range queries easy	Hotspots possible
Geographic	Data locality	Complex routing
Directory-based	Flexible	Lookup overhead
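
The first two strategies can be sketched as routing functions in Python. The function names are hypothetical, shard indexes are 0-based, and the range boundaries mirror the A-H / I-P / Q-Z split in the diagram above.

```python
import bisect
import hashlib


def hash_shard(key: str, num_shards: int) -> int:
    """Hash-based: spreads keys evenly, but adjacent keys land on different shards."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Inclusive upper bound of each shard's key range: A-H, I-P, Q-Z.
RANGE_UPPER_BOUNDS = ["H", "P", "Z"]


def range_shard(key: str) -> int:
    """Range-based: keys stay sorted across shards, so range scans are cheap."""
    first_letter = key[0].upper()
    index = bisect.bisect_left(RANGE_UPPER_BOUNDS, first_letter)
    return min(index, len(RANGE_UPPER_BOUNDS) - 1)  # clamp anything past "Z"


print(range_shard("alice"), range_shard("ivan"), range_shard("zoe"))
```

Changing `num_shards` in `hash_shard` remaps most keys, which is why resharding is painful and why consistent hashing is worth knowing.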

Consistent Hashing

        ┌──────────────────────────┐
        │          Ring            │
        │                          │
     Node A                     Node B
        ○─────────────────────────○
       ╱                           ╲
      ╱                             ╲
     ○───────────────────────────────○
  Node D                          Node C

• Hash key → position on ring
• Walk clockwise to find responsible node
• Adding/removing a node only moves the keys between it and its neighbor (~1/N of all keys)
• Virtual nodes for better distribution
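
A compact Python sketch of a hash ring with virtual nodes is shown below. The class name, the `vnodes` default of 100, and the `node#i` labeling scheme are assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a position on the ring (a 64-bit integer)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class ConsistentHashRing:
    def __init__(self, nodes=(), vnodes: int = 100):
        self.vnodes = vnodes
        self._positions = []   # sorted ring positions
        self._owner = {}       # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):          # place many virtual nodes
            pos = _hash(f"{node}#{i}")
            self._owner[pos] = node
            bisect.insort(self._positions, pos)

    def remove_node(self, node: str) -> None:
        self._positions = [p for p in self._positions if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def get_node(self, key: str) -> str:
        if not self._positions:
            raise LookupError("ring is empty")
        # Walk clockwise: first position after the key's hash, wrapping around.
        idx = bisect.bisect_right(self._positions, _hash(key))
        return self._owner[self._positions[idx % len(self._positions)]]
```

With four nodes, adding the fourth should move roughly a quarter of the keys, all of them onto the new node, while the rest stay where they were.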

Common Design Patterns

Rate Limiting

Algorithms:

Token Bucket:

Bucket: 100 tokens (capacity)
Refill: 10 tokens/second

Request arrives:
  tokens = min(100, tokens + 10 × seconds_since_last_refill)
  if tokens >= 1:
    tokens -= 1
    allow
  else:
    reject
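
A runnable Python sketch of the token bucket, with the refill computed lazily from elapsed time on each request. The class name and the injectable `clock` parameter (handy for testing) are illustrative assumptions.

```python
import time


class TokenBucket:
    def __init__(self, capacity: int = 100, refill_rate: float = 10.0,
                 clock=time.monotonic):
        self.capacity = capacity        # maximum burst size
        self.refill_rate = refill_rate  # tokens added per second
        self._clock = clock
        self.tokens = float(capacity)   # start with a full bucket
        self._last = clock()

    def allow(self) -> bool:
        now = self._clock()
        # Refill for the time elapsed since the last call, capped at capacity.
        elapsed = now - self._last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self._last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


limiter = TokenBucket(capacity=100, refill_rate=10.0)
print("allowed" if limiter.allow() else "rejected")
```

The bucket allows bursts up to `capacity` while holding the long-run average to `refill_rate` requests per second.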

Sliding Window:

Window: 1 minute
Limit: 100 requests

Request arrives:
  count = requests in last minute
  if count < 100:
    allow
  else:
    reject
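
One way to implement this is a sliding window log, which stores a timestamp per accepted request. The sketch below is a single-process Python version; the class name and `clock` parameter are assumptions, and a distributed limiter would typically keep the log in a shared store such as Redis.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    def __init__(self, limit: int = 100, window_seconds: float = 60.0,
                 clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self._clock = clock
        self._timestamps = deque()  # accepted request times, oldest first

    def allow(self) -> bool:
        now = self._clock()
        # Drop timestamps that have fallen out of the window.
        while self._timestamps and self._timestamps[0] <= now - self.window:
            self._timestamps.popleft()
        if len(self._timestamps) < self.limit:
            self._timestamps.append(now)
            return True
        return False
```

Memory grows with the limit (one timestamp per request), which is the trade-off against cheaper fixed-window counters.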

Circuit Breaker

     ┌─────────┐     Failures     ┌────────┐
     │ CLOSED  │ ───────────────▶ │  OPEN  │
     │         │                  │        │
     └─────────┘                  └────────┘
          ▲                            │
          │                            │
          │    Success      Timeout    │
          │     ┌────────────┐         │
          └─────│ HALF-OPEN  │◀────────┘
                └────────────┘

States:
• Closed: Normal operation
• Open: Fail fast, don't call downstream
• Half-Open: Allow some requests to test
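
The state machine can be sketched in Python as follows. The thresholds, the `CircuitOpenError` exception, and the single-probe half-open behavior are simplifying assumptions; production libraries add per-call timeouts, metrics, and thread safety.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is not attempted."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self.state = "closed"
        self._failures = 0
        self._opened_at = 0.0

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("failing fast")   # don't call downstream
            self.state = "half-open"                     # timeout passed: probe
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self.state == "half-open" or self._failures >= self.failure_threshold:
                self.state = "open"
                self._opened_at = self._clock()
            raise
        self._failures = 0          # success closes the circuit
        self.state = "closed"
        return result
```

Callers wrap each downstream call as `breaker.call(fetch_profile, user_id)`, where `fetch_profile` is whatever function talks to the remote service.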

Saga Pattern (Distributed Transactions)

Service A    Service B    Service C
   │             │            │
   ├────────────▶│            │
   │   Step 1    │            │
   │             ├───────────▶│
   │             │   Step 2   │
   │             │            │ Step 3 fails!
   │             │◀───────────┤
   │◀────────────┤ Compensate │
   │ Compensate  │            │
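
The flow in the diagram can be sketched as an orchestrated saga in Python: each step is paired with a compensating action, and a failure undoes the completed steps in reverse order. The `run_saga` function and the order-processing step names are hypothetical.

```python
def run_saga(steps) -> bool:
    """Run (action, compensation) pairs; roll back completed steps on failure."""
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):  # compensate newest first
                undo()
            return False
        completed.append(compensate)
    return True


log = []


def ship_order():
    raise RuntimeError("step 3 fails")


ok = run_saga([
    (lambda: log.append("reserve stock"), lambda: log.append("release stock")),
    (lambda: log.append("charge card"), lambda: log.append("refund card")),
    (ship_order, lambda: log.append("cancel shipment")),
])
print(ok, log)
```

Compensations are new forward actions (a refund, not a rollback), so they must be idempotent and able to succeed even when retried.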

Data Consistency

CAP Theorem

        Consistency
            ╱╲
           ╱  ╲
          ╱    ╲
         ╱  CP  ╲
        ╱────────╲
       ╱          ╲
      ╱     CA     ╲
     ╱──────────────╲
    ╱________________╲
Availability      Partition
                  Tolerance

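Pick 2 of 3. In practice network partitions cannot be ruled out in a distributed system, so P is mandatory and the real choice during a partition is between C and A.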

CP: MongoDB, HBase, ZooKeeper, etcd
AP: Cassandra, DynamoDB, CouchDB
CA: Traditional RDBMS (single node)

Consistency Models

Model	Description	Example
Strong	All reads see latest write	Bank balance
Eventual	Reads eventually see writes	Social media likes
Causal	Causally related writes are seen in the same order by everyone	Chat messages
Read-your-writes	Users see own writes immediately	Profile updates

ACID vs BASE

ACID	BASE
Atomicity	Basically Available
Consistency	Soft state
Isolation	Eventually consistent
Durability
For: Transactions	For: Scale

Common Architecture Patterns

Microservices

┌─────────┐   ┌─────────┐   ┌─────────┐
│ User    │   │ Order   │   │ Payment │
│ Service │   │ Service │   │ Service │
└────┬────┘   └────┬────┘   └────┬────┘
     │             │             │
     └─────────────┼─────────────┘
                   │
           ┌───────┴───────┐
           │  API Gateway  │
           └───────────────┘

Pros: Independent deployment, scaling, and tech stack
Cons: Distributed complexity, network latency, data consistency

Event-Driven

Producer → Event Bus → Consumer 1
              │
              └──────→ Consumer 2
              │
              └──────→ Consumer 3

Use when:

Loose coupling needed
Multiple consumers for same event
Async processing acceptable

CQRS (Command Query Responsibility Segregation)

       Write Path                 Read Path
           │                          │
           ▼                          ▼
    ┌─────────────┐            ┌──────────────┐
    │  Write DB   │───sync────▶│   Read DB    │
    │ (Normalized)│            │(Denormalized)│
    └─────────────┘            └──────────────┘

Use when:

Read and write patterns differ significantly
High read:write ratio
Complex queries on read side

API Design Quick Reference

REST Best Practices

GET    /users          # List users
GET    /users/123      # Get user 123
POST   /users          # Create user
PUT    /users/123      # Update user 123 (full)
PATCH  /users/123      # Update user 123 (partial)
DELETE /users/123      # Delete user 123

GET    /users/123/orders     # User's orders
POST   /users/123/orders     # Create order for user

HTTP Status Codes

Code	Meaning	Use
200	OK	Success
201	Created	POST success
204	No Content	DELETE success
400	Bad Request	Malformed or invalid input
401	Unauthorized	Auth required
403	Forbidden	No permission
404	Not Found	Resource missing
429	Too Many Requests	Rate limited
500	Internal Server Error	Server bug
503	Service Unavailable	Overloaded

Pagination

Offset-based (simple, but rows can be skipped or repeated if data changes between requests):
GET /users?offset=20&limit=10

Cursor-based (stable, efficient):
GET /users?cursor=abc123&limit=10
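
A hedged Python sketch of cursor-based pagination follows, assuming users are sorted by a unique, positive, ascending `id`, with the cursor being an opaque base64 encoding of the last ID seen. The `list_users` function and the cursor format are illustrative, not a standard.

```python
import base64


def encode_cursor(last_id: int) -> str:
    return base64.urlsafe_b64encode(str(last_id).encode()).decode()


def decode_cursor(cursor: str) -> int:
    return int(base64.urlsafe_b64decode(cursor.encode()).decode())


def list_users(users, cursor=None, limit=10):
    """Return (page, next_cursor); `users` must be sorted by ascending id."""
    after_id = decode_cursor(cursor) if cursor else 0
    # In SQL this is: WHERE id > :after_id ORDER BY id LIMIT :limit
    page = [u for u in users if u["id"] > after_id][:limit]
    # A full page may have more rows after it; a short page is the last one.
    next_cursor = encode_cursor(page[-1]["id"]) if len(page) == limit else None
    return page, next_cursor


users = [{"id": i} for i in range(1, 26)]
page, cursor = list_users(users, limit=10)
while cursor:
    page, cursor = list_users(users, cursor=cursor, limit=10)
```

Because each page is anchored to the last ID rather than a row count, inserts and deletes on earlier pages do not shift later results, and the database can use the index on `id` instead of scanning past skipped rows.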

Quick Reference: Technology Choices

When to Use What

Problem	Solution
Cache	Redis, Memcached
Search	Elasticsearch, Algolia
Async jobs	Kafka, RabbitMQ, SQS
Real-time	WebSockets, Server-Sent Events
Object storage	S3, GCS, Azure Blob
CDN	Cloudflare, Fastly, CloudFront
Load balancer	Nginx, HAProxy, ALB
Metrics	Prometheus, Datadog, InfluxDB
Logs	ELK Stack, Splunk, Loki
Relational DB	PostgreSQL, MySQL
Document DB	MongoDB, DynamoDB
Wide-column	Cassandra, HBase, ScyllaDB
Graph DB	Neo4j, Dgraph

Interview Phrases to Use

Clarifying Requirements

"Before I dive in, let me clarify the requirements..."
"What's our target scale, thousands or billions of users?"
"Are we optimizing for latency, throughput, or cost?"
"What's the read-to-write ratio?"

Making Design Decisions

"I'm choosing X over Y because..."
"The trade-off here is..."
"If requirements change, we could switch to..."
"For the MVP, I'd start with... then evolve to..."

Discussing Scale

"At our current scale, this works. At 10x, we'd need to..."
"The bottleneck would be... We can address it by..."
"For horizontal scaling, we need to make this stateless..."

Handling Unknowns

"I'm not certain about this, but my approach would be..."
"I'd want to validate this assumption with..."
"That's a great question, let me think through it..."

Final Pre-Interview Checklist

Day Before:

Review this cheat sheet
Rehearse explaining 2-3 systems out loud
Prepare questions to ask the interviewer
Set up your interview environment (quiet room, good internet)
Get a good night's sleep

Morning Of:

Eat a good breakfast
Light exercise or stretching
Quick review of framework and numbers
Arrive/log in 5 minutes early

During Interview:

Take a breath before answering
Clarify requirements first
Draw diagrams as you explain
Check in every 5-7 minutes
Discuss trade-offs proactively
If stuck, verbalize your thinking