
Scalability Fundamentals

The foundation of system design - understanding how systems grow from 1 to 1 million users

Foundation knowledge | 25 min read


Summary

Scalability is a system's ability to handle increased load by adding resources. There are two approaches: vertical scaling (bigger machines) and horizontal scaling (more machines). The key insight is that scalability is NOT the same as performance—a system can be fast but not scalable, or scalable but slow. Understanding this distinction, along with stateless design principles, is the foundation for all system design decisions.

Key Takeaways

Scalability ≠ Performance

Performance is how fast a single request completes. Scalability is how well the system handles increasing load. A single-threaded service might respond in 1ms (great performance) but fall over at 1000 QPS (poor scalability). Always clarify which problem you're solving.
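The 1ms-vs-1000-QPS point can be made concrete with a back-of-envelope model. This is a minimal sketch assuming a simple queueing bound (max throughput = workers / service time); the function name is illustrative, not from any library.

```python
# Why a fast service can still have poor scalability: latency sets a
# per-worker ceiling, and only more workers raise the aggregate ceiling.

def max_throughput(workers: int, service_time_s: float) -> float:
    """Upper bound on requests/second for `workers` parallel handlers."""
    return workers / service_time_s

# A single-threaded service with 1 ms responses: great latency...
single = max_throughput(workers=1, service_time_s=0.001)   # 1000 req/s ceiling

# ...but the only way past 1000 QPS is more workers, not lower latency.
pool = max_throughput(workers=16, service_time_s=0.001)    # 16000 req/s ceiling

print(single, pool)
```

Note the asymmetry: halving latency doubles the ceiling of one worker, but adding workers scales it without touching latency at all. That is the performance/scalability split in one formula.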

Horizontal Scales Further Than Vertical

Vertical scaling hits hard limits (biggest EC2 is 24TB RAM, 448 vCPUs). Horizontal scaling is theoretically unlimited. But horizontal introduces complexity: distributed state, network partitions, consensus. Choose based on your actual scale needs, not hypothetical future scale.

Stateless Services Scale Linearly

If a service holds no state between requests, you can add/remove instances freely. Load balancers distribute traffic evenly. This is why REST APIs scale better than WebSocket servers holding connection state.

State Must Live Somewhere

Pushing state out of application servers doesn't eliminate it—it moves to databases, caches, or message queues. These become your scaling bottlenecks. The art is choosing the right stateful system for your access patterns.
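The two takeaways above can be sketched together: the handler below is stateless, and the session state it needs lives in a shared external store. All names here (`SessionStore`, `handle_request`) are illustrative; the in-memory dict stands in for something like Redis.

```python
# Stateless request handling: the server keeps no per-user state between
# requests, so a load balancer may route each request to any instance.

class SessionStore:
    """External shared store; any app server can read any session."""
    def __init__(self):
        self._data = {}

    def get(self, session_id: str) -> dict:
        return self._data.get(session_id, {})

    def put(self, session_id: str, session: dict) -> None:
        self._data[session_id] = session

def handle_request(store: SessionStore, session_id: str, item: str) -> int:
    # Everything the handler needs comes from the request and the shared
    # store -- nothing is remembered locally between calls.
    session = store.get(session_id)
    cart = session.get("cart", [])
    cart.append(item)
    store.put(session_id, {"cart": cart})
    return len(cart)

store = SessionStore()
handle_request(store, "u1", "book")         # served by "server A"
count = handle_request(store, "u1", "pen")  # "server B" sees the same cart
print(count)  # 2
```

The trade-off from the takeaway is visible here: the app tier now scales freely, but `SessionStore` is the new shared dependency whose capacity and access pattern you must design for.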

Scaling is About Bottlenecks

Every system has a bottleneck: CPU, memory, disk I/O, network, or database connections. Scaling only helps if you scale the bottleneck. Adding app servers won't help if your database is saturated.
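A quick way to see this: the system's ceiling is the capacity of its slowest shared stage. The capacities below are illustrative numbers, not measurements.

```python
# System throughput is bounded by the weakest stage in the request path.

def system_throughput(capacities: dict[str, float]) -> tuple[str, float]:
    """Return the bottleneck stage and the throughput it imposes."""
    stage = min(capacities, key=capacities.get)
    return stage, capacities[stage]

before = {"load_balancer": 50_000, "app_servers": 8_000, "database": 3_000}
print(system_throughput(before))   # ('database', 3000.0-ish: DB is the cap)

# Doubling the app tier changes nothing: the database is still saturated.
after = dict(before, app_servers=16_000)
print(system_throughput(after))
```

Scaling effort spent anywhere except the `min()` stage is wasted, which is why measuring before scaling matters.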

Scale When You Need To, Not Before

Premature optimization for scale adds complexity without benefit. A well-written monolith on a single server handles more traffic than most startups will ever see. Scale in response to actual bottlenecks, measured with real data.

Deep Dive

Scalability is a system's ability to handle growing amounts of work by adding resources. But this definition hides crucial nuance.

Consider two systems:

- System A: Handles 100 requests/second. Adding a second server doubles capacity to 200 req/s.
- System B: Handles 100 requests/second. Adding a second server increases capacity to 120 req/s.


System A scales linearly—double the resources, double the capacity. System B has diminishing returns—shared state or coordination overhead limits scaling efficiency.

The goal isn't just to scale, but to scale efficiently. This is measured by scalability efficiency:

Efficiency = Actual Throughput / (Theoretical Throughput × Number of Nodes)

If 10 servers each capable of 1000 req/s together handle only 7000 req/s, efficiency is 70%. The missing 30% is scaling overhead.
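The worked figure above translates directly into code. A minimal sketch of the efficiency formula as defined here (the function name is ours):

```python
# Scaling efficiency, as defined above:
#   efficiency = actual throughput / (per-node throughput * nodes)

def scaling_efficiency(actual: float, per_node: float, nodes: int) -> float:
    return actual / (per_node * nodes)

# 10 servers, each capable of 1000 req/s, jointly serving 7000 req/s:
eff = scaling_efficiency(actual=7000, per_node=1000, nodes=10)
print(f"{eff:.0%}")  # the missing 30% is coordination overhead
```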

Trade-offs

| Aspect | Advantage | Disadvantage |
|---|---|---|
| Vertical vs Horizontal | Vertical is simpler: no distributed-systems complexity, no network partitions to handle | Vertical has hard limits (a biggest machine exists); horizontal scales indefinitely but adds significant complexity |
| Stateless Services | Scale linearly by adding servers; any server can handle any request; simple load balancing | State must live somewhere (the database/cache becomes the bottleneck); requires external session management |
| Early Scaling | Prepared for growth; architecture supports future scale; no emergency rewrites | Premature complexity, slower development, higher costs, solving problems you may never have |
| Distributed Database | No single point of failure; theoretically unlimited scale; geographic distribution possible | Consistency challenges, operational complexity, more failure modes, harder debugging |
