System Design Fundamentals
The foundation of system design - understanding how systems grow from 1 to 1 million users
Scalability is a system's ability to handle increased load by adding resources. There are two approaches: vertical scaling (bigger machines) and horizontal scaling (more machines). The key insight is that scalability is NOT the same as performance—a system can be fast but not scalable, or scalable but slow. Understanding this distinction, along with stateless design principles, is the foundation for all system design decisions.
Performance is how fast a single request completes. Scalability is how well the system handles increasing load. A single-threaded service might respond in 1ms (great performance) but fall over at 1000 QPS (poor scalability). Always clarify which problem you're solving.
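The latency/throughput arithmetic above can be sketched in a few lines. This is a rough capacity model, not a benchmark; `max_qps` is an illustrative helper, and real servers lose some headroom to overhead:

```python
def max_qps(latency_seconds: float, workers: int = 1) -> float:
    """Upper bound on throughput: each worker finishes one request
    per `latency_seconds`, so capacity is workers / latency."""
    return workers / latency_seconds

# A single-threaded handler with 1 ms latency caps out around 1000 QPS,
# no matter how "fast" each individual request feels.
print(max_qps(0.001))       # -> 1000.0
print(max_qps(0.001, 8))    # 8 workers -> 8000.0
print(max_qps(0.0001))      # 10x faster handler -> 10000.0
```

Note that shrinking latency (performance) and adding workers (scalability) both raise the ceiling, but they are independent levers.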
Vertical scaling hits hard limits (biggest EC2 is 24TB RAM, 448 vCPUs). Horizontal scaling is theoretically unlimited. But horizontal introduces complexity: distributed state, network partitions, consensus. Choose based on your actual scale needs, not hypothetical future scale.
If a service holds no state between requests, you can add/remove instances freely. Load balancers distribute traffic evenly. This is why REST APIs scale better than WebSocket servers holding connection state.
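A minimal sketch of the stateful-vs-stateless distinction, with a plain dict standing in for an external store like Redis (all names here are illustrative):

```python
# Stateful: the counter lives inside one process. If the load balancer
# routes the same user to a different instance, the count is lost.
class StatefulCounter:
    def __init__(self):
        self.counts = {}  # in-process state, invisible to other instances

    def hit(self, user_id: str) -> int:
        self.counts[user_id] = self.counts.get(user_id, 0) + 1
        return self.counts[user_id]

# Stateless: the handler keeps nothing between requests; state lives in
# a shared external store, so any instance can serve any request.
shared_store: dict = {}  # stand-in for Redis / a database

def stateless_hit(store: dict, user_id: str) -> int:
    store[user_id] = store.get(user_id, 0) + 1
    return store[user_id]

# Two different "instances" agree because the state is external.
stateless_hit(shared_store, "alice")         # served by instance 1
print(stateless_hit(shared_store, "alice"))  # served by instance 2 -> 2
```

The trade-off in the table below still applies: the state has to live somewhere, and that external store can itself become the bottleneck.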
Scalability is a system's ability to handle growing amounts of work by adding resources. But this definition hides crucial nuance.
Consider two systems:

- System A: Handles 100 requests/second. Adding a second server doubles capacity to 200 req/s.
- System B: Handles 100 requests/second. Adding a second server increases capacity to only 120 req/s.
System A scales linearly—double the resources, double the capacity. System B has diminishing returns—shared state or coordination overhead limits scaling efficiency.
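One common way to model System B's behavior is Amdahl's law: if a fraction of each request's work is serialized (locks, a shared database write, coordination), adding servers only speeds up the parallel portion. A sketch, assuming the serialized fraction is the whole story:

```python
def amdahl_speedup(n: int, serial: float) -> float:
    """Speedup from n nodes when `serial` fraction of the work
    cannot be parallelized (Amdahl's law)."""
    return 1 / (serial + (1 - serial) / n)

print(amdahl_speedup(2, 0.0))    # fully parallel: 2.0x, like System A
print(amdahl_speedup(2, 2 / 3))  # 2/3 serialized: 1.2x, like System B
```

Even a modest serialized fraction dominates at high node counts, which is why the efficiency metric below rarely stays near 100% as clusters grow.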
The goal isn't just to scale, but to scale efficiently. This is measured by scalability efficiency:
Efficiency = Actual Throughput / (Theoretical Throughput × Number of Nodes)

If 10 servers each capable of 1000 req/s together handle only 7000 req/s, efficiency is 70%. The missing 30% is scaling overhead.
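The efficiency formula translates directly to code; the function name is illustrative:

```python
def scaling_efficiency(actual_throughput: float,
                       per_node_throughput: float,
                       nodes: int) -> float:
    """Fraction of theoretical capacity the cluster actually delivers."""
    return actual_throughput / (per_node_throughput * nodes)

# The worked example: 10 servers, each capable of 1000 req/s,
# together handling only 7000 req/s.
print(scaling_efficiency(7000, 1000, 10))  # -> 0.7, i.e. 70% efficient
```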
Every system has a bottleneck: CPU, memory, disk I/O, network, or database connections. Scaling only helps if you scale the bottleneck. Adding app servers won't help if your database is saturated.
Premature optimization for scale adds complexity without benefit. A well-written monolith on a single server handles more traffic than most startups will ever see. Scale in response to actual bottlenecks, measured with real data.
| Trade-off | Advantages | Disadvantages |
|---|---|---|
| Vertical vs Horizontal | Vertical is simpler, no distributed systems complexity, no network partitions to handle | Vertical has hard limits (biggest machine exists), horizontal scales indefinitely but adds significant complexity |
| Stateless Services | Scale linearly by adding servers, any server can handle any request, simple load balancing | State must live somewhere (database/cache becomes bottleneck), requires external session management |
| Early Scaling | Prepared for growth, architecture supports future scale, no emergency rewrites | Premature complexity, slower development, higher costs, solving problems you may never have |
| Distributed Database | No single point of failure, theoretically unlimited scale, geographic distribution possible | Consistency challenges, operational complexity, more failure modes, harder debugging |