System Design Fundamentals
The foundation of system design - understanding how systems grow from 1 to 1 million users
Scalability is a system's ability to handle increased load by adding resources. There are two approaches: vertical scaling (bigger machines) and horizontal scaling (more machines). The key insight is that scalability is NOT the same as performance—a system can be fast but not scalable, or scalable but slow. Understanding this distinction, along with stateless design principles, is the foundation for all system design decisions.
Performance is how fast a single request completes. Scalability is how well the system handles increasing load. A single-threaded service might respond in 1ms (great performance) but fall over at 1000 QPS (poor scalability). Always clarify which problem you're solving.
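The 1ms versus 1000 QPS figures above can be checked with a back-of-the-envelope calculation. The sketch below assumes each worker handles exactly one request at a time and that service time is constant; `max_throughput_qps` is a hypothetical helper name, not part of any library.

```python
def max_throughput_qps(service_time_ms: float, workers: int = 1) -> float:
    """Upper bound on requests/second when each worker serves one request at a time.

    Assumes a constant service time and no queueing or coordination overhead,
    so real systems will saturate somewhat below this number.
    """
    if service_time_ms <= 0:
        raise ValueError("service_time_ms must be positive")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return workers * 1000.0 / service_time_ms


# A single-threaded service that answers in 1 ms cannot exceed 1000 QPS,
# however good its per-request latency looks.
single_threaded_limit = max_throughput_qps(service_time_ms=1.0)

# Four such workers raise the ceiling without changing per-request latency.
four_worker_limit = max_throughput_qps(service_time_ms=1.0, workers=4)

print(single_threaded_limit, four_worker_limit)
```

Latency stays at 1ms in both cases; only the throughput ceiling moves, which is the distinction between performance and scalability.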
Vertical scaling hits hard limits: even very large cloud machines top out (for example, EC2 high-memory instances with about 24TB of RAM and 448 vCPUs). Horizontal scaling has no comparable ceiling in principle, but it introduces complexity: distributed state, network partitions, consensus. Choose based on your actual scale needs, not hypothetical future scale.
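If a service holds no state between requests, any instance can serve any request, so you can add or remove instances freely and a load balancer can spread traffic evenly across them. This is why stateless REST APIs are typically easier to scale horizontally than WebSocket servers, which hold per-connection state.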
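As a minimal sketch of why statelessness makes instances interchangeable, the toy round-robin balancer below routes requests to whichever instance is next in line. The names `handle`, `RoundRobinBalancer`, and the `app-N` instance IDs are hypothetical and chosen only for illustration; a real load balancer also handles health checks, retries, and concurrency.

```python
from collections import Counter


def handle(instance_id: str, request: dict) -> dict:
    """Stateless handler: the response depends only on the request contents."""
    return {"served_by": instance_id, "echo": request["payload"].upper()}


class RoundRobinBalancer:
    """Toy balancer that cycles through the current list of instances."""

    def __init__(self, instances):
        self._instances = list(instances)
        self._cursor = 0

    def add(self, instance_id: str) -> None:
        self._instances.append(instance_id)

    def remove(self, instance_id: str) -> None:
        self._instances.remove(instance_id)

    def route(self, request: dict) -> dict:
        if not self._instances:
            raise RuntimeError("no instances available")
        # Any instance will do, because none of them keeps per-client state.
        instance = self._instances[self._cursor % len(self._instances)]
        self._cursor += 1
        return handle(instance, request)


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
responses = [balancer.route({"payload": "ping"}) for _ in range(6)]
counts = Counter(r["served_by"] for r in responses)

# Scaling out is just adding to the pool; clients see the same answers.
balancer.add("app-4")
after_scale_out = balancer.route({"payload": "ping"})
```

If `handle` kept per-client session data in memory, removing an instance would lose that data, and requests would have to be pinned to specific instances.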
Scalability is a system's ability to handle growing amounts of work by adding resources. But this definition hides crucial nuance.
Consider two systems:
- System A: Handles 100 requests/second. Adding a second server doubles capacity to 200 req/s.
- System B: Handles 100 requests/second. Adding a second server increases capacity to 120 req/s.
System A scales linearly—double the resources, double the capacity. System B has diminishing returns—shared state or coordination overhead limits scaling efficiency.
The goal isn't just to scale, but to scale efficiently. This is measured by scalability efficiency:
Efficiency = Actual Throughput / (Per-Node Throughput × Number of Nodes)

If 10 servers, each capable of 1000 req/s, together handle only 7000 req/s, efficiency is 7000 / (1000 × 10) = 70%. The missing 30% is scaling overhead.
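The efficiency formula can be applied directly to the numbers in this section. This is a small sketch; the function name `scaling_efficiency` and its parameter names are illustrative, and the theoretical throughput in the formula is read as the throughput of a single node.

```python
def scaling_efficiency(actual_throughput: float,
                       per_node_throughput: float,
                       nodes: int) -> float:
    """Fraction of ideal (linear) capacity that the cluster actually delivers."""
    if per_node_throughput <= 0:
        raise ValueError("per_node_throughput must be positive")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return actual_throughput / (per_node_throughput * nodes)


# System A: two servers deliver 200 req/s, i.e. linear scaling.
system_a = scaling_efficiency(actual_throughput=200, per_node_throughput=100, nodes=2)

# System B: two servers deliver only 120 req/s, i.e. diminishing returns.
system_b = scaling_efficiency(actual_throughput=120, per_node_throughput=100, nodes=2)

# Ten servers rated at 1000 req/s each that together handle 7000 req/s.
cluster = scaling_efficiency(actual_throughput=7000, per_node_throughput=1000, nodes=10)

print(f"A: {system_a:.0%}  B: {system_b:.0%}  cluster: {cluster:.0%}")
```

System A comes out at 100% and System B at 60%, which puts a number on the difference between linear scaling and diminishing returns.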