Patterns
35 items
Distributing traffic across multiple servers
Load balancing distributes incoming traffic across multiple servers so that no single server becomes overwhelmed. Load balancers use algorithms like Round Robin (simple rotation), Least Connections (route to the least busy server), Weighted (proportional to capacity), and IP Hash (sticky sessions). Modern load balancers also perform health checks, SSL termination, and content-based routing. The pattern is fundamental to horizontal scaling - without load balancing, you cannot utilize multiple servers effectively. Virtually every production system uses some form of it, from simple DNS round-robin to sophisticated L7 application load balancers.
The goal is to spread load evenly so no server is overwhelmed while others sit idle. Uneven distribution wastes capacity and creates bottlenecks.
The load balancer must detect unhealthy servers and stop sending them traffic. Without health checks, requests go to dead servers and fail.
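A common health-check refinement is to require several consecutive failures before marking a server down (and several successes before marking it up again), so one dropped probe does not eject a healthy server. A minimal sketch of that state machine; the threshold values and method names here are illustrative, not from any specific load balancer:

```python
class HealthChecker:
    """Track probe results per server; flip state only after a run of
    consecutive failures or successes. Thresholds are assumed defaults."""

    def __init__(self, servers, fail_threshold=3, rise_threshold=2):
        self.fail_threshold = fail_threshold
        self.rise_threshold = rise_threshold
        self.state = {s: {"healthy": True, "fails": 0, "rises": 0}
                      for s in servers}

    def record_probe(self, server, ok):
        st = self.state[server]
        if ok:
            st["fails"] = 0
            st["rises"] += 1
            if not st["healthy"] and st["rises"] >= self.rise_threshold:
                st["healthy"] = True  # recovered: resume sending traffic
        else:
            st["rises"] = 0
            st["fails"] += 1
            if st["healthy"] and st["fails"] >= self.fail_threshold:
                st["healthy"] = False  # ejected from the rotation

    def healthy_servers(self):
        return [s for s, st in self.state.items() if st["healthy"]]
```

In a real balancer the probes themselves would be periodic HTTP or TCP checks; here only the bookkeeping is shown.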
L4 (transport) routes based on IP/port - fast, simple. L7 (application) routes based on HTTP content - flexible, can route by URL, headers, cookies.
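The L7 case can be made concrete with a small routing-table sketch: map URL path prefixes to backend pools, the kind of rule an application load balancer evaluates per request. The paths and pool names below are hypothetical:

```python
# Hypothetical L7 routing table: path prefix -> backend pool.
# Evaluated top to bottom; "/" acts as the catch-all.
ROUTES = [
    ("/api/", ["api-1:8080", "api-2:8080"]),
    ("/static/", ["cdn-1:8080"]),
    ("/", ["web-1:8080", "web-2:8080"]),
]

def route(path):
    """Return the backend pool whose prefix matches the request path."""
    for prefix, pool in ROUTES:
        if path.startswith(prefix):
            return pool
    return None
```

An L4 balancer never sees the path at all - it forwards based on IP and port alone, which is why it is faster but cannot make this kind of decision.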
Round Robin: Rotate through servers sequentially - Simple, fair distribution - Does not account for server capacity or current load
Least Connections: Route to server with fewest active connections - Better for varying request durations - Requires connection tracking
Weighted: Route proportionally to server capacity - Server with weight 2 gets 2x traffic of weight 1 - Good for mixed-capacity fleet
IP Hash: Hash client IP to determine server - Same client always goes to same server - Provides sticky sessions without cookies
Least Response Time: Route to fastest responding server - Optimizes for latency - Requires latency monitoring
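The first four algorithms above can each be sketched in a few lines. The server names, weights, and connection counts are made-up inputs for illustration (Least Response Time is omitted since it needs live latency data):

```python
import hashlib
import itertools

SERVERS = ["s1", "s2", "s3"]  # hypothetical fleet

# Round Robin: rotate through servers sequentially.
_rr = itertools.cycle(SERVERS)
def round_robin():
    return next(_rr)

# Least Connections: pick the server with the fewest active connections.
def least_connections(active):
    # `active` maps server -> current connection count
    return min(active, key=active.get)

# Weighted: a server with weight 2 appears twice per rotation,
# so it receives 2x the traffic of a weight-1 server.
WEIGHTS = {"s1": 2, "s2": 1, "s3": 1}
_weighted = itertools.cycle(
    [s for s, w in WEIGHTS.items() for _ in range(w)])
def weighted():
    return next(_weighted)

# IP Hash: hash the client IP so the same client always lands on
# the same server (sticky sessions without cookies).
def ip_hash(client_ip):
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return SERVERS[int(digest, 16) % len(SERVERS)]
```

Note the trade-offs the list describes show up directly in the code: round robin and weighted keep no per-server state, least connections needs the `active` counts tracked elsewhere, and IP hash redistributes clients whenever the server list changes.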