System Design Pattern
Reliability | circuit-breaker | fault-tolerance | cascade-failure | fail-fast | recovery | intermediate

Circuit Breaker Pattern

Preventing cascade failures in distributed systems

Used in: Microservices, API Gateways, Service Mesh | 20 min read

Summary

Circuit Breaker is a resilience pattern that prevents cascade failures by automatically stopping requests to a failing service. Named after the electrical circuit breaker, it is a state machine with three states: Closed (normal operation), Open (failing fast without making requests), and Half-Open (testing whether the service has recovered). When the failure rate exceeds a threshold, the circuit "trips" open, which gives the downstream service time to recover and protects the caller from waiting on calls that are likely to fail. The pattern is widely used in production systems; Netflix's Hystrix library popularized it.

Key Takeaways

Three States Prevent Cascade Failures

In the Closed state, requests pass through normally. In the Open state, requests fail fast without calling the service. In the Half-Open state, a limited number of requests are let through to test recovery. This state machine prevents a failing service from bringing down its callers.

Automatic Recovery Testing

After the circuit has been open for a configured wait period (e.g., 30-60s), it automatically enters the Half-Open state to test whether the service has recovered. If the test requests succeed, the circuit closes; if they fail, it opens again and the wait period restarts.
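The three states and the recovery timer can be sketched in a few lines. This is a minimal, single-threaded illustration and not the API of any particular library: the names `CircuitBreaker`, `CircuitOpenError`, `failure_threshold`, and `open_seconds` are assumptions made for this example, and it trips on consecutive failures rather than on a failure rate to keep the code short.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    # Hypothetical sketch: not thread-safe, one trial call in Half-Open.
    def __init__(self, failure_threshold=3, open_seconds=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before tripping
        self.open_seconds = open_seconds            # how long to stay Open before probing
        self.clock = clock                          # injectable so tests can control time
        self.state = State.CLOSED
        self.failures = 0
        self.opened_at = 0.0

    def call(self, func, *args, **kwargs):
        if self.state is State.OPEN:
            if self.clock() - self.opened_at < self.open_seconds:
                # Fail fast: the protected function is never invoked.
                raise CircuitOpenError("circuit is open; failing fast")
            # Wait period elapsed: let this call through as a trial.
            self.state = State.HALF_OPEN
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_success(self):
        self.failures = 0
        self.state = State.CLOSED

    def _on_failure(self):
        self.failures += 1
        # A failed trial call, or too many failures while Closed, opens the circuit.
        if self.state is State.HALF_OPEN or self.failures >= self.failure_threshold:
            self.state = State.OPEN
            self.opened_at = self.clock()
```

A caller would wrap each remote call, for example `breaker.call(fetch_profile, user_id)` where `fetch_profile` is a hypothetical client function, and treat `CircuitOpenError` as the signal to degrade gracefully. Production libraries typically add thread safety, a configurable number of trial calls, and rate-based thresholds on top of this shape.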

Fail Fast Protects Resources

When the circuit is open, requests fail immediately instead of tying up threads or connections while they wait for a timeout. This prevents thread pool exhaustion and lets the system keep capacity for other operations.

Sliding Window Metrics

Modern circuit breakers use sliding windows to calculate failure rates, either count-based (the last N calls) or time-based (the last N seconds). This makes them responsive to recent failures while tolerating the occasional isolated error.
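A count-based window can be illustrated with a fixed-size deque. The sketch below is an assumption-laden example rather than any library's implementation: `CountBasedWindow`, `size`, `min_calls`, and the 50% default threshold are all hypothetical names and values chosen for illustration.

```python
from collections import deque


class CountBasedWindow:
    """Tracks the outcomes of the last `size` calls (hypothetical sketch)."""

    def __init__(self, size=10, min_calls=5):
        # deque(maxlen=...) silently drops the oldest outcome when full.
        self.outcomes = deque(maxlen=size)
        # Do not judge the service until enough calls have been observed.
        self.min_calls = min_calls

    def record(self, failed):
        self.outcomes.append(bool(failed))

    def failure_rate(self):
        if len(self.outcomes) < self.min_calls:
            return 0.0
        # True counts as 1, so the sum is the number of failures in the window.
        return sum(self.outcomes) / len(self.outcomes)

    def should_trip(self, threshold=0.5):
        return self.failure_rate() >= threshold
```

The `min_calls` guard reflects a common design choice: without it, a single failed call on a cold start would register as a 100% failure rate and trip the circuit immediately.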

Fallback Strategies

When the circuit opens, return cached data, default values, or degraded functionality rather than propagating errors. This provides graceful degradation instead of complete failure.
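One way to picture a fallback is a wrapper that serves the last known good value when the breaker rejects a call. This is only a sketch: `fetch_with_fallback`, `CircuitOpenError`, and the dictionary cache are hypothetical names for this example, and a real system would also need cache expiry and a decision about how stale a value may be.

```python
class CircuitOpenError(Exception):
    """Stand-in for the error a breaker raises when it rejects a call."""


def fetch_with_fallback(fetch, cache, key, default=None):
    """Call fetch(key); on a rejected call, serve cached or default data."""
    try:
        value = fetch(key)
    except CircuitOpenError:
        # Degrade gracefully: last known good value, else the default.
        return cache.get(key, default)
    # Remember the fresh value so it can back a later fallback.
    cache[key] = value
    return value
```

Only the breaker's rejection is caught here, so genuine errors from the service still propagate to the caller and to the breaker's failure accounting.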

Observability is Critical

Circuit state changes are important signals. Emit metrics and alerts when circuits open/close. Track failure rates, response times, and circuit state per service endpoint.
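As a rough illustration, a breaker can invoke a callback on every state transition, and that callback can count transitions and log them. The function name `on_state_change`, the logger name, and the in-memory `Counter` are assumptions for this sketch; in practice the counter would usually be replaced by whatever metrics client the system already uses.

```python
import logging
from collections import Counter

logger = logging.getLogger("circuit_breaker")

# (endpoint, new_state) -> number of transitions into that state.
transition_counts = Counter()


def on_state_change(endpoint, old_state, new_state):
    """Record and log one circuit transition for a service endpoint."""
    transition_counts[(endpoint, new_state)] += 1
    # Opening is the actionable event, so log it at a higher level.
    level = logging.WARNING if new_state == "open" else logging.INFO
    logger.log(level, "circuit %s: %s -> %s", endpoint, old_state, new_state)
```

Keying the counter by endpoint keeps one noisy dependency from hiding the state of the others.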

Pattern Details

In microservices architectures, services depend on each other. When one service fails or becomes slow, it can bring down every service that depends on it. This is a cascade failure.

The cascade failure scenario:

  1. Service A calls Service B, which calls Service C
  2. Service C starts failing (for example, database overload or a memory leak)
  3. Service B's threads block while waiting for responses from Service C
  4. Service B's thread pool is exhausted, and Service B stops responding
  5. Service A exhausts its own thread pool waiting for Service B
  6. The entire system collapses like dominoes
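The speed of steps 3 and 4 can be estimated with Little's law (requests in flight = arrival rate × time per request). The sketch below assumes a synchronous, thread-per-request service; the function names and the figures in the usage note are hypothetical and chosen only to show the arithmetic.

```python
def threads_in_use(requests_per_second, latency_seconds):
    """Little's law: steady-state number of requests in flight."""
    return requests_per_second * latency_seconds


def seconds_until_exhausted(pool_size, requests_per_second, latency_seconds):
    """Time for a pool to fill when every call blocks for latency_seconds.

    Returns None when the pool is large enough to never fill.
    """
    if threads_in_use(requests_per_second, latency_seconds) <= pool_size:
        return None
    # Each arriving request takes one thread and none are released yet,
    # so the pool fills at the arrival rate.
    return pool_size / requests_per_second
```

With a hypothetical pool of 200 threads at 100 requests per second, a healthy 50 ms dependency keeps about 5 threads busy. If the dependency hangs until a 30-second timeout, demand rises to 3,000 threads and `seconds_until_exhausted(200, 100, 30)` shows the pool filling in 2 seconds.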

Circuit Breaker State Machine

Netflix Origin

Netflix built Hystrix after cascade failures brought down streaming. The library influenced later circuit breaker implementations such as Resilience4j and Polly, as well as the circuit breaker support in Spring Cloud. Hystrix itself is now in maintenance mode, and its maintainers point new projects to Resilience4j.

Trade-offs

Aspect | Advantage | Disadvantage
