Patterns
Using queues to smooth traffic spikes
Queue-based load leveling uses message queues to decouple request producers from processors, smoothing traffic spikes and preventing system overload. Instead of synchronously processing requests during peak load, requests are queued and processed at a sustainable rate. This pattern transforms bursty traffic into steady throughput, prevents cascading failures during spikes, and enables independent scaling of producers and consumers. The queue acts as a shock absorber that protects backend services from traffic tsunamis.
During sudden load spikes (flash sales, viral posts), queues buffer requests instead of overwhelming backends. The system processes at a sustainable rate while users get an immediate acknowledgment.
Producers (API servers) and consumers (workers) scale independently: add API servers for more ingestion capacity, add workers for faster processing.
Users get an immediate response with a job ID instead of waiting for slow operations. They can check status later or receive a notification when the work completes.
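The flow above can be sketched with Python's standard library. This is a hedged, in-process illustration: the queue, the `submit` helper, and the `job_status` dict are all hypothetical names for this sketch, and a real deployment would use a durable broker (SQS, RabbitMQ, Kafka) rather than `queue.Queue`.

```python
import queue
import threading
import time
import uuid

# In-process stand-ins; a production system would use a durable
# message broker and a persistent status store instead.
job_queue = queue.Queue()
job_status = {}

def submit(payload):
    """Producer (API server): enqueue and return a job ID immediately."""
    job_id = str(uuid.uuid4())
    job_status[job_id] = "queued"
    job_queue.put((job_id, payload))
    return job_id  # caller is acknowledged without waiting for processing

def worker():
    """Consumer: drains the queue at its own sustainable pace."""
    while True:
        job_id, payload = job_queue.get()
        job_status[job_id] = "processing"
        time.sleep(0.01)  # simulate slow backend work
        job_status[job_id] = "done"
        job_queue.task_done()

threading.Thread(target=worker, daemon=True).start()

# A burst of 20 requests: every caller gets an ID back instantly,
# while the single worker absorbs the spike at a steady rate.
ids = [submit({"n": i}) for i in range(20)]
job_queue.join()  # block until the backlog has drained
print(all(job_status[i] == "done" for i in ids))
```

Note that the burst is accepted in full (`submit` never blocks), while throughput is bounded by the worker count; scaling consumers independently of producers is exactly the knob described above.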
Why synchronous processing fails:
Provisioning for peak load wastes money 95% of the time; provisioning for average load fails during spikes.