Patterns
Controlling request rates to protect resources
Rate limiting controls how many requests a client can make in a given time window, protecting systems from abuse, ensuring fair usage, and preventing cascading failures. Common algorithms include Token Bucket (allows bursts), Leaky Bucket (smooth rate), Fixed Window (simple), and Sliding Window (accurate). Rate limits can be applied per user, API key, IP address, or globally. Nearly every public API enforces rate limits: Twitter caps posting at 300 tweets per 3 hours, and GitHub allows 5,000 requests per hour. Rate limiting is essential both for protecting your system and for providing fair access.
Rate limit by user, API key, IP address, endpoint, or a combination of these. Apply different limits to different tiers (free vs. paid). Protect both per-user and globally, as sketched below.
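A minimal sketch of composing rate-limit keys and per-tier limits. The key format, tier names, and numeric limits here are illustrative assumptions, not values from the source.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Limit:
    requests: int        # max requests allowed
    window_seconds: int  # per this window

# Hypothetical tier table: free vs. paid users get different per-user limits,
# and a separate global ceiling protects the system as a whole.
TIER_LIMITS = {
    "free": Limit(requests=100, window_seconds=3600),
    "paid": Limit(requests=5000, window_seconds=3600),
}
GLOBAL_LIMIT = Limit(requests=100_000, window_seconds=60)

def rate_limit_key(user_id: str, api_key: str, ip: str, endpoint: str) -> str:
    """Combine identifying attributes into a single counter key.
    Narrower keys (user + endpoint) give finer control; broader keys
    (IP only) help catch unauthenticated abuse."""
    return f"user:{user_id}|key:{api_key}|ip:{ip}|ep:{endpoint}"
```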
The bucket fills with tokens at a steady rate, and each request consumes a token. This allows short bursts up to the bucket size. A good fit for API rate limiting.
Fixed windows have a boundary issue: a client can double the effective rate by bursting at the end of one window and the start of the next (e.g., with a 100-per-minute limit, 100 requests at 0:59 and 100 more at 1:01 means 200 requests in about two seconds). A sliding window smooths this out but requires more state.
Token Bucket:
- Bucket holds N tokens (burst capacity)
- Tokens added at rate R per second
- Each request takes 1 token
- No token = rejected
- Allows bursts up to N
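A minimal sketch of the token bucket above, assuming a single-process, in-memory limiter (a production version would typically share state in a store like Redis); the class and method names are illustrative.

```python
import time

class TokenBucket:
    """Token bucket limiter: holds up to `capacity` tokens, refilled at
    `refill_rate` tokens per second. Allows bursts up to capacity."""

    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity        # N: burst capacity
        self.refill_rate = refill_rate  # R: tokens added per second
        self.tokens = float(capacity)   # start full
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill for the elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1  # each request consumes one token
            return True
        return False          # no token available -> reject
```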
Leaky Bucket:
- Requests enter the bucket
- Processed at a fixed rate
- Overflow rejected
- Smooth output rate
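A sketch of the leaky bucket as a meter: instead of holding an actual queue of requests, it tracks the bucket's fill level and drains it at a fixed rate, which is a common simplification and an assumption here rather than something specified in the source.

```python
import time

class LeakyBucket:
    """Leaky bucket limiter (meter variant): requests fill the bucket, which
    drains at a fixed rate; requests that would overflow are rejected."""

    def __init__(self, capacity: int, leak_rate: float):
        self.capacity = capacity    # max outstanding requests the bucket holds
        self.leak_rate = leak_rate  # requests drained per second (fixed rate)
        self.level = 0.0
        self.last_leak = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Drain the bucket at the fixed leak rate.
        self.level = max(0.0, self.level - (now - self.last_leak) * self.leak_rate)
        self.last_leak = now
        if self.level + 1 <= self.capacity:
            self.level += 1
            return True
        return False  # bucket would overflow -> reject
```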
Fixed Window:
- Count requests in fixed intervals (e.g., per minute)
- Reset the count at each interval boundary
- Simple, but has the boundary spike issue
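A minimal in-memory sketch of a fixed window counter, under the same single-process assumption as above.

```python
import time

class FixedWindowCounter:
    """Fixed window limiter: count requests per interval, reset at the boundary."""

    def __init__(self, limit: int, window_seconds: int):
        self.limit = limit
        self.window = window_seconds
        self.window_id = -1  # which interval the current count belongs to
        self.count = 0

    def allow(self) -> bool:
        current_window = int(time.time() // self.window)
        if current_window != self.window_id:
            # New interval: reset the counter.
            self.window_id = current_window
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False
```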
Sliding Window:
- Weighted count across the current and previous window
- Smoother than fixed window
- More accurate, but more state
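A sketch of the weighted sliding window counter: the rolling count is estimated as the previous window's count scaled by how much of it still overlaps the sliding window, plus the current window's count. Again an in-memory, single-process assumption with illustrative names.

```python
import time

class SlidingWindowCounter:
    """Sliding window limiter (weighted approximation over two fixed windows)."""

    def __init__(self, limit: int, window_seconds: int):
        self.limit = limit
        self.window = window_seconds
        self.window_id = -1
        self.current_count = 0
        self.previous_count = 0

    def allow(self) -> bool:
        now = time.time()
        window = int(now // self.window)
        if window != self.window_id:
            # Roll over: the current window becomes the previous one
            # (or both reset if more than one full window has passed).
            self.previous_count = self.current_count if window == self.window_id + 1 else 0
            self.current_count = 0
            self.window_id = window
        # Fraction of the previous window still covered by the sliding window.
        elapsed_fraction = (now % self.window) / self.window
        weighted = self.previous_count * (1 - elapsed_fraction) + self.current_count
        if weighted < self.limit:
            self.current_count += 1
            return True
        return False
```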