Design Walkthrough
Problem Statement
The Question: Design a price alert system that handles millions of user-defined alerts and notifies users when prices hit their targets.
Use Cases:
- E-commerce: "Alert me when iPhone drops below $800"
- Stock trading: "Alert me when AAPL goes above $200"
- Travel: "Alert me when flight to NYC drops below $300"
- Crypto: "Alert me when BTC drops 10% in 24 hours"
This is a common interview question because it combines:
- Stream processing (continuous price updates)
- Efficient data structures (alert matching)
- Notification systems (delivery at scale)
- Database design (alert storage and indexing)
What to say first
Before I design, let me clarify the requirements. I want to understand the scale of alerts, frequency of price updates, and latency expectations for notifications.
Hidden requirements interviewers test:
- Can you design efficient matching algorithms?
- Do you understand notification delivery challenges?
- Can you handle the asymmetry (many alerts, frequent updates)?
- Do you consider edge cases (price gaps, duplicate alerts)?
Clarifying Questions
Ask these questions to shape your architecture. Each answer has significant design implications.
Question 1: Scale
How many active alerts and how many price updates per second?
- Why this matters: Determines if simple scanning works or if you need sophisticated indexing.
- Typical answer: 10M active alerts, 100K price updates/sec
- Architecture impact: Need an inverted index; cannot scan all alerts per update
Question 2: Alert Types
What types of conditions? Simple threshold (price < X) or complex (price drops 10% in 24h)?
- Why this matters: Complex conditions require storing historical data and more sophisticated matching.
- Typical answer: Start with simple thresholds, mention complex conditions as an extension
- Architecture impact: Simple thresholds enable range-based indexing
Question 3: Notification Latency
How quickly must users be notified after price crosses threshold?
- Why this matters: Real-time (seconds) vs near-real-time (minutes) changes the architecture significantly.
- Typical answer: Within 1 minute for e-commerce, within seconds for stocks
- Architecture impact: Streaming architecture vs batch processing
Question 4: Alert Lifecycle
One-time alerts (delete after trigger) or recurring (alert every time price crosses)?
- Why this matters: Affects state management and prevents notification spam.
- Typical answer: One-time by default, with a cooldown option for recurring
- Architecture impact: Need to track triggered state and implement cooldowns
Stating assumptions
I will assume: 10M active alerts, 100K price updates/sec, simple threshold conditions (price above/below X), notification within 1 minute, one-time alerts that auto-delete after triggering.
The Hard Part
Say this out loud
The hard part here is efficiently matching millions of alerts against a continuous stream of price updates without scanning all alerts for each update.
Why this is genuinely hard:
The Naive Approach Fails:
- 10M alerts, 100K price updates/sec
- If we check all alerts for each update: 10M x 100K = 1 trillion comparisons/sec
- This is computationally infeasible
The Insight:
- Most price updates affect very few alerts
- If AAPL price changes, only AAPL alerts need checking
- Within AAPL alerts, only those near the current price might trigger
This leads to two-level filtering, sketched below:
1. Partition by item: Only check alerts for the item whose price changed
2. Index by threshold: Only check alerts whose threshold is near the new price
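A minimal in-memory sketch of the two-level idea, in plain Python with `bisect` (illustrative only; the production design below keeps the same structure in Redis):

```python
import bisect
from collections import defaultdict

class TwoLevelIndex:
    """Level 1: partition alerts by item. Level 2: sort by threshold."""

    def __init__(self):
        # item_id -> sorted list of (threshold, alert_id) for "below" alerts
        self.below = defaultdict(list)

    def add_below_alert(self, item_id: str, threshold: float, alert_id: str):
        bisect.insort(self.below[item_id], (threshold, alert_id))

    def match(self, item_id: str, new_price: float) -> list:
        # Level 1: only this item's alerts are considered at all.
        entries = self.below.get(item_id, [])
        # Level 2: binary search; every threshold >= new_price has been crossed.
        start = bisect.bisect_left(entries, (new_price, ""))
        return [alert_id for _, alert_id in entries[start:]]
```

A price update for one item now costs O(log n + k) over that item's n alerts with k matches, instead of touching all 10M alerts.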
Alert Matching Problem
Common mistake
Candidates often propose checking all alerts on every price update. Always calculate the math: 10M alerts x 100K updates = impossible. You need indexing.
Scale and Access Patterns
Let me quantify the scale and understand access patterns.
| Dimension | Value | Impact |
|---|---|---|
| Active alerts | 10 million | Need efficient storage and indexing |
| Price updates/sec | 100,000 | Streaming architecture required |
What to say
Key insight: alerts are partitioned by item. With 1M items and 10M alerts, average is 10 alerts per item. Most items have 0-5 alerts, some popular items have thousands.
Access Pattern Analysis:
- Write (create alert): User creates an alert for item X at threshold Y. Frequency: ~100K new alerts/day. Must update the index structure.
- Write (price update): Item X's price changes to Y. Frequency: 100K/sec. Must find matching alerts efficiently.
- Read (user views alerts): User checks their active alerts. Frequency: ~1M/day. Served by a secondary index on user_id.
- Delete (alert triggers): Remove the alert after the notification is sent. Frequency: ~100K/day. Must update the index atomically.
Price update processing budget:

- 100K updates/sec = 10 microseconds per update
- If we check 100 alerts per update, that is a budget of 0.1 us per alert
- Modern CPU: ~1 billion ops/sec = 1,000 ops per microsecond
- Comparison + branch = ~10 ops, i.e. ~0.01 us per alert check
- So 100 comparisons per update cost ~1 us, well inside the 10 us budget

This means: we can check ~100 alerts per price update.

With 10 alerts average per item, this works!
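A quick back-of-envelope check of these numbers:

```python
# Sanity-check the per-update CPU budget (all values from the estimates above).
updates_per_sec = 100_000
budget_us = 1_000_000 / updates_per_sec          # 10 us per update
ops_per_us = 1_000                               # ~1e9 ops/sec CPU
ops_per_check = 10                               # comparison + branch
cost_us = 100 * ops_per_check / ops_per_us       # 100 alert checks per update
print(f"budget={budget_us}us cost={cost_us}us")  # budget=10.0us cost=1.0us
```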
For hot items (1000s of alerts), we need range indexing.

High-Level Architecture
The architecture separates concerns: price ingestion, alert matching, and notification delivery.
What to say
I will design a streaming architecture with three main components: price ingestion, alert matching engine, and notification service. We scale by partitioning on item ID.
Price Alert System Architecture
Component Responsibilities:
1. Price Ingestion (Kafka)
   - Receives price updates from multiple sources
   - Partitions by item_id for ordered processing (producer sketch below)
   - Handles backpressure during spikes

2. Alert Matching Engine
   - Stateful service holding alert indexes in memory
   - Each instance owns a partition of items
   - Uses sorted data structures for efficient matching

3. Alert Storage
   - PostgreSQL: durable storage, user queries
   - Redis: in-memory index for fast matching

4. Notification Service
   - Consumes triggered alerts from a queue
   - Handles delivery across channels
   - Implements retry and deduplication
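On the ingestion side, keying messages by item_id is what guarantees per-item ordering and pins each item to one matcher instance. A minimal producer sketch, assuming kafka-python and an illustrative topic name and broker address:

```python
import json
from kafka import KafkaProducer  # pip install kafka-python

producer = KafkaProducer(
    bootstrap_servers=["kafka:9092"],           # illustrative address
    value_serializer=lambda v: json.dumps(v).encode(),
)

def publish_price(item_id: str, price: float, ts: float):
    # Same key -> same partition -> ordered processing by one matcher.
    producer.send(
        "price-updates",                        # illustrative topic name
        key=item_id.encode(),
        value={"item_id": item_id, "price": price, "ts": ts},
    )
```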
Real-world reference
CamelCamelCamel tracks 18M+ Amazon products. Robinhood processes millions of price alerts for stocks. Both use similar partitioned architectures.
Data Model and Storage
We need two storage systems: durable storage for alert definitions and in-memory index for fast matching.
What to say
PostgreSQL is the source of truth for alerts. Redis holds the inverted index optimized for matching. The matching engine rebuilds its index from PostgreSQL on startup.
```sql
CREATE TABLE alerts (
    id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id         UUID NOT NULL REFERENCES users(id),
    -- Columns below reconstruct the alert definition used throughout
    -- this walkthrough: item, direction, threshold, one-time status.
    item_id         UUID NOT NULL,
    direction       TEXT NOT NULL CHECK (direction IN ('above', 'below')),
    threshold_price NUMERIC(12, 2) NOT NULL,
    status          TEXT NOT NULL DEFAULT 'active',  -- active | triggered
    created_at      TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Secondary index for "user views alerts"; item index for matcher rebuilds.
CREATE INDEX alerts_by_user ON alerts (user_id);
CREATE INDEX alerts_by_item ON alerts (item_id);
```

Redis Index Structure:
For efficient matching, we use Redis Sorted Sets - one per item.
```
Key:    alerts:below:{item_id}   (with a mirror set alerts:above:{item_id})
Type:   Sorted Set
Score:  threshold_price
Member: alert_id
```

Important detail
The sorted set query is O(log n + k) where k is matching alerts. For most updates, k is 0 or very small, making this extremely efficient.
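Concretely, creating and matching an alert against this index looks like the following in redis-cli (ids are illustrative):

```
# "Alert me when item-42 drops below $800"
ZADD alerts:below:item-42 800 alert-uuid-1

# Price update arrives: item-42 is now $795. Every "below" alert
# with threshold >= 795 has been crossed:
ZRANGEBYSCORE alerts:below:item-42 795 +inf
# -> "alert-uuid-1"
```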
Algorithm Deep Dive
The core algorithm matches price updates against alerts efficiently using range queries on sorted indexes.
```python
class AlertMatcher:
    def __init__(self, redis_client):
        self.redis = redis_client

    def match(self, item_id: str, new_price: float) -> list:
        """Return the ids of alerts triggered by this price update."""
        # "Below" alerts fire once the price drops to or under their
        # threshold, i.e. every member whose threshold >= new_price.
        below = self.redis.zrangebyscore(
            f"alerts:below:{item_id}", new_price, "+inf")
        # "Above" alerts fire once the price rises to or over their threshold.
        above = self.redis.zrangebyscore(
            f"alerts:above:{item_id}", "-inf", new_price)
        return below + above
```

Complexity Analysis:
| Operation | Complexity | Notes |
|-----------|------------|-------|
| Add alert | O(log n) | Sorted set insertion |
| Remove alert | O(log n) | Sorted set removal |
| Match alerts | O(log n + k) | Range query, k = matches |
| Price update total | O(log n + k) | Dominated by matching |
Alert Matching Flow
Optimization for hot items
For items with thousands of alerts (like AAPL), we can further optimize by caching the highest "below" threshold and the lowest "above" threshold per item. If the new price does not cross either boundary, no alert can possibly fire, so we skip the Redis query entirely.
```python
class OptimizedMatcher:
    def __init__(self):
        # Local cache of boundary thresholds:
        # item_id -> (highest "below" threshold, lowest "above" threshold)
        self.boundaries = {}

    def might_match(self, item_id: str, new_price: float) -> bool:
        """Cheap pre-check: skip the Redis round trip when nothing can fire."""
        bounds = self.boundaries.get(item_id)
        if bounds is None:
            return True  # unknown item: fall back to the full query
        max_below, min_above = bounds
        return new_price <= max_below or new_price >= min_above
```

Consistency and Invariants
System Invariants
Never miss a triggered alert. If price crosses threshold, user must be notified. Duplicate notifications are preferable to missed notifications.
Consistency Challenges:
1. Index-Database Sync
- Redis index must stay in sync with PostgreSQL
- If an alert is created in the DB but not in Redis: missed trigger
- If an alert is in Redis but deleted from the DB: orphan alert
A sketch of the creation path that keeps the two in sync (helper names are illustrative):

```python
async def create_alert(alert: AlertRequest) -> Alert:
    # Start DB transaction
    async with db.transaction():
        # 1. PostgreSQL first - it is the source of truth.
        row = await db.insert_alert(alert)  # illustrative helper
        # 2. Update the Redis index before committing. If zadd raises,
        #    the transaction rolls back and no half-created alert exists.
        #    A commit failure after the zadd leaves an orphan index entry,
        #    which is the safe failure mode: a spurious check, never a
        #    missed trigger.
        key = f"alerts:{alert.direction}:{alert.item_id}"
        await redis.zadd(key, {str(row.id): alert.threshold_price})
    return row
```

2. At-Least-Once Notification Delivery
We guarantee at-least-once delivery, handling duplicates on the receiving end.
```python
async def process_triggered_alert(alert_id: str):
    # Idempotency check - have we already processed this?
    if await redis.sismember("processed_alerts", alert_id):
        return  # duplicate delivery from the queue; drop it
    await send_notification(alert_id)  # illustrative helper
    # Mark as processed only after the send succeeds: a crash in
    # between yields a duplicate notification, never a missed one.
    await redis.sadd("processed_alerts", alert_id)
```

Business impact mapping
Missing a price alert can mean a user misses a deal or trading opportunity - real financial impact. Duplicate notifications are annoying but not harmful. We optimize for never missing alerts.
Failure Modes and Resilience
Proactively discuss failures
Let me walk through failure scenarios. This system has several critical failure modes to handle.
| Failure | Impact | Mitigation | Recovery |
|---|---|---|---|
| Matcher node crash | Alerts for that partition not checked | Kafka consumer groups auto-rebalance | New node loads index from PostgreSQL |
| Redis down | Cannot check alerts | In-memory fallback with periodic DB sync | Rebuild index from PostgreSQL on recovery |
Matcher Recovery Process:
```python
class AlertMatcher:
    async def startup(self):
        """Initialize matcher on startup or recovery."""
        # Rebuild the index for our partitions from the source of truth.
        partitions = self.assigned_partitions()  # from the Kafka consumer group
        # fetch_active_alerts is an illustrative helper over PostgreSQL.
        for alert in await db.fetch_active_alerts(partitions):
            key = f"alerts:{alert.direction}:{alert.item_id}"
            await self.redis.zadd(key, {str(alert.id): alert.threshold_price})
        # Only begin consuming price updates once the index is warm.
        await self.start_consuming()
```

Handling Price Feed Gaps:
Stock markets close, APIs have outages. We need to handle gaps gracefully.
```python
class PriceProcessor:
    def __init__(self):
        self.last_price = {}  # item_id -> (price, timestamp)

    def on_update(self, item_id: str, price: float, ts: float):
        prev = self.last_price.get(item_id)
        self.last_price[item_id] = (price, ts)
        if prev is None:
            return self.match_point(item_id, price)  # illustrative helper
        prev_price, _ = prev
        # After a gap (market close, feed outage) the price may have jumped
        # across thresholds with no intermediate updates, so match the whole
        # interval [prev_price, price], not just the new point.
        self.match_range(item_id, min(prev_price, price), max(prev_price, price))
```

Key resilience principle
The PostgreSQL database is always the source of truth. Redis index can be rebuilt. Matchers can restart. The system recovers by reloading state from durable storage.
Evolution and Scaling
What to say
This design handles 10M alerts efficiently. Let me discuss how it evolves for 100M+ alerts and more complex alert conditions.
Evolution Path:
Stage 1: Single Region (current design)
- 10M alerts, 100K updates/sec
- Redis cluster for the index
- Works for most applications

Stage 2: Global Deployment
- Replicate price feeds to each region
- Regional matchers and notification
- Lower latency for global users

Stage 3: Complex Alert Conditions
- Percentage changes: "Alert when AAPL drops 5%"
- Time-based: "Alert when AAPL hits $200 within market hours"
- Multi-condition: "Alert when AAPL > 200 AND volume > 10M"
```python
class ComplexAlertMatcher:
    """
    For percentage-based alerts, we need to track reference prices.
    Sketch: a reference price (e.g. the price 24h ago, maintained by a
    separate windowing job) turns a percentage condition back into the
    same sorted-set range query, with the drop percentage as the score.
    """
    def __init__(self, redis_client):
        self.redis = redis_client

    async def check_percent_drop(self, item_id: str, new_price: float):
        ref = await self.redis.get(f"ref_price:{item_id}")  # illustrative key
        if ref is None:
            return []
        drop_pct = max(0.0, (float(ref) - new_price) / float(ref) * 100)
        # Every alert asking for a drop of <= drop_pct percent has now fired.
        return await self.redis.zrangebyscore(
            f"alerts:pct_drop:{item_id}", "-inf", drop_pct)
```

Multi-Region Architecture
Alternative Approaches:
| Approach | When to Use |
|----------|-------------|
| Redis Sorted Sets (current) | Simple thresholds, millions of alerts |
| In-memory tree per matcher | Ultra-low latency, index fits in memory |
| Apache Flink | Complex event processing, time windows |
| TimescaleDB | Historical price analysis, time-series alerts |
Alternative approach
If we needed sub-second latency for stock trading, I would use an in-memory interval tree per matcher node, eliminating Redis network round trips. The tradeoff is more complex recovery and no shared state.
Monitoring and Observability:
Key metrics to track:
- Alert matching latency (p50, p99)
- Price update processing lag
- Notification delivery latency
- Alert trigger rate
- Index size per item
- Failed notification rate
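A sketch of exposing these with prometheus_client (metric names are illustrative):

```python
from prometheus_client import Counter, Gauge, Histogram

MATCH_LATENCY = Histogram(
    "alert_match_latency_seconds", "Time to match one price update")
PROCESSING_LAG = Gauge(
    "price_update_lag_seconds", "How far matching lags the price feed")
DELIVERY_LATENCY = Histogram(
    "notification_delivery_latency_seconds", "Trigger-to-delivery time")
TRIGGER_RATE = Counter("alerts_triggered_total", "Alerts that fired")
FAILED_NOTIFICATIONS = Counter(
    "notification_failures_total", "Notification deliveries that failed")

def handle_price_update(item_id, price, matcher):
    with MATCH_LATENCY.time():  # histogram buckets give p50/p99
        return matcher.match(item_id, price)
```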