System Design Masterclass
Social Media · advertising · real-time-bidding · targeting · auction · ml-ranking · advanced

Design Ads Management & Delivery System

Design an ad serving system for social feeds at billions of impressions per day

Billions of impressions/day, millions of advertisers | Similar to Facebook, Google, Twitter, LinkedIn, TikTok, Snap | 45 min read

Summary

Facebook and Google make billions from ads. The system must select the best ad from millions of candidates in under 100ms, while balancing advertiser ROI, user experience, and platform revenue. This tests auction systems, ML ranking, real-time aggregation, and distributed caching at massive scale.

Key Takeaways

Core Problem

This is a real-time auction system combined with ML ranking. Every feed request triggers an auction among millions of ads, decided in milliseconds.

The Hard Part

Balancing three competing objectives: advertiser ROI (they want conversions), user experience (relevance), and platform revenue (maximize earnings). Plus real-time budget enforcement.

Scaling Axis

Scale ad selection by user geography. Scale budget tracking by advertiser ID. Heavy caching at every layer.

Critical Invariant

Never overspend an advertiser budget. A 1% overspend across millions of advertisers means millions in refunds.

Performance Requirement

Ad selection must complete in under 100ms p99. Feed load cannot wait for slow ad decisions.

Key Tradeoff

We use eventual consistency for spend tracking (risk slight overspend) to avoid latency of distributed transactions on every impression.

Design Walkthrough

Problem Statement

The Question: Design an advertising system for a social media platform that serves billions of ad impressions per day.

Ad systems are the revenue engine of social platforms:
  • Facebook: $115B+ annual ad revenue
  • Google: $225B+ annual ad revenue
  • Twitter/X: $4B+ annual ad revenue

The system must:
  • Select the best ad for each user in real time
  • Track budgets and prevent overspend
  • Measure and report ad performance
  • Handle millions of concurrent advertisers

What to say first

This is a complex system with multiple components. Let me clarify the scope - are we focusing on the ad serving pipeline, the campaign management system, or the full end-to-end? I will assume ad serving is the priority since that is the most technically challenging.

Hidden requirements interviewers test:
  • Do you understand auction mechanics (second-price, Vickrey)?
  • Can you design for competing objectives (revenue vs UX)?
  • Do you know ML ranking at scale?
  • Can you handle real-time budget enforcement?
  • Do you understand ad fraud and brand safety?

Clarifying Questions

These questions shape your architecture significantly.

Question 1: Scale

What is the scale? How many daily active users, ad impressions per day, and number of advertisers?

Why this matters: Determines caching strategy and system capacity.
Typical answer: 500M DAU, 10B impressions/day, 10M advertisers.
Architecture impact: Cannot evaluate all ads for every request - need filtering and caching.

Question 2: Ad Types

What types of ads? Feed ads, stories, search ads, display ads?

Why this matters: Different ad types have different serving patterns.
Typical answer: Focus on feed ads (in-stream native ads).
Architecture impact: Ads are served as part of feed generation, so selection must be fast.

Question 3: Auction Model

What auction model? First-price, second-price, or VCG auction?

Why this matters: Affects bidding strategy and revenue.
Typical answer: Generalized second-price (GSP) auction.
Architecture impact: Need to track the clearing price, not just the winning bid.

Question 4: Billing Model

CPC (cost per click), CPM (cost per thousand impressions), or CPA (cost per action)?

Why this matters: Determines when budget is consumed.
Typical answer: All three, but CPC is most common.
Architecture impact: Need click tracking and conversion tracking systems.

Stating assumptions

I will assume 500M DAU, 10B impressions/day, feed ads with a GSP auction, and CPC as the primary billing model. The system must select ads in under 100ms at p99.

The Hard Part

Say this out loud

The hard part here is selecting the best ad from millions of candidates in under 100ms while balancing advertiser ROI, user experience, and platform revenue - all while enforcing budgets in real-time.

Why this is genuinely hard:

  1. Scale of Selection: 10M ads, 500M users, and we need to find the best match in milliseconds. We cannot score all ads for every request.
  2. Competing Objectives:
     - Advertisers want conversions (show to likely buyers)
     - Users want relevance (not annoyed by irrelevant ads)
     - Platform wants revenue (maximize total ad spend)
     - These often conflict!
  3. Real-time Budget Enforcement: If an advertiser has a $1000 daily budget and we serve 1M impressions, we must track spend accurately without adding latency.
  4. Cold Start Problem: New ads have no performance data. How do we rank them fairly against proven ads?
  5. Fraud Detection: Click fraud, impression fraud, bot traffic - billions of dollars at stake.

Common mistakes

Candidates often: (1) Ignore the auction mechanism entirely, (2) Propose scoring all 10M ads for every request, (3) Forget about budget enforcement, (4) Skip fraud considerations.

The fundamental tradeoff:

Ad Rank = Bid x Quality Score x Relevance

  • High bid, low relevance = bad UX, users leave
  • Low bid, high relevance = less revenue
  • A balancing act that ML models optimize
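
To make the tradeoff concrete, here is a minimal sketch of scoring ads by that formula; the field names and example numbers are assumptions for illustration, not the platform's actual model:

from dataclasses import dataclass

@dataclass
class AdCandidate:
    ad_id: str
    bid_cents: int        # advertiser's CPC bid, in cents
    quality_score: float  # 0..1, from the ML model (predicted engagement proxy)
    relevance: float      # 0..1, user-ad relevance

def ad_rank(c: AdCandidate) -> float:
    """Expected value of showing this ad: bid weighted by quality and relevance."""
    return c.bid_cents * c.quality_score * c.relevance

candidates = [
    AdCandidate("ad_a", bid_cents=200, quality_score=0.02, relevance=0.90),  # high bid, average quality
    AdCandidate("ad_b", bid_cents=80,  quality_score=0.05, relevance=0.95),  # lower bid, higher quality
]
ranked = sorted(candidates, key=ad_rank, reverse=True)
print([c.ad_id for c in ranked])  # the lower-bid, higher-quality ad wins here
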

Scale & Access Patterns

Let me quantify the scale and understand access patterns.

Dimension | Value | Impact
Daily Active Users | 500 million | 500M ad requests/day minimum
Impressions per Day | 10 billion | ~115K impressions/second average
+ 6 more rows...

What to say

At 10B impressions/day with 10M ads, we cannot evaluate every ad for every request. We need a funnel: broad filtering first, then detailed scoring on a smaller candidate set.

Access Pattern Analysis:

  • Read-heavy for ads: ads change rarely, are read millions of times
  • Write-heavy for events: every impression and click is an event
  • Hot advertisers: the top 1% of advertisers drive 50%+ of spend
  • Temporal patterns: budgets reset daily, traffic peaks predictably
  • Geographic locality: users see ads targeted to their region
Ad serving throughput:
- 10B impressions / 86400 seconds = 115K/second average
- Peak = 5x = 575K impressions/second
+ 10 more lines...

High-Level Architecture

The ad system has three main subsystems:
  1. Campaign Management: advertisers create and manage campaigns
  2. Ad Serving: real-time ad selection for each request
  3. Event Processing: track impressions, clicks, and conversions

What to say

I will focus on ad serving since it is the most technically challenging. The key insight is a funnel architecture: filter millions down to hundreds, then score and rank the top candidates.

Ad Serving Architecture

The Ad Selection Funnel:

  1. Targeting Filter (10M to 100K ads, < 10ms)
     - Geographic targeting
     - Demographic targeting
     - Device/platform targeting
     - Budget > 0 check
  2. Candidate Selection (100K to 1000 ads, < 20ms)
     - Lightweight relevance scoring
     - Diversity requirements
     - Frequency capping check
  3. ML Ranking (1000 to 100 ads, < 50ms)
     - Deep learning model
     - Predicts CTR and conversion probability
     - User-ad feature interactions
  4. Auction (100 to 10 ads, < 20ms)
     - Apply bids
     - Run GSP auction
     - Select winners (a code sketch of the full funnel follows below)
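
A minimal end-to-end sketch of this funnel; the stage functions, field names, and limits are illustrative assumptions, with a placeholder where the real ML model would be called:

from typing import Dict, List

Ad = Dict  # in this sketch, ads are plain dicts

def targeting_filter(all_ads: List[Ad], user: Dict) -> List[Ad]:
    """Stage 1: cheap boolean filters - geography, device, budget > 0."""
    return [ad for ad in all_ads
            if user["country"] in ad["countries"] and ad["budget_remaining_cents"] > 0]

def candidate_selection(ads: List[Ad], limit: int = 1000) -> List[Ad]:
    """Stage 2: keep the top N by a precomputed lightweight relevance score."""
    return sorted(ads, key=lambda ad: ad["lightweight_score"], reverse=True)[:limit]

def ml_ranking(ads: List[Ad], user: Dict, limit: int = 100) -> List[Ad]:
    """Stage 3: expensive model predicts CTR; a constant stands in for inference here."""
    for ad in ads:
        ad["pctr"] = 0.01  # placeholder for a real model call with user-ad features
    return sorted(ads, key=lambda ad: ad["pctr"] * ad["bid_cents"], reverse=True)[:limit]

def select_ad_candidates(all_ads: List[Ad], user: Dict) -> List[Ad]:
    """Run the funnel; the GSP auction (next section) runs over the survivors."""
    return ml_ranking(candidate_selection(targeting_filter(all_ads, user)), user)
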

Real-world reference

Facebook uses a similar funnel. Their ML ranking model has 1000+ features and serves predictions in single-digit milliseconds using specialized inference servers.

Data Model & Storage

Multiple storage systems optimized for different access patterns.

What to say

We use different storage for different needs: fast key-value for real-time serving, OLAP for analytics, and a relational database for campaign management.

-- Advertiser accounts
CREATE TABLE advertisers (
    id BIGINT PRIMARY KEY,
+ 41 more lines...

Real-time Data Structures (Redis):

# Budget tracking (per advertiser)
budget:{advertiser_id}:daily_remaining -> INT (cents remaining)
budget:{advertiser_id}:daily_spent -> INT (cents spent today)
+ 18 more lines...

Feature Store (for ML ranking):

User Features (precomputed, refreshed hourly):
- user_id -> {
    age_bucket, gender, location,
+ 15 more lines...

Important detail

The ad index must be rebuilt when campaigns start/stop or budgets deplete. We use a background job that publishes updates to ad servers every few seconds.
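
As a sketch of that background job, assuming the redis-py client and a hypothetical `ad_index_updates` pub/sub channel (the channel name and message shape are illustrative):

import json
import time
import redis  # assumes redis-py and a reachable Redis instance

r = redis.Redis()

def publish_index_updates(fetch_changed_campaigns, interval_seconds: float = 5.0):
    """Background loop: every few seconds, push campaign deltas to ad servers.

    `fetch_changed_campaigns` is assumed to return campaigns whose status or
    budget changed since the last call (started, paused, ended, depleted).
    """
    while True:
        for campaign in fetch_changed_campaigns():
            message = {
                "campaign_id": campaign["id"],
                "action": "remove" if campaign["status"] != "active" else "upsert",
                "targeting": campaign.get("targeting"),
            }
            r.publish("ad_index_updates", json.dumps(message))
        time.sleep(interval_seconds)
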

Algorithm Deep Dive: Auction & Ranking

The core of ad selection is the auction combined with ML ranking.

Generalized Second-Price (GSP) Auction:

In a GSP auction, each winner pays just enough to outrank the next-highest bidder (plus a small increment) rather than its own bid. This discourages overbidding, though unlike a single-slot Vickrey auction, GSP is not perfectly truthful.

def run_gsp_auction(candidates: List[AdCandidate], num_slots: int) -> List[AuctionResult]:
    """
    Run generalized second-price auction.
+ 30 more lines...
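
Since the full listing above is elided, here is a self-contained minimal sketch of GSP pricing with quality scores; the Candidate fields and reserve price are simplified assumptions rather than the system's actual types:

from dataclasses import dataclass
from typing import List

@dataclass
class Candidate:
    ad_id: str
    bid_cents: int      # max CPC bid
    quality: float      # quality score from the ranking model

@dataclass
class AuctionResult:
    ad_id: str
    price_cents: int    # what the winner actually pays per click

def run_gsp_auction(candidates: List[Candidate], num_slots: int) -> List[AuctionResult]:
    """Rank by bid * quality; each winner pays just enough to beat the next ad."""
    ranked = sorted(candidates, key=lambda c: c.bid_cents * c.quality, reverse=True)
    results = []
    for i, winner in enumerate(ranked[:num_slots]):
        if i + 1 < len(ranked):
            runner_up = ranked[i + 1]
            # Minimum bid that would still outrank the runner-up, plus one cent.
            price = int(runner_up.bid_cents * runner_up.quality / winner.quality) + 1
            price = min(price, winner.bid_cents)  # never charge more than the winner's own bid
        else:
            price = 1  # reserve price when there is no competition
        results.append(AuctionResult(winner.ad_id, price))
    return results
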

ML Ranking Model:

The quality score comes from ML models predicting engagement.

ML Ranking Pipeline

def calculate_quality_score(ad: Ad, user: User, context: Context) -> float:
    """
    Quality score balances multiple objectives.
+ 34 more lines...
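
To make the idea concrete, here is a minimal sketch of how such a score might blend predictions; the weights and the specific signals (predicted CTR, conversion rate, and hide rate) are illustrative assumptions:

def quality_score(pctr: float, pcvr: float, p_hide: float,
                  w_ctr: float = 1.0, w_cvr: float = 2.0, w_hide: float = 5.0) -> float:
    """Blend predicted click, conversion, and negative-feedback probabilities.

    The weights are illustrative; in practice they are tuned to trade off
    revenue against user experience.
    """
    return max(0.0, w_ctr * pctr + w_cvr * pcvr - w_hide * p_hide)

# Example: an ad with decent CTR but a high predicted hide rate scores poorly.
print(quality_score(pctr=0.03, pcvr=0.002, p_hide=0.01))
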

Facebook approach

Facebook uses Total Value = Advertiser Bid x Estimated Action Rate x User Value. The User Value component penalizes low-quality or irrelevant ads to protect user experience.

Budget Enforcement & Pacing

System Invariant

Never significantly overspend an advertiser budget. Even 1% overspend across millions of advertisers means millions in refunds. A slight overspend (< 5%) is acceptable given the distributed nature of the system.

The Budget Challenge:

With 100M clicks/day and 10M advertisers:
  • Cannot check a central database for every click
  • Cannot lock budgets for every impression
  • Must handle eventual consistency gracefully

class BudgetManager:
    def __init__(self):
        self.local_cache = {}  # Local budget cache
+ 44 more lines...

Budget Pacing:

Advertisers want to spread budget across the day, not spend it all in the morning.

def calculate_pacing_multiplier(campaign_id: str) -> float:
    """
    Adjust bid based on spend rate vs target rate.
+ 28 more lines...
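
The elided function above follows a standard pacing pattern; here is a self-contained minimal version of that pattern, with thresholds and bounds chosen purely for illustration:

def pacing_multiplier(spent_cents: int, daily_budget_cents: int,
                      seconds_since_midnight: int) -> float:
    """Compare the actual spend fraction with the fraction of the day elapsed.

    Ahead of schedule -> bid lower (multiplier < 1); on pace or behind -> bid normally.
    """
    if daily_budget_cents <= 0:
        return 0.0
    spend_fraction = spent_cents / daily_budget_cents
    if spend_fraction >= 1.0:
        return 0.0                      # budget exhausted: stop bidding
    day_fraction = seconds_since_midnight / 86_400
    if day_fraction == 0:
        return 1.0
    ratio = spend_fraction / day_fraction
    if ratio > 1.2:                     # spending noticeably faster than the clock
        return max(0.1, 1.0 / ratio)
    return 1.0
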

What to say

Budget enforcement uses optimistic tracking with eventual consistency. We debit budgets asynchronously after impressions, accepting small overspend risk for latency. Reconciliation runs hourly to catch discrepancies.
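
A minimal sketch of that asynchronous debit path, assuming redis-py and the `budget:{advertiser_id}:*` keys from the data-model section; the pub/sub pause signal reuses the hypothetical `ad_index_updates` channel from the earlier sketch:

import redis  # assumes redis-py and a reachable Redis instance

r = redis.Redis()

def record_click(advertiser_id: int, cost_cents: int) -> bool:
    """Debit the budget after the click event, atomically, off the serving path.

    DECRBY is atomic in Redis, so concurrent ad servers cannot double-spend the
    same cents; the residual risk is clicks that land after the counter hits
    zero, which the hourly reconciliation job catches.
    """
    remaining = r.decrby(f"budget:{advertiser_id}:daily_remaining", cost_cents)
    r.incrby(f"budget:{advertiser_id}:daily_spent", cost_cents)
    if remaining <= 0:
        # Signal the ad index to stop serving this advertiser's campaigns.
        r.publish("ad_index_updates", f'{{"advertiser_id": {advertiser_id}, "action": "pause"}}')
        return False
    return True
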

Consistency & Invariants

Key Invariants:

Invariant | Impact if Violated | Enforcement
No significant budget overspend | Must refund advertisers | Atomic decrements + buffer
No ads shown after campaign ends | Wasted spend, angry advertisers | TTL on ad index entries
Frequency caps respected | User annoyance, lower CTR | Redis counters per user-ad
Brand safety maintained | PR disaster, legal issues | Pre-filtering + real-time checks
Click attribution accurate | Wrong billing | Click ID tracking, deduplication
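
The frequency-cap row above relies on per user-ad counters; here is a minimal Redis sketch, where the key format, cap, and window are illustrative assumptions:

import redis  # assumes redis-py and a reachable Redis instance

r = redis.Redis()

def under_frequency_cap(user_id: int, ad_id: int, cap: int = 3,
                        window_seconds: int = 86_400) -> bool:
    """Count impressions per user-ad pair; allow the ad only while under the cap.

    INCR + EXPIRE is approximate: the counter is eventually consistent across
    regions, which the design accepts (see the consistency table below).
    """
    key = f"freqcap:{user_id}:{ad_id}"
    count = r.incr(key)
    if count == 1:
        r.expire(key, window_seconds)  # start the window on the first impression
    return count <= cap
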

Critical invariant detail

Budget tracking must be monotonic. If we show an ad, we MUST debit the budget. Lost debits mean overspend. We use write-ahead logging: log the impression before serving, then debit budget.
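
A small sketch of that ordering; the injected callables stand in for the real durable log, budget service, and renderer:

def serve_ad_with_wal(ad, user, append_to_log, debit_budget, render):
    """Write-ahead ordering: the impression is durably logged before the ad is
    rendered, so a crash after serving can never lose the corresponding debit.

    `append_to_log`, `debit_budget`, and `render` are injected for the sketch;
    in practice the log would be a durable queue such as Kafka.
    """
    record = {"ad_id": ad["id"], "user_id": user["id"], "cost_cents": ad["cpc_cents"]}
    append_to_log(record)          # 1. durable write first (must succeed)
    response = render(ad, user)    # 2. only then show the ad
    debit_budget(record)           # 3. debit can be async; replay the log if it fails
    return response
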

Consistency Model:

We use different consistency levels for different data:

Data | Consistency | Reason
Ad creative content | Eventually consistent | Changes rare, caching OK
Budget remaining | Eventual with bounds | Speed critical, small overrun OK
Frequency cap counts | Eventually consistent | Approximate OK
Click/conversion events | At-least-once | May double-count, reconcile later
Billing records | Strongly consistent | Money - must be accurate

Business impact mapping

If we show an ad to a user who has hit their frequency cap, they see one extra ad - a minor annoyance. If we overspend a budget by 50%, we owe that advertiser money. Consistency requirements follow business impact.

Failure Modes & Resilience

Proactively discuss failures

Ad systems handle billions of dollars. Let me walk through failure modes and mitigations.

Failure | Impact | Mitigation | Why It Works
ML model serving down | Cannot rank ads | Fall back to simple rules | Better to show suboptimal ads than no ads
Redis budget cache down | Cannot track budgets | Local estimation + pause high spenders | Conservative approach prevents overspend
+ 4 more rows...

Graceful Degradation Strategy:

def select_ads(user: User, context: Context) -> List[Ad]:
    """
    Fallback chain for ad selection.
+ 27 more lines...
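
Since the listing is truncated, here is a self-contained sketch of the same fallback-chain idea; the three stage callables are stand-ins for the real components:

import logging
from typing import List

logger = logging.getLogger("ad_serving")

def select_ads_with_fallback(user, context, full_pipeline, cached_ads_for_segment,
                             house_ads) -> List[dict]:
    """Degrade step by step: full ML pipeline -> cached segment ads -> house ads -> none.

    Each failure is logged and the request falls through to a cheaper source
    rather than erroring the feed.
    """
    try:
        return full_pipeline(user, context)        # normal path: funnel + auction
    except Exception:
        logger.exception("full pipeline failed, falling back to segment cache")
    try:
        return cached_ads_for_segment(user)        # precomputed ads for the user's segment
    except Exception:
        logger.exception("segment cache failed, falling back to house ads")
    try:
        return house_ads()                         # unpaid platform promotions
    except Exception:
        logger.exception("house ads failed, failing open with no ads")
    return []                                      # fail open: feed renders without ads
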

Key insight

For ad systems, fail open is usually correct. No ads is better than a broken page. The revenue lost during downtime costs less than the user trust lost to a broken experience.

Evolution & Scaling

What to say

This design handles 10B impressions/day. Let me discuss how it evolves for 10x scale and additional capabilities.

Evolution Path:

Stage 1: Basic System (up to 1B impressions/day)
  • Single ML model
  • Centralized Redis for budgets
  • Simple rule-based candidate filtering

Stage 2: Scaled System (up to 10B impressions/day)
  • Distributed ad index
  • Sharded budget tracking
  • Cascade ML ranking (simple model -> complex model)
  • Regional deployment

Stage 3: Advanced System (100B+ impressions/day)
  • Real-time model training
  • Federated budget tracking
  • Edge-based ad selection
  • Cross-device attribution

Multi-Region Architecture

Advanced Topics to Mention:

  1. Real-Time Bidding (RTB): For ad exchanges, the auction happens across multiple ad networks within 100ms.
  2. Conversion Attribution: Users see an ad on mobile and buy on desktop days later. Multi-touch attribution is complex.
  3. Fraud Detection: Click farms, bot traffic, impression fraud. ML models detect anomalies.
  4. Privacy Compliance: GDPR and CCPA affect targeting. Need consent management and data retention policies.
  5. A/B Testing: Test ranking models, auction parameters, and UI changes. Need holdout groups and statistical rigor.

Alternative approaches

If latency requirements were stricter, we could pre-compute ad selections for user segments and cache them. If budget accuracy were critical, we could use distributed transactions at the cost of latency. The design always trades off competing concerns.

Design Trade-offs

Advantages

  • + Simple to implement
  • + Predictable behavior
  • + Easy to debug

Disadvantages

  • - Suboptimal performance
  • - Cannot handle complex patterns
  • - Manual tuning needed

When to use

Early stage, fallback system, simple use cases