System Design Masterclass
Social Media · advertising · real-time-bidding · targeting · auction · ml-ranking · advanced

Design Ads Management & Delivery System

Design an ad serving system for social feeds at billions of impressions per day

Billions of impressions/day, millions of advertisers | Similar to Facebook, Google, Twitter, LinkedIn, TikTok, Snap | 45 min read

Summary

Facebook and Google make billions from ads. The system must select the best ad from millions of candidates in under 100ms, while balancing advertiser ROI, user experience, and platform revenue. This tests auction systems, ML ranking, real-time aggregation, and distributed caching at massive scale.

Key Takeaways

Core Problem

This is a real-time auction system combined with ML ranking. Every feed request triggers an auction among millions of ads, decided in milliseconds.

The Hard Part

Balancing three competing objectives: advertiser ROI (they want conversions), user experience (relevance), and platform revenue (maximize earnings). Plus real-time budget enforcement.

Scaling Axis

Scale ad selection by user geography. Scale budget tracking by advertiser ID. Heavy caching at every layer.

Critical Invariant

Never overspend an advertiser budget. A 1% overspend across millions of advertisers means millions in refunds.

Performance Requirement

Ad selection must complete in under 100ms p99. Feed load cannot wait for slow ad decisions.

Key Tradeoff

We use eventual consistency for spend tracking (risk slight overspend) to avoid latency of distributed transactions on every impression.

Design Walkthrough

Problem Statement

The Question: Design an advertising system for a social media platform that serves billions of ad impressions per day.

Ad systems are the revenue engine of social platforms:
  • Facebook: $115B+ annual ad revenue
  • Google: $225B+ annual ad revenue
  • Twitter/X: $4B+ annual ad revenue

The system must:
  • Select the best ad for each user in real time
  • Track budgets and prevent overspend
  • Measure and report ad performance
  • Handle millions of concurrent advertisers

What to say first

This is a complex system with multiple components. Let me clarify the scope - are we focusing on the ad serving pipeline, the campaign management system, or the full end-to-end? I will assume ad serving is the priority since that is the most technically challenging.

Hidden requirements interviewers test:
  • Do you understand auction mechanics (second-price, Vickrey)?
  • Can you design for competing objectives (revenue vs UX)?
  • Do you know ML ranking at scale?
  • Can you handle real-time budget enforcement?
  • Do you understand ad fraud and brand safety?

Clarifying Questions

These questions shape your architecture significantly.

Question 1: Scale

What is the scale? How many daily active users, ad impressions per day, and number of advertisers?

Why this matters: Determines caching strategy and system capacity.
Typical answer: 500M DAU, 10B impressions/day, 10M advertisers.
Architecture impact: Cannot evaluate all ads for every request - need filtering and caching.

Question 2: Ad Types

What types of ads? Feed ads, stories, search ads, display ads?

Why this matters: Different ad types have different serving patterns.
Typical answer: Focus on feed ads (in-stream native ads).
Architecture impact: Ads are served as part of feed generation, so selection must be fast.

Question 3: Auction Model

What auction model? First-price, second-price, or VCG auction?

Why this matters: Affects bidding strategy and revenue.
Typical answer: Generalized second-price (GSP) auction.
Architecture impact: Need to track the clearing price, not just the winning bid.

Question 4: Billing Model

CPC (cost per click), CPM (cost per thousand impressions), or CPA (cost per action)?

Why this matters: Determines when budget is consumed.
Typical answer: All three, but CPC is most common.
Architecture impact: Need click tracking and conversion tracking systems.

Stating assumptions

I will assume 500M DAU, 10B impressions/day, feed ads with a GSP auction, and CPC as the primary billing model. The system must select ads in under 100ms at p99.

The Hard Part

Say this out loud

The hard part here is selecting the best ad from millions of candidates in under 100ms while balancing advertiser ROI, user experience, and platform revenue - all while enforcing budgets in real-time.

Why this is genuinely hard:

  1. Scale of Selection: 10M ads, 500M users, and we need to find the best match in milliseconds. We cannot score all ads for every request.
  2. Competing Objectives:
     - Advertisers want conversions (show to likely buyers)
     - Users want relevance (not annoyed by irrelevant ads)
     - Platform wants revenue (maximize total ad spend)
     - These often conflict!
  3. Real-time Budget Enforcement: If an advertiser has a $1000 daily budget and we serve 1M impressions, we must track spend accurately without adding latency.
  4. Cold Start Problem: New ads have no performance data. How do we rank them fairly against proven ads?
  5. Fraud Detection: Click fraud, impression fraud, bot traffic - billions of dollars at stake.

Common mistakes

Candidates often: (1) Ignore the auction mechanism entirely, (2) Propose scoring all 10M ads for every request, (3) Forget about budget enforcement, (4) Skip fraud considerations.

The fundamental tradeoff:

Ad Rank = Bid x Quality Score x Relevance

  • High bid, low relevance = bad UX, users leave
  • Low bid, high relevance = less revenue
  • A balancing act that ML models optimize
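
To make the tradeoff concrete, here is a minimal sketch of scoring ads by that formula; the field names and example numbers are assumptions for illustration, not the platform's actual model:

from dataclasses import dataclass

@dataclass
class AdCandidate:
    ad_id: str
    bid_cents: int        # advertiser's CPC bid, in cents
    quality_score: float  # 0..1, from the ML model (predicted engagement proxy)
    relevance: float      # 0..1, user-ad relevance

def ad_rank(c: AdCandidate) -> float:
    """Expected value of showing this ad: bid weighted by quality and relevance."""
    return c.bid_cents * c.quality_score * c.relevance

candidates = [
    AdCandidate("ad_a", bid_cents=200, quality_score=0.02, relevance=0.90),  # high bid, average quality
    AdCandidate("ad_b", bid_cents=80,  quality_score=0.05, relevance=0.95),  # lower bid, higher quality
]
ranked = sorted(candidates, key=ad_rank, reverse=True)
print([c.ad_id for c in ranked])  # the lower-bid, higher-quality ad wins here
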

Scale & Access Patterns

Let me quantify the scale and understand access patterns.

Dimension | Value | Impact
Daily Active Users | 500 million | 500M ad requests/day minimum
Impressions per Day | 10 billion | ~115K impressions/second average
+ 6 more rows...

What to say

At 10B impressions/day with 10M ads, we cannot evaluate every ad for every request. We need a funnel: broad filtering first, then detailed scoring on a smaller candidate set.

Access Pattern Analysis:

  • Read-heavy for ads: ads change rarely, are read millions of times
  • Write-heavy for events: every impression and click is an event
  • Hot advertisers: the top 1% of advertisers drive 50%+ of spend
  • Temporal patterns: budgets reset daily, traffic peaks predictably
  • Geographic locality: users see ads targeted to their region
Ad serving throughput:
- 10B impressions / 86400 seconds = 115K/second average
- Peak = 5x = 575K impressions/second
+ 10 more lines...

High-Level Architecture

The ad system has three main subsystems:
  1. Campaign Management: advertisers create and manage campaigns
  2. Ad Serving: real-time ad selection for each request
  3. Event Processing: track impressions, clicks, and conversions

What to say

I will focus on ad serving since it is the most technically challenging. The key insight is a funnel architecture: filter millions down to hundreds, then score and rank the top candidates.

Ad Serving Architecture

The Ad Selection Funnel:

  1. Targeting Filter (10M to 100K ads, < 10ms)
     - Geographic targeting
     - Demographic targeting
     - Device/platform targeting
     - Budget > 0 check
  2. Candidate Selection (100K to 1000 ads, < 20ms)
     - Lightweight relevance scoring
     - Diversity requirements
     - Frequency capping check
  3. ML Ranking (1000 to 100 ads, < 50ms)
     - Deep learning model
     - Predicts CTR and conversion probability
     - User-ad feature interactions
  4. Auction (100 to 10 ads, < 20ms)
     - Apply bids
     - Run GSP auction
     - Select winners (a code sketch of the full funnel follows below)
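
A minimal end-to-end sketch of this funnel; the stage functions, field names, and limits are illustrative assumptions, with a placeholder where the real ML model would be called:

from typing import Dict, List

Ad = Dict  # in this sketch, ads are plain dicts

def targeting_filter(all_ads: List[Ad], user: Dict) -> List[Ad]:
    """Stage 1: cheap boolean filters - geography, device, budget > 0."""
    return [ad for ad in all_ads
            if user["country"] in ad["countries"] and ad["budget_remaining_cents"] > 0]

def candidate_selection(ads: List[Ad], limit: int = 1000) -> List[Ad]:
    """Stage 2: keep the top N by a precomputed lightweight relevance score."""
    return sorted(ads, key=lambda ad: ad["lightweight_score"], reverse=True)[:limit]

def ml_ranking(ads: List[Ad], user: Dict, limit: int = 100) -> List[Ad]:
    """Stage 3: expensive model predicts CTR; a constant stands in for inference here."""
    for ad in ads:
        ad["pctr"] = 0.01  # placeholder for a real model call with user-ad features
    return sorted(ads, key=lambda ad: ad["pctr"] * ad["bid_cents"], reverse=True)[:limit]

def select_ad_candidates(all_ads: List[Ad], user: Dict) -> List[Ad]:
    """Run the funnel; the GSP auction (next section) runs over the survivors."""
    return ml_ranking(candidate_selection(targeting_filter(all_ads, user)), user)
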

Real-world reference

Facebook uses a similar funnel. Their ML ranking model has 1000+ features and serves predictions in single-digit milliseconds using specialized inference servers.

Data Model & Storage

Multiple storage systems optimized for different access patterns.

What to say

We use different storage for different needs: fast key-value for real-time serving, OLAP for analytics, and a relational database for campaign management.

-- Advertiser accounts
CREATE TABLE advertisers (
    id BIGINT PRIMARY KEY,
+ 41 more lines...

Real-time Data Structures (Redis):

# Budget tracking (per advertiser)
budget:{advertiser_id}:daily_remaining -> INT (cents remaining)
budget:{advertiser_id}:daily_spent -> INT (cents spent today)
+ 18 more lines...

Feature Store (for ML ranking):

User Features (precomputed, refreshed hourly):
- user_id -> {
    age_bucket, gender, location,
+ 15 more lines...

Important detail

The ad index must be rebuilt when campaigns start/stop or budgets deplete. We use a background job that publishes updates to ad servers every few seconds.
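
As a sketch of that background job, assuming the redis-py client and a hypothetical `ad_index_updates` pub/sub channel (the channel name and message shape are illustrative):

import json
import time
import redis  # assumes redis-py and a reachable Redis instance

r = redis.Redis()

def publish_index_updates(fetch_changed_campaigns, interval_seconds: float = 5.0):
    """Background loop: every few seconds, push campaign deltas to ad servers.

    `fetch_changed_campaigns` is assumed to return campaigns whose status or
    budget changed since the last call (started, paused, ended, depleted).
    """
    while True:
        for campaign in fetch_changed_campaigns():
            message = {
                "campaign_id": campaign["id"],
                "action": "remove" if campaign["status"] != "active" else "upsert",
                "targeting": campaign.get("targeting"),
            }
            r.publish("ad_index_updates", json.dumps(message))
        time.sleep(interval_seconds)
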

Algorithm Deep Dive: Auction & Ranking

The core of ad selection is the auction combined with ML ranking.

Generalized Second-Price (GSP) Auction:

In a GSP auction, each winner pays just enough to outrank the next-highest bidder (plus a small increment) rather than its own bid. This discourages overbidding, though unlike a single-slot Vickrey auction, GSP is not perfectly truthful.

def run_gsp_auction(candidates: List[AdCandidate], num_slots: int) -> List[AuctionResult]:
    """
    Run generalized second-price auction.
+ 30 more lines...
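
Since the full listing above is elided, here is a self-contained minimal sketch of GSP pricing with quality scores; the Candidate fields and reserve price are simplified assumptions rather than the system's actual types:

from dataclasses import dataclass
from typing import List

@dataclass
class Candidate:
    ad_id: str
    bid_cents: int      # max CPC bid
    quality: float      # quality score from the ranking model

@dataclass
class AuctionResult:
    ad_id: str
    price_cents: int    # what the winner actually pays per click

def run_gsp_auction(candidates: List[Candidate], num_slots: int) -> List[AuctionResult]:
    """Rank by bid * quality; each winner pays just enough to beat the next ad."""
    ranked = sorted(candidates, key=lambda c: c.bid_cents * c.quality, reverse=True)
    results = []
    for i, winner in enumerate(ranked[:num_slots]):
        if i + 1 < len(ranked):
            runner_up = ranked[i + 1]
            # Minimum bid that would still outrank the runner-up, plus one cent.
            price = int(runner_up.bid_cents * runner_up.quality / winner.quality) + 1
            price = min(price, winner.bid_cents)  # never charge more than the winner's own bid
        else:
            price = 1  # reserve price when there is no competition
        results.append(AuctionResult(winner.ad_id, price))
    return results
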

ML Ranking Model:

The quality score comes from ML models predicting engagement.

ML Ranking Pipeline

def calculate_quality_score(ad: Ad, user: User, context: Context) -> float:
    """
    Quality score balances multiple objectives.
+ 34 more lines...
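
To make the idea concrete, here is a minimal sketch of how such a score might blend predictions; the weights and the specific signals (predicted CTR, conversion rate, and hide rate) are illustrative assumptions:

def quality_score(pctr: float, pcvr: float, p_hide: float,
                  w_ctr: float = 1.0, w_cvr: float = 2.0, w_hide: float = 5.0) -> float:
    """Blend predicted click, conversion, and negative-feedback probabilities.

    The weights are illustrative; in practice they are tuned to trade off
    revenue against user experience.
    """
    return max(0.0, w_ctr * pctr + w_cvr * pcvr - w_hide * p_hide)

# Example: an ad with decent CTR but a high predicted hide rate scores poorly.
print(quality_score(pctr=0.03, pcvr=0.002, p_hide=0.01))
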

Facebook approach

Facebook uses Total Value = Advertiser Bid x Estimated Action Rate x User Value. The User Value component penalizes low-quality or irrelevant ads to protect user experience.

Budget Enforcement & Pacing

System Invariant

Never significantly overspend an advertiser budget. Even 1% overspend across millions of advertisers means millions in refunds. A slight overspend (< 5%) is acceptable given the distributed nature of the system.

The Budget Challenge:

With 100M clicks/day and 10M advertisers:
  • Cannot check a central database for every click
  • Cannot lock budgets for every impression
  • Must handle eventual consistency gracefully

class BudgetManager:
    def __init__(self):
        self.local_cache = {}  # Local budget cache
+ 44 more lines...

Budget Pacing:

Advertisers want to spread budget across the day, not spend it all in the morning.

def calculate_pacing_multiplier(campaign_id: str) -> float:
    """
    Adjust bid based on spend rate vs target rate.
+ 28 more lines...
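
The elided function above follows a standard pacing pattern; here is a self-contained minimal version of that pattern, with thresholds and bounds chosen purely for illustration:

def pacing_multiplier(spent_cents: int, daily_budget_cents: int,
                      seconds_since_midnight: int) -> float:
    """Compare the actual spend fraction with the fraction of the day elapsed.

    Ahead of schedule -> bid lower (multiplier < 1); on pace or behind -> bid normally.
    """
    if daily_budget_cents <= 0:
        return 0.0
    spend_fraction = spent_cents / daily_budget_cents
    if spend_fraction >= 1.0:
        return 0.0                      # budget exhausted: stop bidding
    day_fraction = seconds_since_midnight / 86_400
    if day_fraction == 0:
        return 1.0
    ratio = spend_fraction / day_fraction
    if ratio > 1.2:                     # spending noticeably faster than the clock
        return max(0.1, 1.0 / ratio)
    return 1.0
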

What to say

Budget enforcement uses optimistic tracking with eventual consistency. We debit budgets asynchronously after impressions, accepting small overspend risk for latency. Reconciliation runs hourly to catch discrepancies.
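
A minimal sketch of that asynchronous debit path, assuming redis-py and the `budget:{advertiser_id}:*` keys from the data-model section; the pub/sub pause signal reuses the hypothetical `ad_index_updates` channel from the earlier sketch:

import redis  # assumes redis-py and a reachable Redis instance

r = redis.Redis()

def record_click(advertiser_id: int, cost_cents: int) -> bool:
    """Debit the budget after the click event, atomically, off the serving path.

    DECRBY is atomic in Redis, so concurrent ad servers cannot double-spend the
    same cents; the residual risk is clicks that land after the counter hits
    zero, which the hourly reconciliation job catches.
    """
    remaining = r.decrby(f"budget:{advertiser_id}:daily_remaining", cost_cents)
    r.incrby(f"budget:{advertiser_id}:daily_spent", cost_cents)
    if remaining <= 0:
        # Signal the ad index to stop serving this advertiser's campaigns.
        r.publish("ad_index_updates", f'{{"advertiser_id": {advertiser_id}, "action": "pause"}}')
        return False
    return True
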

Consistency & Invariants

Key Invariants:

Invariant | Impact if Violated | Enforcement
No significant budget overspend | Must refund advertisers | Atomic decrements + buffer
No ads shown after campaign ends | Wasted spend, angry advertisers | TTL on ad index entries
Frequency caps respected | User annoyance, lower CTR | Redis counters per user-ad
Brand safety maintained | PR disaster, legal issues | Pre-filtering + real-time checks
Click attribution accurate | Wrong billing | Click ID tracking, deduplication
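
The frequency-cap row above relies on per user-ad counters; here is a minimal Redis sketch, where the key format, cap, and window are illustrative assumptions:

import redis  # assumes redis-py and a reachable Redis instance

r = redis.Redis()

def under_frequency_cap(user_id: int, ad_id: int, cap: int = 3,
                        window_seconds: int = 86_400) -> bool:
    """Count impressions per user-ad pair; allow the ad only while under the cap.

    INCR + EXPIRE is approximate: the counter is eventually consistent across
    regions, which the design accepts (see the consistency table below).
    """
    key = f"freqcap:{user_id}:{ad_id}"
    count = r.incr(key)
    if count == 1:
        r.expire(key, window_seconds)  # start the window on the first impression
    return count <= cap
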

Critical invariant detail

Budget tracking must be monotonic. If we show an ad, we MUST debit the budget. Lost debits mean overspend. We use write-ahead logging: log the impression before serving, then debit budget.
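
A small sketch of that ordering; the injected callables stand in for the real durable log, budget service, and renderer:

def serve_ad_with_wal(ad, user, append_to_log, debit_budget, render):
    """Write-ahead ordering: the impression is durably logged before the ad is
    rendered, so a crash after serving can never lose the corresponding debit.

    `append_to_log`, `debit_budget`, and `render` are injected for the sketch;
    in practice the log would be a durable queue such as Kafka.
    """
    record = {"ad_id": ad["id"], "user_id": user["id"], "cost_cents": ad["cpc_cents"]}
    append_to_log(record)          # 1. durable write first (must succeed)
    response = render(ad, user)    # 2. only then show the ad
    debit_budget(record)           # 3. debit can be async; replay the log if it fails
    return response
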

Consistency Model:

We use different consistency levels for different data:

Data | Consistency | Reason
Ad creative content | Eventually consistent | Changes rare, caching OK
Budget remaining | Eventual with bounds | Speed critical, small overrun OK
Frequency cap counts | Eventually consistent | Approximate OK
Click/conversion events | At-least-once | May double-count, reconcile later
Billing records | Strongly consistent | Money - must be accurate

Business impact mapping

If we show an ad to a user who has hit their frequency cap, they see one extra ad - a minor annoyance. If we overspend a budget by 50%, we owe that advertiser money. Consistency requirements follow business impact.

Failure Modes & Resilience

Proactively discuss failures

Ad systems handle billions of dollars. Let me walk through failure modes and mitigations.

Failure | Impact | Mitigation | Why It Works
ML model serving down | Cannot rank ads | Fall back to simple rules | Better to show suboptimal ads than no ads
Redis budget cache down | Cannot track budgets | Local estimation + pause high spenders | Conservative approach prevents overspend
+ 4 more rows...

Graceful Degradation Strategy:

def select_ads(user: User, context: Context) -> List[Ad]:
    """
    Fallback chain for ad selection.
+ 27 more lines...
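
Since the listing is truncated, here is a self-contained sketch of the same fallback-chain idea; the three stage callables are stand-ins for the real components:

import logging
from typing import List

logger = logging.getLogger("ad_serving")

def select_ads_with_fallback(user, context, full_pipeline, cached_ads_for_segment,
                             house_ads) -> List[dict]:
    """Degrade step by step: full ML pipeline -> cached segment ads -> house ads -> none.

    Each failure is logged and the request falls through to a cheaper source
    rather than erroring the feed.
    """
    try:
        return full_pipeline(user, context)        # normal path: funnel + auction
    except Exception:
        logger.exception("full pipeline failed, falling back to segment cache")
    try:
        return cached_ads_for_segment(user)        # precomputed ads for the user's segment
    except Exception:
        logger.exception("segment cache failed, falling back to house ads")
    try:
        return house_ads()                         # unpaid platform promotions
    except Exception:
        logger.exception("house ads failed, failing open with no ads")
    return []                                      # fail open: feed renders without ads
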

Key insight

For ad systems, fail open is usually correct. No ads is better than a broken page. The revenue lost during downtime costs less than the user trust lost to a broken experience.

Evolution & Scaling

What to say

This design handles 10B impressions/day. Let me discuss how it evolves for 10x scale and additional capabilities.

Evolution Path:

Stage 1: Basic System (up to 1B impressions/day)
  • Single ML model
  • Centralized Redis for budgets
  • Simple rule-based candidate filtering

Stage 2: Scaled System (up to 10B impressions/day)
  • Distributed ad index
  • Sharded budget tracking
  • Cascade ML ranking (simple model -> complex model)
  • Regional deployment

Stage 3: Advanced System (100B+ impressions/day)
  • Real-time model training
  • Federated budget tracking
  • Edge-based ad selection
  • Cross-device attribution

Multi-Region Architecture

Advanced Topics to Mention:

  1. Real-Time Bidding (RTB): For ad exchanges, the auction happens across multiple ad networks within 100ms.
  2. Conversion Attribution: Users see an ad on mobile and buy on desktop days later. Multi-touch attribution is complex.
  3. Fraud Detection: Click farms, bot traffic, impression fraud. ML models detect anomalies.
  4. Privacy Compliance: GDPR and CCPA affect targeting. Need consent management and data retention policies.
  5. A/B Testing: Test ranking models, auction parameters, and UI changes. Need holdout groups and statistical rigor.

Alternative approaches

If latency requirements were stricter, we could pre-compute ad selections for user segments and cache them. If budget accuracy were critical, we could use distributed transactions at the cost of latency. The design always trades off competing concerns.

Design Trade-offs

Advantages

  • + Simple to implement
  • + Predictable behavior
  • + Easy to debug

Disadvantages

  • - Suboptimal performance
  • - Cannot handle complex patterns
  • - Manual tuning needed

When to use

Early stage, fallback system, simple use cases