
Design a URL Shortener: System Design Interview Answer

Complete guide to designing a URL shortener like bit.ly. Covers requirements, scale calculations, database design, and common follow-up questions.

12 min read · By SystemExperts
From the Interviewer's Side

Ready to Master System Design Interviews?

Learn from 25+ real interview problems from Netflix, Uber, Google, and Stripe. Created by a senior engineer who's taken 200+ system design interviews at FAANG companies.

Complete Solutions

Architecture diagrams & trade-off analysis

Real Interview Problems

From actual FAANG interviews

7-day money-back guarantee • Lifetime access • New problems added quarterly

"Design a URL shortener like bit.ly."

This is the most common system design interview question. It appears in interviews at Google, Amazon, Meta, and nearly every other tech company.

Why? Because it's deceptively simple. Anyone can describe the basic functionality. But a strong candidate reveals layers of depth: handling billions of URLs, sub-millisecond redirects, analytics at scale, and graceful failure handling.

This guide walks through exactly how to answer this question, from the clarifying questions that impress interviewers to the deep dives that separate strong from average candidates.


What the Interviewer Is Evaluating

Before diving in, understand what they're looking for:

  1. Requirement gathering: Do you ask questions, or dive in blindly?
  2. Scale estimation: Can you do back-of-envelope math?
  3. Database choices: Do you understand when to use SQL vs. NoSQL?
  4. API design: Can you design clean, RESTful endpoints?
  5. Algorithm knowledge: Do you know hashing, encoding, collision handling?
  6. Trade-off discussion: Can you articulate why you made choices?

The question is simple enough that weak candidates can sketch something. The depth you demonstrate determines whether you pass.


Step 1: Clarify Requirements

Never start designing immediately. Ask questions that show you think before coding.

Functional Requirements

Questions to ask:

"Let me clarify the functional requirements. What features should the URL shortener support?"

Typical features:

  • Create short URL: Given a long URL, generate a short one
  • Redirect: When someone visits the short URL, redirect to the original
  • Custom aliases: Can users specify their own short URL? (e.g., bit.ly/my-link)
  • Expiration: Should URLs expire after a certain time?
  • Analytics: Do we need click tracking, geographic data, referrer info?

For this design, assume:

  • Create and redirect (core features)
  • Optional custom aliases
  • Basic analytics (click count)
  • URLs don't expire by default (user can set expiration)

Non-Functional Requirements

Questions to ask:

"What scale are we designing for? How many URLs per day, and what's our read/write ratio?"

Key questions:

  • How many URLs shortened per day?
  • How many redirects per day?
  • What's the acceptable latency for redirects?
  • How long should URLs persist?
  • What's our availability target?

Typical answers:

  • 100 million URLs created per day
  • 10 billion redirects per day (100:1 read/write ratio)
  • Redirect latency < 100ms
  • URLs stored for 5 years by default
  • 99.9% availability

Step 2: Estimate Scale

Do the math out loud. Interviewers want to see your reasoning.

Storage Estimation

URLs per day: 100 million
URLs per year: 100M × 365 = 36.5 billion
URLs over 5 years: 36.5B × 5 = 182.5 billion URLs

Average URL length: ~100 characters = 100 bytes
Short URL: 7 characters = 7 bytes
Metadata (created_at, user_id, etc.): ~50 bytes

Total per URL: ~157 bytes
Round up to: 200 bytes (for indexes, overhead)

Storage for 5 years: 182.5B × 200 bytes = 36.5 TB

36.5 TB is manageable. A single machine could store this, but we'll distribute for availability.

Throughput Estimation

Writes (URL creation):
100M per day = 100M / 86,400 seconds ≈ 1,160 URLs/second

Reads (redirects):
10B per day = 10B / 86,400 seconds ≈ 115,000 requests/second

115,000 QPS for reads is significant. We'll need caching.

Short URL Length

How long should the short code be?

Using base62 (a-z, A-Z, 0-9): 62 characters

6 characters: 62^6 = 56.8 billion combinations
7 characters: 62^7 = 3.5 trillion combinations

With 182.5 billion URLs over 5 years, 6 characters (56.8 billion) isn't enough, but 7 characters gives us plenty of headroom.

Conclusion: Use 7-character codes with base62 encoding.
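The capacity math above is easy to sanity-check with a throwaway Python calculation (not part of the system itself):

```python
# Sanity-check the short-code capacity math from the estimates above.
ALPHABET_SIZE = 62  # a-z, A-Z, 0-9

for length in (6, 7):
    combos = ALPHABET_SIZE ** length
    print(f"{length} chars: {combos:,} combinations")

# 100M URLs/day over 5 years:
urls_needed = 100_000_000 * 365 * 5
print(f"Need: {urls_needed:,}")
print(f"6 chars enough? {62**6 >= urls_needed}")
print(f"7 chars enough? {62**7 >= urls_needed}")
```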


Step 3: High-Level Design

Draw the architecture:

                                    ┌─────────────┐
                                    │   CDN/DNS   │
                                    └──────┬──────┘
                                           │
┌──────────┐     ┌──────────────┐    ┌─────▼─────┐
│  Client  │────▶│ Load Balancer│───▶│API Servers│
└──────────┘     └──────────────┘    └─────┬─────┘
                                           │
                      ┌────────────────────┼────────────────────┐
                      │                    │                    │
                ┌─────▼─────┐        ┌─────▼─────┐        ┌─────▼─────┐
                │   Cache   │        │  Database │        │ Analytics │
                 │  (Redis)  │        │(Postgres) │        │  (Kafka)  │
                └───────────┘        └───────────┘        └───────────┘

Components

1. Load Balancer

  • Distributes traffic across API servers
  • Health checks, SSL termination
  • Options: AWS ALB, Nginx, HAProxy

2. API Servers (Stateless)

  • Handle URL creation and redirect logic
  • Horizontally scalable
  • No local state, all state in database/cache

3. Cache (Redis)

  • Store frequently accessed short-to-long URL mappings
  • 115K QPS requires caching, can't hit database for every redirect
  • Cache hit rate target: 80%+

4. Database

  • Persistent storage for all URL mappings
  • PostgreSQL for ACID guarantees, or Cassandra for higher write throughput

5. Analytics Pipeline

  • Async processing for click tracking
  • Kafka for event streaming
  • Batch processing for analytics dashboards

Step 4: API Design

Define the endpoints:

Create Short URL

POST /api/v1/urls
Content-Type: application/json

Request:
{
  "long_url": "https://example.com/very/long/path?with=params",
  "custom_alias": "my-link",      // optional
  "expires_at": "2026-12-31"      // optional
}

Response (201 Created):
{
  "short_url": "https://short.url/abc1234",
  "long_url": "https://example.com/very/long/path?with=params",
  "created_at": "2026-01-05T10:30:00Z",
  "expires_at": "2026-12-31T00:00:00Z"
}

Redirect

GET /{short_code}

Response: 301 or 302 redirect to long_url (trade-off below)

Headers:
Location: https://example.com/very/long/path?with=params

Why 301 vs 302?

| Code | Type | Browser Behavior | Use Case |
|------|------|------------------|----------|
| 301 | Permanent | Browser caches, fewer server hits | When URL mapping never changes |
| 302 | Temporary | Browser doesn't cache, always hits server | When you need analytics or URL might change |

For analytics, use 302 so every click hits your server.
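As a concrete illustration, the redirect handler can be sketched framework-free; `URL_STORE` is a stand-in for the cache/database lookup described earlier:

```python
# Framework-free sketch of the redirect endpoint. URL_STORE stands in
# for the cache/database lookup; a real handler would use the
# cache-aside pattern described later.
URL_STORE = {"abc1234": "https://example.com/very/long/path?with=params"}

def handle_redirect(short_code):
    """Return (status, headers) the way an HTTP layer would."""
    long_url = URL_STORE.get(short_code)
    if long_url is None:
        return 404, {}
    # 302 (not 301) so every click reaches the server for analytics.
    return 302, {"Location": long_url}

print(handle_redirect("abc1234"))  # 302 with a Location header
print(handle_redirect("missing"))  # 404
```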

Get URL Stats

GET /api/v1/urls/{short_code}/stats

Response:
{
  "short_url": "https://short.url/abc1234",
  "long_url": "https://example.com/...",
  "click_count": 15234,
  "created_at": "2026-01-05T10:30:00Z",
  "clicks_by_country": {
    "US": 8000,
    "UK": 3000,
    ...
  }
}

Step 5: Database Schema

Option A: Relational (PostgreSQL)

CREATE TABLE urls (
  id BIGSERIAL PRIMARY KEY,
  short_code VARCHAR(10) UNIQUE NOT NULL,
  long_url TEXT NOT NULL,
  user_id BIGINT,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  expires_at TIMESTAMP,
  click_count BIGINT DEFAULT 0
);

CREATE INDEX idx_short_code ON urls(short_code);
CREATE INDEX idx_user_id ON urls(user_id);
CREATE INDEX idx_expires_at ON urls(expires_at) WHERE expires_at IS NOT NULL;

Pros: ACID transactions, flexible queries, mature tooling
Cons: Harder to scale horizontally

Option B: NoSQL (Cassandra/DynamoDB)

Table: urls
Partition Key: short_code
Columns: long_url, user_id, created_at, expires_at, click_count

Pros: Horizontal scaling, high write throughput
Cons: Limited query flexibility, eventual consistency

Which to Choose?

For a URL shortener, either works. The data model is simple (key-value lookup).

Choose PostgreSQL if:

  • You need complex queries (analytics by user, date ranges)
  • Team is familiar with SQL
  • Scale is moderate (< 100K writes/second)

Choose Cassandra/DynamoDB if:

  • Write throughput is critical
  • You need linear horizontal scaling
  • Simple access patterns (lookup by short_code)

For this design, I'll use PostgreSQL with read replicas for simplicity, with the option to migrate to Cassandra if write throughput becomes a bottleneck.


Step 6: Short Code Generation

This is where candidates differentiate themselves. There are three main approaches:

Approach 1: Hash the Long URL

import hashlib
import base64

def generate_short_code(long_url):
    # MD5 is fine here: we need a stable spread of bits, not crypto security
    hash_bytes = hashlib.md5(long_url.encode()).digest()
    # Note: URL-safe base64 also uses '-' and '_', so codes aren't pure base62
    base64_str = base64.urlsafe_b64encode(hash_bytes).decode()
    return base64_str[:7]  # take the first 7 characters

Pros:

  • Same URL always produces same short code (deduplication)
  • No coordination needed between servers

Cons:

  • Collisions possible (different URLs, same hash prefix)
  • Need collision handling logic

Collision handling:

def create_short_url(long_url):
    short_code = generate_short_code(long_url)

    for attempt in range(5):
        # In production, rely on a UNIQUE constraint and catch the
        # violation: check-then-insert has a race between servers
        if not exists_in_db(short_code):
            save_to_db(short_code, long_url)
            return short_code
        # Collision: salt the URL with the attempt number and rehash
        short_code = generate_short_code(long_url + str(attempt))

    raise Exception("Could not generate unique short code")

Approach 2: Counter-Based (Auto-Increment)

def generate_short_code(counter_value):
    # Convert counter to base62
    characters = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
    result = []

    while counter_value > 0:
        result.append(characters[counter_value % 62])
        counter_value //= 62

    return ''.join(reversed(result)).zfill(7)

# Counter: 1 → "0000001"
# Counter: 1000000 → "0004C92"

Pros:

  • No collisions (counter is unique)
  • Simple to implement

Cons:

  • Predictable URLs (security concern)
  • Single counter = bottleneck
  • Need distributed counter for horizontal scaling
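One common way to distribute the counter is range allocation: each API server claims a block of values up front, so servers coordinate once per block instead of once per URL. A sketch, where `claim_next_range` stands in for an atomic fetch-and-add against a coordination store (a database row, ZooKeeper, etc.):

```python
import threading

class RangeCounter:
    """Each server claims a block of counter values, so coordination
    happens once per block rather than once per URL created."""

    def __init__(self, claim_next_range, block_size=1000):
        self._claim = claim_next_range
        self._block_size = block_size
        self._next = 0
        self._limit = 0  # exhausted; forces a claim on first use
        self._lock = threading.Lock()

    def next_value(self):
        with self._lock:
            if self._next >= self._limit:
                start = self._claim(self._block_size)
                self._next, self._limit = start, start + self._block_size
            value = self._next
            self._next += 1
            return value

# Simulated coordination store: a single atomic allocator.
_allocated = 0
def claim_next_range(size):
    global _allocated
    start = _allocated
    _allocated += size
    return start

counter = RangeCounter(claim_next_range, block_size=3)
print([counter.next_value() for _ in range(5)])  # [0, 1, 2, 3, 4]
```

If a server dies mid-block, its unclaimed values are simply skipped, which is harmless since codes only need to be unique, not dense.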

Approach 3: Pre-Generated Keys (Key Generation Service)

┌─────────────────┐
│ Key Generation  │
│    Service      │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Key Database   │
│ (unused keys)   │
└────────┬────────┘
         │
    ┌────┴────┐
    ▼         ▼
┌──────┐  ┌──────┐
│API 1 │  │API 2 │
│(keys)│  │(keys)│
└──────┘  └──────┘

How it works:

  1. Background service pre-generates millions of unique keys
  2. Stores them in a database table with "used" flag
  3. API servers fetch batches of keys (e.g., 1000 at a time)
  4. When a key is used, mark it as used

class KeyService:
    def __init__(self):
        self.local_keys = []

    def get_key(self):
        if not self.local_keys:
            # Fetch a batch from the key database, marking those keys
            # as claimed in the same transaction to avoid double-handout
            self.local_keys = fetch_unused_keys(batch_size=1000)
        return self.local_keys.pop()

Pros:

  • No collisions
  • No coordination during URL creation (keys pre-assigned)
  • Not predictable (keys can be shuffled)

Cons:

  • More complex infrastructure
  • Keys in failed requests are "wasted"

Recommendation

For interviews, explain Approach 3 (Pre-Generated Keys). It shows you understand:

  • How to avoid coordination bottlenecks
  • How to handle distributed systems
  • Trade-offs between complexity and performance

Step 7: Caching Strategy

With 115K redirects/second, caching is essential.

What to Cache

Key: short_code
Value: long_url

Example:
"abc1234" → "https://example.com/long/path"

Cache Eviction

Use LRU (Least Recently Used) eviction:

  • Popular URLs stay in cache
  • Rarely accessed URLs get evicted

Cache Size Calculation

Assume 20% of URLs get 80% of traffic (Pareto principle)
Hot URLs: 182.5B × 20% = 36.5B URLs

But we only need to cache recently accessed ones.
If we cache URLs accessed in the last day:
100M URLs × 200 bytes = 20 GB

20 GB easily fits in memory. Add buffer → 64 GB cache.
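Under this sizing, the Redis side reduces to two settings. A redis.conf sketch (the exact memory value depends on your deployment):

```
# Cap memory at the sizing estimate above; evict least-recently-used keys
maxmemory 64gb
maxmemory-policy allkeys-lru
```

`allkeys-lru` lets Redis evict any key under memory pressure, which is safe here because every entry can be rebuilt from the database.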

Cache-Aside Pattern

def redirect(short_code):
    # 1. Check cache
    long_url = cache.get(short_code)

    if long_url:
        return redirect_to(long_url)

    # 2. Cache miss: check database
    long_url = db.get(short_code)

    if not long_url:
        return 404

    # 3. Update cache
    cache.set(short_code, long_url, ttl=3600)

    return redirect_to(long_url)

Handling Hot Keys

What if one URL goes viral and gets 1 million requests/second?

Problem: Single cache key becomes bottleneck.

Solutions:

  1. Replicate hot keys across cache nodes

    • Identify hot keys (>10K requests/second)
    • Replicate to multiple cache servers
    • Client randomly selects which replica to query
  2. Local caching on API servers

    • In-memory cache (e.g., Guava Cache) on each API server
    • Very short TTL (30 seconds)
    • Reduces load on Redis
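The second mitigation, a short-TTL in-process cache in front of Redis, can be sketched in a few lines (TTL and size limits are illustrative; a production version would use a proper LRU):

```python
import time

class LocalTTLCache:
    """Tiny in-process cache in front of Redis. A short TTL keeps
    hot-key entries fresh while absorbing most of the read load."""

    def __init__(self, ttl_seconds=30, max_entries=10_000):
        self._ttl = ttl_seconds
        self._max = max_entries
        self._data = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]
            return None
        return value

    def set(self, key, value):
        if len(self._data) >= self._max:
            self._data.clear()  # crude reset; real code would evict LRU
        self._data[key] = (value, time.monotonic() + self._ttl)

cache = LocalTTLCache(ttl_seconds=30)
cache.set("abc1234", "https://example.com/long/path")
print(cache.get("abc1234"))
```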

Step 8: Analytics Deep Dive

If the interviewer asks about analytics, here's how to handle it:

Click Tracking Architecture

┌────────┐     ┌────────────┐     ┌─────────┐     ┌────────────┐
│Redirect│────▶│   Kafka    │────▶│ Spark/  │────▶│ Analytics  │
│  API   │     │  (Events)  │     │ Flink   │     │   DB       │
└────────┘     └────────────┘     └─────────┘     └────────────┘

Event Schema

{
  "short_code": "abc1234",
  "timestamp": "2026-01-05T10:30:00Z",
  "ip_address": "192.168.1.1",
  "user_agent": "Mozilla/5.0...",
  "referrer": "https://twitter.com/...",
  "country": "US",
  "city": "San Francisco"
}

Why Async?

Redirect latency must be < 100ms. We can't wait for analytics writes.

Solution: Fire-and-forget to Kafka, process asynchronously.

def redirect(short_code):
    long_url = get_long_url(short_code)

    # Async: don't block the redirect
    kafka.send_async("clicks", {
        "short_code": short_code,
        "timestamp": now(),
        "ip": request.ip,
        ...
    })

    return redirect_to(long_url)

Step 9: Handle Edge Cases

Strong candidates proactively address edge cases:

1. What if the database goes down?

Answer: Read replicas + cache keeps redirects working. Writes fail gracefully with queuing.

def create_short_url(long_url):
    try:
        return save_to_primary_db(long_url)
    except DatabaseUnavailable:
        # Queue for retry
        queue.send("pending_urls", long_url)
        return "URL creation delayed, check back soon"

2. What about duplicate long URLs?

Options:

  1. Allow duplicates: Same URL can have multiple short codes
  2. Deduplicate: Check if URL exists, return existing short code

Trade-off: Deduplication saves storage but requires extra lookup on every create.

3. How do you handle expired URLs?

# Background job runs every hour
def cleanup_expired_urls():
    expired = db.query(
        "SELECT short_code FROM urls WHERE expires_at < NOW()"
    )

    for url in expired:
        db.delete(url.short_code)
        cache.delete(url.short_code)

4. What about security/abuse?

  • Rate limiting: Max 100 URLs per user per hour
  • URL validation: Check for malware, phishing
  • Blacklisting: Block known malicious domains
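The rate limit above ("max 100 URLs per user per hour") can be sketched as a sliding-window log; in production this state would live in Redis rather than in-process, and the limit/window values are the assumptions stated above:

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most `limit` URL creations per user per `window_seconds`.
    Sliding-window log: evict timestamps older than the window,
    then check how many remain."""

    def __init__(self, limit=100, window_seconds=3600):
        self._limit = limit
        self._window = window_seconds
        self._events = defaultdict(deque)  # user_id -> request timestamps

    def allow(self, user_id):
        now = time.monotonic()
        events = self._events[user_id]
        while events and events[0] <= now - self._window:
            events.popleft()
        if len(events) >= self._limit:
            return False
        events.append(now)
        return True

limiter = SlidingWindowLimiter(limit=2, window_seconds=3600)
print([limiter.allow("u1") for _ in range(3)])  # [True, True, False]
```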

Common Follow-Up Questions

"How would you scale this to 10x traffic?"

  1. Add more API servers (stateless, easy to scale)
  2. Increase cache cluster size
  3. Add database read replicas
  4. Consider sharding the database by a hash of short_code
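Shard routing can be a pure function of the short code, so any API server can locate the right shard without a lookup. A sketch (the shard count is an assumption for illustration):

```python
import hashlib

def shard_for(short_code, num_shards=16):
    """Route a short code to a database shard. Hashing the code
    (rather than using its raw prefix) spreads sequentially
    generated codes evenly across shards."""
    digest = hashlib.md5(short_code.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_shards

print(shard_for("abc1234"))  # stable shard id in [0, 16)
```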

"How would you implement custom aliases?"

def create_with_alias(long_url, alias):
    if not is_valid_alias(alias):
        return error("Invalid alias format")

    if exists_in_db(alias):
        return error("Alias already taken")

    save_to_db(alias, long_url)
    return success(alias)

"301 vs 302 redirect?"

  • 301 (Permanent): Browser caches, less server load, worse analytics
  • 302 (Temporary): Every click hits server, better analytics, more load

"How do you ensure high availability?"

  • Multiple data centers
  • Database replication (primary + replicas)
  • Cache replication
  • Load balancer failover
  • Health checks and auto-scaling

Summary: The Winning Answer

In a 45-minute interview, hit these points:

| Time | What to Cover |
|------|---------------|
| 0-5 min | Clarify requirements (features, scale, latency) |
| 5-10 min | Estimate scale (storage, QPS, short code length) |
| 10-25 min | High-level design (draw components, explain data flow) |
| 25-40 min | Deep dive (short code generation, caching, analytics) |
| 40-45 min | Trade-offs and edge cases |

The key differentiators:

  • Ask clarifying questions before designing
  • Do the math to justify your choices
  • Explain trade-offs for every decision
  • Proactively address edge cases
  • Communicate clearly throughout

Frequently Asked Questions

How is this different from a key-value store design?

A URL shortener is essentially a specialized key-value store. But the interview focuses on: short code generation algorithms, analytics requirements, and the extremely high read-to-write ratio.

Should I mention specific technologies?

Yes, but justify them. "I'd use Redis for caching because of its sub-millisecond latency and built-in LRU eviction" is better than just "I'd use Redis."

What if I'm asked about a feature I haven't considered?

That's fine. Think out loud: "I hadn't considered that. Let me think through how we'd approach it..." Interviewers want to see your problem-solving process.

How deep should I go on the database schema?

Show you understand indexing and query patterns. You don't need to design a normalized schema with every constraint; focus on the critical fields and indexes.
