
Design a URL Shortener: System Design Interview Answer

Complete guide to designing a URL shortener like bit.ly. Covers requirements, scale calculations, database design, and common follow-up questions.

12 min read · By SystemExperts
From the Interviewer's Side

Ready to Master System Design Interviews?

Learn from 25+ real interview problems from Netflix, Uber, Google, and Stripe. Created by a senior engineer who's taken 200+ system design interviews at FAANG companies.

Complete Solutions

Architecture diagrams & trade-off analysis

Real Interview Problems

From actual FAANG interviews

7-day money-back guarantee • Lifetime access • New problems added quarterly

"Design a URL shortener like bit.ly."

This is the most common system design interview question. It appears in interviews at Google, Amazon, Meta, and nearly every other tech company.

Why? Because it's deceptively simple. Anyone can describe the basic functionality. But a strong candidate reveals layers of depth: handling billions of URLs, sub-millisecond redirects, analytics at scale, and graceful failure handling.

This guide walks through exactly how to answer this question, from the clarifying questions that impress interviewers to the deep dives that separate strong from average candidates.


What the Interviewer Is Evaluating

Before diving in, understand what they're looking for:

  1. Requirement gathering: Do you ask questions, or dive in blindly?
  2. Scale estimation: Can you do back-of-envelope math?
  3. Database choices: Do you understand when to use SQL vs. NoSQL?
  4. API design: Can you design clean, RESTful endpoints?
  5. Algorithm knowledge: Do you know hashing, encoding, collision handling?
  6. Trade-off discussion: Can you articulate why you made choices?

The question is simple enough that weak candidates can sketch something. The depth you demonstrate determines whether you pass.


Step 1: Clarify Requirements

Never start designing immediately. Ask questions that show you think before coding.

Functional Requirements

Questions to ask:

"Let me clarify the functional requirements. What features should the URL shortener support?"

Typical features:

  • Create short URL: Given a long URL, generate a short one
  • Redirect: When someone visits the short URL, redirect to the original
  • Custom aliases: Can users specify their own short URL? (e.g., bit.ly/my-link)
  • Expiration: Should URLs expire after a certain time?
  • Analytics: Do we need click tracking, geographic data, referrer info?

For this design, assume:

  • Create and redirect (core features)
  • Optional custom aliases
  • Basic analytics (click count)
  • URLs don't expire by default (user can set expiration)

Non-Functional Requirements

Questions to ask:

"What scale are we designing for? How many URLs per day, and what's our read/write ratio?"

Key questions:

  • How many URLs shortened per day?
  • How many redirects per day?
  • What's the acceptable latency for redirects?
  • How long should URLs persist?
  • What's our availability target?

Typical answers:

  • 100 million URLs created per day
  • 10 billion redirects per day (100:1 read/write ratio)
  • Redirect latency < 100ms
  • URLs stored for 5 years by default
  • 99.9% availability

Step 2: Estimate Scale

Do the math out loud. Interviewers want to see your reasoning.

Storage Estimation

URLs per day: 100 million
URLs per year: 100M × 365 = 36.5 billion
URLs over 5 years: 36.5B × 5 = 182.5 billion URLs

Average URL length: ~100 characters = 100 bytes
Short URL: 7 characters = 7 bytes
Metadata (created_at, user_id, etc.): ~50 bytes

Total per URL: ~157 bytes
Round up to: 200 bytes (for indexes, overhead)

Storage for 5 years: 182.5B × 200 bytes = 36.5 TB

36.5 TB is manageable. A single machine could store this, but we'll distribute for availability.

Throughput Estimation

Writes (URL creation):
100M per day = 100M / 86,400 seconds ≈ 1,160 URLs/second

Reads (redirects):
10B per day = 10B / 86,400 seconds ≈ 115,000 requests/second

115,000 QPS for reads is significant. We'll need caching.

Short URL Length

How long should the short code be?

Using base62 (a-z, A-Z, 0-9): 62 characters

6 characters: 62^6 = 56.8 billion combinations
7 characters: 62^7 = 3.5 trillion combinations

With 182.5 billion URLs over 5 years, 6 characters (56.8 billion) isn't enough, but 7 characters gives us plenty of headroom.

Conclusion: Use 7-character codes with base62 encoding.
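The capacity math above is easy to sanity-check with a throwaway Python calculation (not part of the system itself):

```python
# Sanity-check the short-code capacity math from the estimates above.
ALPHABET_SIZE = 62  # a-z, A-Z, 0-9

for length in (6, 7):
    combos = ALPHABET_SIZE ** length
    print(f"{length} chars: {combos:,} combinations")

# 100M URLs/day over 5 years:
urls_needed = 100_000_000 * 365 * 5
print(f"Need: {urls_needed:,}")
print(f"6 chars enough? {62**6 >= urls_needed}")
print(f"7 chars enough? {62**7 >= urls_needed}")
```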


Step 3: High-Level Design

Draw the architecture:

                                    ┌─────────────┐
                                    │   CDN/DNS   │
                                    └──────┬──────┘
                                           │
┌──────────┐     ┌──────────────┐    ┌─────▼─────┐
│  Client  │────▶│ Load Balancer│───▶│API Servers│
└──────────┘     └──────────────┘    └─────┬─────┘
                                           │
                      ┌────────────────────┼────────────────────┐
                      │                    │                    │
                ┌─────▼─────┐        ┌─────▼─────┐        ┌─────▼─────┐
                │   Cache   │        │  Database │        │ Analytics │
                 │  (Redis)  │        │(Postgres) │        │  (Kafka)  │
                └───────────┘        └───────────┘        └───────────┘

Components

1. Load Balancer

  • Distributes traffic across API servers
  • Health checks, SSL termination
  • Options: AWS ALB, Nginx, HAProxy

2. API Servers (Stateless)

  • Handle URL creation and redirect logic
  • Horizontally scalable
  • No local state, all state in database/cache

3. Cache (Redis)

  • Store frequently accessed short-to-long URL mappings
  • 115K QPS requires caching, can't hit database for every redirect
  • Cache hit rate target: 80%+

4. Database

  • Persistent storage for all URL mappings
  • PostgreSQL for ACID guarantees, or Cassandra for higher write throughput

5. Analytics Pipeline

  • Async processing for click tracking
  • Kafka for event streaming
  • Batch processing for analytics dashboards

Step 4: API Design

Define the endpoints:

Create Short URL

POST /api/v1/urls
Content-Type: application/json

Request:
{
  "long_url": "https://example.com/very/long/path?with=params",
  "custom_alias": "my-link",      // optional
  "expires_at": "2026-12-31"      // optional
}

Response (201 Created):
{
  "short_url": "https://short.url/abc1234",
  "long_url": "https://example.com/very/long/path?with=params",
  "created_at": "2026-01-05T10:30:00Z",
  "expires_at": "2026-12-31T00:00:00Z"
}

Redirect

GET /{short_code}

Response: 301 or 302 redirect to long_url (trade-off below)

Headers:
Location: https://example.com/very/long/path?with=params

Why 301 vs 302?

| Code | Type | Browser Behavior | Use Case |
|------|------|------------------|----------|
| 301 | Permanent | Browser caches, fewer server hits | When URL mapping never changes |
| 302 | Temporary | Browser doesn't cache, always hits server | When you need analytics or URL might change |

For analytics, use 302 so every click hits your server.
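As a concrete illustration, the redirect handler can be sketched framework-free; `URL_STORE` is a stand-in for the cache/database lookup described earlier:

```python
# Framework-free sketch of the redirect endpoint. URL_STORE stands in
# for the cache/database lookup; a real handler would use the
# cache-aside pattern described later.
URL_STORE = {"abc1234": "https://example.com/very/long/path?with=params"}

def handle_redirect(short_code):
    """Return (status, headers) the way an HTTP layer would."""
    long_url = URL_STORE.get(short_code)
    if long_url is None:
        return 404, {}
    # 302 (not 301) so every click reaches the server for analytics.
    return 302, {"Location": long_url}

print(handle_redirect("abc1234"))  # 302 with a Location header
print(handle_redirect("missing"))  # 404
```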

Get URL Stats

GET /api/v1/urls/{short_code}/stats

Response:
{
  "short_url": "https://short.url/abc1234",
  "long_url": "https://example.com/...",
  "click_count": 15234,
  "created_at": "2026-01-05T10:30:00Z",
  "clicks_by_country": {
    "US": 8000,
    "UK": 3000,
    ...
  }
}

Step 5: Database Schema

Option A: Relational (PostgreSQL)

CREATE TABLE urls (
  id BIGSERIAL PRIMARY KEY,
  short_code VARCHAR(10) UNIQUE NOT NULL,
  long_url TEXT NOT NULL,
  user_id BIGINT,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  expires_at TIMESTAMP,
  click_count BIGINT DEFAULT 0
);

CREATE INDEX idx_short_code ON urls(short_code);
CREATE INDEX idx_user_id ON urls(user_id);
CREATE INDEX idx_expires_at ON urls(expires_at) WHERE expires_at IS NOT NULL;

Pros: ACID transactions, flexible queries, mature tooling
Cons: Harder to scale horizontally

Option B: NoSQL (Cassandra/DynamoDB)

Table: urls
Partition Key: short_code
Columns: long_url, user_id, created_at, expires_at, click_count

Pros: Horizontal scaling, high write throughput
Cons: Limited query flexibility, eventual consistency

Which to Choose?

For a URL shortener, either works. The data model is simple (key-value lookup).

Choose PostgreSQL if:

  • You need complex queries (analytics by user, date ranges)
  • Team is familiar with SQL
  • Scale is moderate (< 100K writes/second)

Choose Cassandra/DynamoDB if:

  • Write throughput is critical
  • You need linear horizontal scaling
  • Simple access patterns (lookup by short_code)

For this design, I'll use PostgreSQL with read replicas for simplicity, with the option to migrate to Cassandra if write throughput becomes a bottleneck.


Step 6: Short Code Generation

This is where candidates differentiate themselves. There are three main approaches:

Approach 1: Hash the Long URL

import hashlib
import base64

def generate_short_code(long_url):
    # MD5 is fine here: we need a stable spread of bits, not crypto security
    hash_bytes = hashlib.md5(long_url.encode()).digest()
    # Note: URL-safe base64 also uses '-' and '_', so codes aren't pure base62
    base64_str = base64.urlsafe_b64encode(hash_bytes).decode()
    return base64_str[:7]  # take the first 7 characters

Pros:

  • Same URL always produces same short code (deduplication)
  • No coordination needed between servers

Cons:

  • Collisions possible (different URLs, same hash prefix)
  • Need collision handling logic

Collision handling:

def create_short_url(long_url):
    short_code = generate_short_code(long_url)

    for attempt in range(5):
        # In production, rely on a UNIQUE constraint and catch the
        # violation: check-then-insert has a race between servers
        if not exists_in_db(short_code):
            save_to_db(short_code, long_url)
            return short_code
        # Collision: salt the URL with the attempt number and rehash
        short_code = generate_short_code(long_url + str(attempt))

    raise Exception("Could not generate unique short code")

Approach 2: Counter-Based (Auto-Increment)

def generate_short_code(counter_value):
    # Convert counter to base62
    characters = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
    result = []

    while counter_value > 0:
        result.append(characters[counter_value % 62])
        counter_value //= 62

    return ''.join(reversed(result)).zfill(7)

# Counter: 1 → "0000001"
# Counter: 1000000 → "0004C92"

Pros:

  • No collisions (counter is unique)
  • Simple to implement

Cons:

  • Predictable URLs (security concern)
  • Single counter = bottleneck
  • Need distributed counter for horizontal scaling
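One common way to distribute the counter is range allocation: each API server claims a block of values up front, so servers coordinate once per block instead of once per URL. A sketch, where `claim_next_range` stands in for an atomic fetch-and-add against a coordination store (a database row, ZooKeeper, etc.):

```python
import threading

class RangeCounter:
    """Each server claims a block of counter values, so coordination
    happens once per block rather than once per URL created."""

    def __init__(self, claim_next_range, block_size=1000):
        self._claim = claim_next_range
        self._block_size = block_size
        self._next = 0
        self._limit = 0  # exhausted; forces a claim on first use
        self._lock = threading.Lock()

    def next_value(self):
        with self._lock:
            if self._next >= self._limit:
                start = self._claim(self._block_size)
                self._next, self._limit = start, start + self._block_size
            value = self._next
            self._next += 1
            return value

# Simulated coordination store: a single atomic allocator.
_allocated = 0
def claim_next_range(size):
    global _allocated
    start = _allocated
    _allocated += size
    return start

counter = RangeCounter(claim_next_range, block_size=3)
print([counter.next_value() for _ in range(5)])  # [0, 1, 2, 3, 4]
```

If a server dies mid-block, its unclaimed values are simply skipped, which is harmless since codes only need to be unique, not dense.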

Approach 3: Pre-Generated Keys (Key Generation Service)

┌─────────────────┐
│ Key Generation  │
│    Service      │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Key Database   │
│ (unused keys)   │
└────────┬────────┘
         │
    ┌────┴────┐
    ▼         ▼
┌──────┐  ┌──────┐
│API 1 │  │API 2 │
│(keys)│  │(keys)│
└──────┘  └──────┘

How it works:

  1. Background service pre-generates millions of unique keys
  2. Stores them in a database table with "used" flag
  3. API servers fetch batches of keys (e.g., 1000 at a time)
  4. When a key is used, mark it as used

class KeyService:
    def __init__(self):
        self.local_keys = []

    def get_key(self):
        if not self.local_keys:
            # Fetch a batch from the key database, marking those keys
            # as claimed in the same transaction to avoid double-handout
            self.local_keys = fetch_unused_keys(batch_size=1000)
        return self.local_keys.pop()

Pros:

  • No collisions
  • No coordination during URL creation (keys pre-assigned)
  • Not predictable (keys can be shuffled)

Cons:

  • More complex infrastructure
  • Keys in failed requests are "wasted"

Recommendation

For interviews, explain Approach 3 (Pre-Generated Keys). It shows you understand:

  • How to avoid coordination bottlenecks
  • How to handle distributed systems
  • Trade-offs between complexity and performance

Step 7: Caching Strategy

With 115K redirects/second, caching is essential.

What to Cache

Key: short_code
Value: long_url

Example:
"abc1234" → "https://example.com/long/path"

Cache Eviction

Use LRU (Least Recently Used) eviction:

  • Popular URLs stay in cache
  • Rarely accessed URLs get evicted

Cache Size Calculation

Assume 20% of URLs get 80% of traffic (Pareto principle)
Hot URLs: 182.5B × 20% = 36.5B URLs

But we only need to cache recently accessed ones.
If we cache URLs accessed in the last day:
100M URLs × 200 bytes = 20 GB

20 GB easily fits in memory. Add buffer → 64 GB cache.
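Under this sizing, the Redis side reduces to two settings. A redis.conf sketch (the exact memory value depends on your deployment):

```
# Cap memory at the sizing estimate above; evict least-recently-used keys
maxmemory 64gb
maxmemory-policy allkeys-lru
```

`allkeys-lru` lets Redis evict any key under memory pressure, which is safe here because every entry can be rebuilt from the database.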

Cache-Aside Pattern

def redirect(short_code):
    # 1. Check cache
    long_url = cache.get(short_code)

    if long_url:
        return redirect_to(long_url)

    # 2. Cache miss: check database
    long_url = db.get(short_code)

    if not long_url:
        return 404

    # 3. Update cache
    cache.set(short_code, long_url, ttl=3600)

    return redirect_to(long_url)

Handling Hot Keys

What if one URL goes viral and gets 1 million requests/second?

Problem: Single cache key becomes bottleneck.

Solutions:

  1. Replicate hot keys across cache nodes

    • Identify hot keys (>10K requests/second)
    • Replicate to multiple cache servers
    • Client randomly selects which replica to query
  2. Local caching on API servers

    • In-memory cache (e.g., Guava Cache) on each API server
    • Very short TTL (30 seconds)
    • Reduces load on Redis
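The second mitigation, a short-TTL in-process cache in front of Redis, can be sketched in a few lines (TTL and size limits are illustrative; a production version would use a proper LRU):

```python
import time

class LocalTTLCache:
    """Tiny in-process cache in front of Redis. A short TTL keeps
    hot-key entries fresh while absorbing most of the read load."""

    def __init__(self, ttl_seconds=30, max_entries=10_000):
        self._ttl = ttl_seconds
        self._max = max_entries
        self._data = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]
            return None
        return value

    def set(self, key, value):
        if len(self._data) >= self._max:
            self._data.clear()  # crude reset; real code would evict LRU
        self._data[key] = (value, time.monotonic() + self._ttl)

cache = LocalTTLCache(ttl_seconds=30)
cache.set("abc1234", "https://example.com/long/path")
print(cache.get("abc1234"))
```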

Step 8: Analytics Deep Dive

If the interviewer asks about analytics, here's how to handle it:

Click Tracking Architecture

┌────────┐     ┌────────────┐     ┌─────────┐     ┌────────────┐
│Redirect│────▶│   Kafka    │────▶│ Spark/  │────▶│ Analytics  │
│  API   │     │  (Events)  │     │ Flink   │     │   DB       │
└────────┘     └────────────┘     └─────────┘     └────────────┘

Event Schema

{
  "short_code": "abc1234",
  "timestamp": "2026-01-05T10:30:00Z",
  "ip_address": "192.168.1.1",
  "user_agent": "Mozilla/5.0...",
  "referrer": "https://twitter.com/...",
  "country": "US",
  "city": "San Francisco"
}

Why Async?

Redirect latency must be < 100ms. We can't wait for analytics writes.

Solution: Fire-and-forget to Kafka, process asynchronously.

def redirect(short_code):
    long_url = get_long_url(short_code)

    # Async: don't block the redirect
    kafka.send_async("clicks", {
        "short_code": short_code,
        "timestamp": now(),
        "ip": request.ip,
        ...
    })

    return redirect_to(long_url)

Step 9: Handle Edge Cases

Strong candidates proactively address edge cases:

1. What if the database goes down?

Answer: Read replicas + cache keeps redirects working. Writes fail gracefully with queuing.

def create_short_url(long_url):
    try:
        return save_to_primary_db(long_url)
    except DatabaseUnavailable:
        # Queue for retry
        queue.send("pending_urls", long_url)
        return "URL creation delayed, check back soon"

2. What about duplicate long URLs?

Options:

  1. Allow duplicates: Same URL can have multiple short codes
  2. Deduplicate: Check if URL exists, return existing short code

Trade-off: Deduplication saves storage but requires extra lookup on every create.

3. How do you handle expired URLs?

# Background job runs every hour
def cleanup_expired_urls():
    expired = db.query(
        "SELECT short_code FROM urls WHERE expires_at < NOW()"
    )

    for url in expired:
        db.delete(url.short_code)
        cache.delete(url.short_code)

4. What about security/abuse?

  • Rate limiting: Max 100 URLs per user per hour
  • URL validation: Check for malware, phishing
  • Blacklisting: Block known malicious domains
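The rate limit above ("max 100 URLs per user per hour") can be sketched as a sliding-window log; in production this state would live in Redis rather than in-process, and the limit/window values are the assumptions stated above:

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most `limit` URL creations per user per `window_seconds`.
    Sliding-window log: evict timestamps older than the window,
    then check how many remain."""

    def __init__(self, limit=100, window_seconds=3600):
        self._limit = limit
        self._window = window_seconds
        self._events = defaultdict(deque)  # user_id -> request timestamps

    def allow(self, user_id):
        now = time.monotonic()
        events = self._events[user_id]
        while events and events[0] <= now - self._window:
            events.popleft()
        if len(events) >= self._limit:
            return False
        events.append(now)
        return True

limiter = SlidingWindowLimiter(limit=2, window_seconds=3600)
print([limiter.allow("u1") for _ in range(3)])  # [True, True, False]
```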

Common Follow-Up Questions

"How would you scale this to 10x traffic?"

  1. Add more API servers (stateless, easy to scale)
  2. Increase cache cluster size
  3. Add database read replicas
  4. Consider sharding the database by a hash of short_code
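Shard routing can be a pure function of the short code, so any API server can locate the right shard without a lookup. A sketch (the shard count is an assumption for illustration):

```python
import hashlib

def shard_for(short_code, num_shards=16):
    """Route a short code to a database shard. Hashing the code
    (rather than using its raw prefix) spreads sequentially
    generated codes evenly across shards."""
    digest = hashlib.md5(short_code.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_shards

print(shard_for("abc1234"))  # stable shard id in [0, 16)
```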

"How would you implement custom aliases?"

def create_with_alias(long_url, alias):
    if not is_valid_alias(alias):
        return error("Invalid alias format")

    if exists_in_db(alias):
        return error("Alias already taken")

    save_to_db(alias, long_url)
    return success(alias)

"301 vs 302 redirect?"

  • 301 (Permanent): Browser caches, less server load, worse analytics
  • 302 (Temporary): Every click hits server, better analytics, more load

"How do you ensure high availability?"

  • Multiple data centers
  • Database replication (primary + replicas)
  • Cache replication
  • Load balancer failover
  • Health checks and auto-scaling

Summary: The Winning Answer

In a 45-minute interview, hit these points:

| Time | What to Cover |
|------|---------------|
| 0-5 min | Clarify requirements (features, scale, latency) |
| 5-10 min | Estimate scale (storage, QPS, short code length) |
| 10-25 min | High-level design (draw components, explain data flow) |
| 25-40 min | Deep dive (short code generation, caching, analytics) |
| 40-45 min | Trade-offs and edge cases |

The key differentiators:

  • Ask clarifying questions before designing
  • Do the math to justify your choices
  • Explain trade-offs for every decision
  • Proactively address edge cases
  • Communicate clearly throughout

Frequently Asked Questions

How is this different from a key-value store design?

A URL shortener is essentially a specialized key-value store. But the interview focuses on: short code generation algorithms, analytics requirements, and the extremely high read-to-write ratio.

Should I mention specific technologies?

Yes, but justify them. "I'd use Redis for caching because of its sub-millisecond latency and built-in LRU eviction" is better than just "I'd use Redis."

What if I'm asked about a feature I haven't considered?

That's fine. Think out loud: "I hadn't considered that. Let me think through how we'd approach it..." Interviewers want to see your problem-solving process.

How deep should I go on the database schema?

Show you understand indexing and query patterns. You don't need to design a normalized schema with every constraint; focus on the critical fields and indexes.
