Design a URL Shortener: System Design Interview Answer
Complete guide to designing a URL shortener like bit.ly. Covers requirements, scale calculations, database design, and common follow-up questions.
Ready to Master System Design Interviews?
Learn from 25+ real interview problems from Netflix, Uber, Google, and Stripe. Created by a senior engineer who's taken 200+ system design interviews at FAANG companies.
Complete Solutions
Architecture diagrams & trade-off analysis
Real Interview Problems
From actual FAANG interviews
7-day money-back guarantee • Lifetime access • New problems added quarterly
"Design a URL shortener like bit.ly."
This is the most common system design interview question. It appears in interviews at Google, Amazon, Meta, and nearly every other tech company.
Why? Because it's deceptively simple. Anyone can describe the basic functionality. But a strong candidate reveals layers of depth: handling billions of URLs, sub-millisecond redirects, analytics at scale, and graceful failure handling.
This guide walks through exactly how to answer this question, from the clarifying questions that impress interviewers to the deep dives that separate strong from average candidates.
What the Interviewer Is Evaluating
Before diving in, understand what they're looking for:
- Requirement gathering: Do you ask questions, or dive in blindly?
- Scale estimation: Can you do back-of-envelope math?
- Database choices: Do you understand when to use SQL vs. NoSQL?
- API design: Can you design clean, RESTful endpoints?
- Algorithm knowledge: Do you know hashing, encoding, and collision handling?
- Trade-off discussion: Can you articulate why you made your choices?
The question is simple enough that weak candidates can sketch something. The depth you demonstrate determines whether you pass.
Step 1: Clarify Requirements
Never start designing immediately. Ask questions that show you think before coding.
Functional Requirements
Questions to ask:
"Let me clarify the functional requirements. What features should the URL shortener support?"
Typical features:
- Create short URL: Given a long URL, generate a short one
- Redirect: When someone visits the short URL, redirect to the original
- Custom aliases: Can users specify their own short URL (e.g., bit.ly/my-link)?
- Expiration: Should URLs expire after a certain time?
- Analytics: Do we need click tracking, geographic data, referrer info?
For this design, assume:
- Create and redirect (core features)
- Optional custom aliases
- Basic analytics (click count)
- URLs don't expire by default (user can set expiration)
Non-Functional Requirements
Questions to ask:
"What scale are we designing for? How many URLs per day, and what's our read/write ratio?"
Key questions:
- How many URLs shortened per day?
- How many redirects per day?
- What's the acceptable latency for redirects?
- How long should URLs persist?
- What's our availability target?
Typical answers:
- 100 million URLs created per day
- 10 billion redirects per day (100:1 read/write ratio)
- Redirect latency < 100ms
- URLs stored for 5 years by default
- 99.9% availability
Step 2: Estimate Scale
Do the math out loud. Interviewers want to see your reasoning.
Storage Estimation
URLs per day: 100 million
URLs per year: 100M × 365 = 36.5 billion
URLs over 5 years: 36.5B × 5 = 182.5 billion URLs
Average URL length: ~100 characters = 100 bytes
Short URL: 7 characters = 7 bytes
Metadata (created_at, user_id, etc.): ~50 bytes
Total per URL: ~157 bytes
Round up to: 200 bytes (for indexes, overhead)
Storage for 5 years: 182.5B × 200 bytes = 36.5 TB
36 TB is manageable. A single machine could store this, but we'll distribute for availability.
Throughput Estimation
Writes (URL creation):
100M per day = 100M / 86,400 seconds ≈ 1,160 URLs/second
Reads (redirects):
10B per day = 10B / 86,400 seconds ≈ 115,000 requests/second
115,000 QPS for reads is significant. We'll need caching.
Short URL Length
How long should the short code be?
Using base62 (a-z, A-Z, 0-9): 62 characters
6 characters: 62^6 = 56.8 billion combinations
7 characters: 62^7 = 3.5 trillion combinations
With 182.5 billion URLs over 5 years, 7 characters gives us plenty of room.
Conclusion: Use 7-character codes with base62 encoding.
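These back-of-envelope figures are easy to sanity-check in a few lines of Python:

```python
SECONDS_PER_DAY = 86_400

urls_per_day = 100_000_000
redirects_per_day = 10_000_000_000
bytes_per_record = 200          # rounded up for indexes and overhead
years = 5

total_urls = urls_per_day * 365 * years           # 182.5 billion
storage_bytes = total_urls * bytes_per_record     # ~36.5 TB
write_qps = urls_per_day / SECONDS_PER_DAY        # ~1,160/s
read_qps = redirects_per_day / SECONDS_PER_DAY    # ~115,000/s

print(f"URLs over {years} years: {total_urls / 1e9:.1f}B")
print(f"Storage: {storage_bytes / 1e12:.1f} TB")
print(f"Write QPS: {write_qps:,.0f}, Read QPS: {read_qps:,.0f}")

# Short-code capacity in base62
print(f"62^6 = {62**6 / 1e9:.1f}B, 62^7 = {62**7 / 1e12:.2f}T")
```

Doing this arithmetic out loud in the interview, and rounding sensibly, is the point; the exact decimals don't matter.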
Step 3: High-Level Design
Draw the architecture:
┌─────────────┐
│ CDN/DNS │
└──────┬──────┘
│
┌──────────┐ ┌──────────────┐ ┌─────▼─────┐
│ Client │────▶│ Load Balancer│───▶│API Servers│
└──────────┘ └──────────────┘ └─────┬─────┘
│
┌────────────────────┼────────────────────┐
│ │ │
┌─────▼─────┐ ┌─────▼─────┐ ┌─────▼─────┐
│ Cache │ │ Database │ │ Analytics │
│ (Redis) │ │(PostgreSQL)│ │ (Kafka) │
└───────────┘ └───────────┘ └───────────┘
Components
1. Load Balancer
- Distributes traffic across API servers
- Health checks, SSL termination
- Options: AWS ALB, Nginx, HAProxy
2. API Servers (Stateless)
- Handle URL creation and redirect logic
- Horizontally scalable
- No local state, all state in database/cache
3. Cache (Redis)
- Store frequently accessed short-to-long URL mappings
- 115K QPS requires caching, can't hit database for every redirect
- Cache hit rate target: 80%+
4. Database
- Persistent storage for all URL mappings
- PostgreSQL for ACID guarantees, or Cassandra for higher write throughput
5. Analytics Pipeline
- Async processing for click tracking
- Kafka for event streaming
- Batch processing for analytics dashboards
Step 4: API Design
Define the endpoints:
Create Short URL
POST /api/v1/urls
Content-Type: application/json
Request:
{
  "long_url": "https://example.com/very/long/path?with=params",
  "custom_alias": "my-link",   // optional
  "expires_at": "2026-12-31"   // optional
}
Response (201 Created):
{
  "short_url": "https://short.url/abc1234",
  "long_url": "https://example.com/very/long/path?with=params",
  "created_at": "2026-01-05T10:30:00Z",
  "expires_at": "2026-12-31T00:00:00Z"
}
Redirect
GET /{short_code}
Response: 301 Redirect to long_url
Headers:
Location: https://example.com/very/long/path?with=params
Why 301 vs 302?
| Code | Type | Browser Behavior | Use Case |
|---|---|---|---|
| 301 | Permanent | Browser caches, fewer server hits | When URL mapping never changes |
| 302 | Temporary | Browser doesn't cache, always hits server | When you need analytics or URL might change |
For analytics, use 302 so every click hits your server.
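As an illustrative sketch (not tied to any web framework), the choice can be captured in a tiny helper:

```python
def redirect_response(long_url: str, track_clicks: bool = True):
    """Build (status, headers) for a redirect.

    302 keeps every click on our servers (analytics); 301 lets
    browsers cache the mapping and skip us on repeat visits.
    """
    status = 302 if track_clicks else 301
    return status, {"Location": long_url}

status, headers = redirect_response("https://example.com/page")
# status == 302; headers["Location"] == "https://example.com/page"
```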
Get URL Stats
GET /api/v1/urls/{short_code}/stats
Response:
{
  "short_url": "https://short.url/abc1234",
  "long_url": "https://example.com/...",
  "click_count": 15234,
  "created_at": "2026-01-05T10:30:00Z",
  "clicks_by_country": {
    "US": 8000,
    "UK": 3000,
    ...
  }
}
Step 5: Database Schema
Option A: Relational (PostgreSQL)
CREATE TABLE urls (
    id BIGSERIAL PRIMARY KEY,
    short_code VARCHAR(10) UNIQUE NOT NULL,
    long_url TEXT NOT NULL,
    user_id BIGINT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    expires_at TIMESTAMP,
    click_count BIGINT DEFAULT 0
);

CREATE INDEX idx_short_code ON urls(short_code);
CREATE INDEX idx_user_id ON urls(user_id);
CREATE INDEX idx_expires_at ON urls(expires_at) WHERE expires_at IS NOT NULL;
Pros: ACID transactions, flexible queries, mature tooling
Cons: Harder to scale horizontally
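A quick way to see the hot-path query in action is SQLite from Python's standard library. This is only a sketch (the syntax is adapted, since SQLite has no BIGSERIAL), but the access pattern is the same single indexed lookup:

```python
import sqlite3

# Sketch of the schema above, adapted for SQLite: an INTEGER
# PRIMARY KEY autoincrements, and UNIQUE gives us the index.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE urls (
        id INTEGER PRIMARY KEY,
        short_code TEXT UNIQUE NOT NULL,
        long_url TEXT NOT NULL,
        click_count INTEGER DEFAULT 0
    )
""")
conn.execute("INSERT INTO urls (short_code, long_url) VALUES (?, ?)",
             ("abc1234", "https://example.com/long/path"))

# The redirect hot path: one indexed lookup by short_code.
row = conn.execute(
    "SELECT long_url FROM urls WHERE short_code = ?", ("abc1234",)
).fetchone()
print(row[0])  # https://example.com/long/path
```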
Option B: NoSQL (Cassandra/DynamoDB)
Table: urls
Partition Key: short_code
Columns: long_url, user_id, created_at, expires_at, click_count
Pros: Horizontal scaling, high write throughput
Cons: Limited query flexibility, eventual consistency
Which to Choose?
For a URL shortener, either works. The data model is simple (key-value lookup).
Choose PostgreSQL if:
- You need complex queries (analytics by user, date ranges)
- Team is familiar with SQL
- Scale is moderate (< 100K writes/second)
Choose Cassandra/DynamoDB if:
- Write throughput is critical
- You need linear horizontal scaling
- Simple access patterns (lookup by short_code)
For this design, I'll use PostgreSQL with read replicas for simplicity, with the option to migrate to Cassandra if write throughput becomes a bottleneck.
Step 6: Short Code Generation
This is where candidates differentiate themselves. There are three main approaches:
Approach 1: Hash the Long URL
import hashlib
import base64

def generate_short_code(long_url):
    hash_bytes = hashlib.md5(long_url.encode()).digest()
    # Note: urlsafe base64 also uses '-' and '_', so this is
    # strictly base64url, not pure base62.
    base64_str = base64.urlsafe_b64encode(hash_bytes).decode()
    return base64_str[:7]  # Take the first 7 characters
Pros:
- Same URL always produces same short code (deduplication)
- No coordination needed between servers
Cons:
- Collisions possible (different URLs, same hash prefix)
- Need collision handling logic
Collision handling:
def create_short_url(long_url):
    short_code = generate_short_code(long_url)
    for attempt in range(5):
        if not exists_in_db(short_code):
            save_to_db(short_code, long_url)
            return short_code
        # Collision: append the attempt number and rehash
        short_code = generate_short_code(long_url + str(attempt))
    raise Exception("Could not generate unique short code")

# Note: this check-then-save has a race under concurrency; in
# production, rely on the database's unique constraint to catch it.
Approach 2: Counter-Based (Auto-Increment)
def generate_short_code(counter_value):
    # Convert the counter to base62
    characters = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
    result = []
    while counter_value > 0:
        result.append(characters[counter_value % 62])
        counter_value //= 62
    return ''.join(reversed(result)).zfill(7)

# Counter: 1         → "0000001"
# Counter: 1,000,000 → "0004C92"
Pros:
- No collisions (counter is unique)
- Simple to implement
Cons:
- Predictable URLs (security concern)
- Single counter = bottleneck
- Need distributed counter for horizontal scaling
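One common way to remove the single-counter bottleneck is range allocation: a coordinator hands each API server a block of counter values, so servers coordinate only once per block and never collide. A minimal in-process sketch (in production the allocator might be ZooKeeper or a database sequence; the class names here are illustrative):

```python
class RangeAllocator:
    """Hands out disjoint counter ranges.

    Stand-in for a central coordinator (ZooKeeper, DB sequence).
    """
    def __init__(self, range_size: int = 1000):
        self.next_start = 0
        self.range_size = range_size

    def allocate(self):
        start = self.next_start
        self.next_start += self.range_size
        return start, start + self.range_size  # [start, end)


class ApiServer:
    def __init__(self, allocator: RangeAllocator):
        self.allocator = allocator
        self.current, self.end = allocator.allocate()

    def next_counter(self) -> int:
        # Refill from the coordinator only when the block is spent.
        if self.current >= self.end:
            self.current, self.end = self.allocator.allocate()
        value = self.current
        self.current += 1
        return value


alloc = RangeAllocator()
s1, s2 = ApiServer(alloc), ApiServer(alloc)
# s1 issues 0, 1, 2, ...; s2 issues 1000, 1001, ... -- no overlap
```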
Approach 3: Pre-Generated Keys (Key Generation Service)
┌─────────────────┐
│ Key Generation │
│ Service │
└────────┬────────┘
│
▼
┌─────────────────┐
│ Key Database │
│ (unused keys) │
└────────┬────────┘
│
┌────┴────┐
▼ ▼
┌──────┐ ┌──────┐
│API 1 │ │API 2 │
│(keys)│ │(keys)│
└──────┘ └──────┘
How it works:
- Background service pre-generates millions of unique keys
- Stores them in a database table with "used" flag
- API servers fetch batches of keys (e.g., 1000 at a time)
- When a key is used, mark it as used
class KeyService:
    def __init__(self):
        self.local_keys = []

    def get_key(self):
        if not self.local_keys:
            # Fetch a batch from the key database
            self.local_keys = fetch_unused_keys(batch_size=1000)
        return self.local_keys.pop()
Pros:
- No collisions
- No coordination during URL creation (keys pre-assigned)
- Not predictable (keys can be shuffled)
Cons:
- More complex infrastructure
- Keys in failed requests are "wasted"
Recommendation
For interviews, explain Approach 3 (Pre-Generated Keys). It shows you understand:
- How to avoid coordination bottlenecks
- How to handle distributed systems
- Trade-offs between complexity and performance
Step 7: Caching Strategy
With 115K redirects/second, caching is essential.
What to Cache
Key: short_code
Value: long_url
Example:
"abc1234" → "https://example.com/long/path"
Cache Eviction
Use LRU (Least Recently Used) eviction:
- Popular URLs stay in cache
- Rarely accessed URLs get evicted
Cache Size Calculation
Assume 20% of URLs get 80% of traffic (Pareto principle)
Hot URLs: 182.5B × 20% = 36.5B URLs
But we only need to cache recently accessed ones.
If we cache URLs accessed in the last day:
100M URLs × 200 bytes = 20 GB
20 GB easily fits in memory. Add buffer → 64 GB cache.
Cache-Aside Pattern
def redirect(short_code):
    # 1. Check the cache
    long_url = cache.get(short_code)
    if long_url:
        return redirect_to(long_url)

    # 2. Cache miss: check the database
    long_url = db.get(short_code)
    if not long_url:
        return 404

    # 3. Populate the cache for subsequent requests
    cache.set(short_code, long_url, ttl=3600)
    return redirect_to(long_url)
Handling Hot Keys
What if one URL goes viral and gets 1 million requests/second?
Problem: Single cache key becomes bottleneck.
Solutions:
1. Replicate hot keys across cache nodes
   - Identify hot keys (>10K requests/second)
   - Replicate them to multiple cache servers
   - Clients randomly select which replica to query
2. Local caching on API servers
   - In-memory cache (e.g., Guava Cache) on each API server
   - Very short TTL (30 seconds)
   - Reduces load on Redis
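The hot-key replication idea can be sketched as key suffixing: write the same value under several derived keys, and have readers pick one at random so no single key absorbs all the traffic. The replica count and helper names below are assumptions for illustration:

```python
import random

HOT_KEY_REPLICAS = 4  # assumed replica count for viral URLs

def cache_key(short_code: str, is_hot: bool) -> str:
    """Pick the cache key a reader should query.

    Cold keys map 1:1; hot keys are spread across
    short_code#0 .. short_code#3, chosen at random per read.
    """
    if not is_hot:
        return short_code
    return f"{short_code}#{random.randrange(HOT_KEY_REPLICAS)}"

def replica_keys(short_code: str):
    """All keys a writer must set for a hot short code."""
    return [f"{short_code}#{i}" for i in range(HOT_KEY_REPLICAS)]
```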
Step 8: Analytics Deep Dive
If the interviewer asks about analytics, here's how to handle it:
Click Tracking Architecture
┌────────┐ ┌────────────┐ ┌─────────┐ ┌────────────┐
│Redirect│────▶│ Kafka │────▶│ Spark/ │────▶│ Analytics │
│ API │ │ (Events) │ │ Flink │ │ DB │
└────────┘ └────────────┘ └─────────┘ └────────────┘
Event Schema
{
  "short_code": "abc1234",
  "timestamp": "2026-01-05T10:30:00Z",
  "ip_address": "192.168.1.1",
  "user_agent": "Mozilla/5.0...",
  "referrer": "https://twitter.com/...",
  "country": "US",
  "city": "San Francisco"
}
Why Async?
Redirect latency must be < 100ms. We can't wait for analytics writes.
Solution: Fire-and-forget to Kafka, process asynchronously.
def redirect(short_code):
    long_url = get_long_url(short_code)
    # Async: don't block the redirect on the analytics write
    kafka.send_async("clicks", {
        "short_code": short_code,
        "timestamp": now(),
        "ip": request.ip,
        ...
    })
    return redirect_to(long_url)
Step 9: Handle Edge Cases
Strong candidates proactively address edge cases:
1. What if the database goes down?
Answer: Read replicas + cache keeps redirects working. Writes fail gracefully with queuing.
def create_short_url(long_url):
    try:
        return save_to_primary_db(long_url)
    except DatabaseUnavailable:
        # Queue for retry
        queue.send("pending_urls", long_url)
        return "URL creation delayed, check back soon"
2. What about duplicate long URLs?
Options:
- Allow duplicates: Same URL can have multiple short codes
- Deduplicate: Check if URL exists, return existing short code
Trade-off: Deduplication saves storage but requires extra lookup on every create.
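If you choose deduplication, the usual trick is to index a digest of the long URL so the existence check is a single key lookup rather than a scan. A toy sketch (a dict stands in for a table with a unique index on the hash column; the helper name is hypothetical):

```python
import hashlib

url_by_hash = {}   # url_hash -> short_code (stand-in for a DB table)

def create_or_reuse(long_url: str, new_short_code: str) -> str:
    """Return an existing short code for this URL, or register a new one."""
    url_hash = hashlib.sha256(long_url.encode()).hexdigest()
    existing = url_by_hash.get(url_hash)
    if existing:
        return existing            # reuse: saves storage, costs one lookup
    url_by_hash[url_hash] = new_short_code
    return new_short_code

print(create_or_reuse("https://example.com/a", "abc1234"))  # abc1234
print(create_or_reuse("https://example.com/a", "zzz9999"))  # abc1234 (reused)
```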
3. How do you handle expired URLs?
# Background job runs every hour
def cleanup_expired_urls():
    expired = db.query(
        "SELECT short_code FROM urls WHERE expires_at < NOW()"
    )
    for url in expired:
        db.delete(url.short_code)
        cache.delete(url.short_code)
4. What about security/abuse?
- Rate limiting: Max 100 URLs per user per hour
- URL validation: Check for malware, phishing
- Blacklisting: Block known malicious domains
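The rate limit can be sketched as a fixed-window counter. In production this would usually be a Redis INCR with a one-hour TTL; an in-memory dict shows the logic:

```python
import time
from collections import defaultdict

WINDOW_SECONDS = 3600
MAX_PER_WINDOW = 100           # "100 URLs per user per hour"
counters = defaultdict(int)    # (user_id, window) -> count

def allow_create(user_id, now=None):
    """Return True if this user may create another URL this hour."""
    if now is None:
        now = time.time()
    window = int(now // WINDOW_SECONDS)   # which hour bucket we're in
    key = (user_id, window)
    if counters[key] >= MAX_PER_WINDOW:
        return False
    counters[key] += 1
    return True
```

Fixed windows allow brief bursts at window boundaries; a sliding-window or token-bucket variant smooths that out at the cost of more state.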
Common Follow-Up Questions
"How would you scale this to 10x traffic?"
- Add more API servers (stateless, easy to scale)
- Increase cache cluster size
- Add database read replicas
- Consider sharding database by short_code prefix
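Sharding can also key off a hash of the whole short code rather than its literal prefix, which spreads load evenly even when code prefixes are skewed. A sketch (the shard count is an assumed example):

```python
from zlib import crc32

NUM_SHARDS = 16  # assumed shard count for illustration

def shard_for(short_code: str) -> int:
    """Deterministically map a short code to a database shard.

    Every API server computes the same shard for the same code,
    so no routing table is needed.
    """
    return crc32(short_code.encode()) % NUM_SHARDS
```

The trade-off: resharding (changing NUM_SHARDS) moves most keys, which is why production systems often use consistent hashing instead.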
"How would you implement custom aliases?"
def create_with_alias(long_url, alias):
    if not is_valid_alias(alias):
        return error("Invalid alias format")
    if exists_in_db(alias):
        return error("Alias already taken")
    save_to_db(alias, long_url)
    return success(alias)
"301 vs 302 redirect?"
- 301 (Permanent): Browser caches, less server load, worse analytics
- 302 (Temporary): Every click hits server, better analytics, more load
"How do you ensure high availability?"
- Multiple data centers
- Database replication (primary + replicas)
- Cache replication
- Load balancer failover
- Health checks and auto-scaling
Summary: The Winning Answer
In a 45-minute interview, hit these points:
| Time | What to Cover |
|---|---|
| 0-5 min | Clarify requirements (features, scale, latency) |
| 5-10 min | Estimate scale (storage, QPS, short code length) |
| 10-25 min | High-level design (draw components, explain data flow) |
| 25-40 min | Deep dive (short code generation, caching, analytics) |
| 40-45 min | Trade-offs and edge cases |
The key differentiators:
- Ask clarifying questions before designing
- Do the math to justify your choices
- Explain trade-offs for every decision
- Proactively address edge cases
- Communicate clearly throughout
Frequently Asked Questions
How is this different from a key-value store design?
A URL shortener is essentially a specialized key-value store. But the interview focuses on: short code generation algorithms, analytics requirements, and the extremely high read-to-write ratio.
Should I mention specific technologies?
Yes, but justify them. "I'd use Redis for caching because of its sub-millisecond latency and built-in LRU eviction" is better than just "I'd use Redis."
What if I'm asked about a feature I haven't considered?
That's fine. Think out loud: "I hadn't considered that. Let me think through how we'd approach it..." Interviewers want to see your problem-solving process.
How deep should I go on the database schema?
Show you understand indexing and query patterns. You don't need to design a normalized schema with every constraint; focus on the critical fields and indexes.