
Design Twitter: System Design Interview Complete Guide

How to design Twitter's timeline, tweet posting, and follow system. Covers fan-out strategies, real-time delivery, and scaling to billions of users.

13 min read · By SystemExperts


"Design Twitter" is one of the most asked system design interview questions. It appears at Google, Meta, Amazon, and virtually every tech company because it tests fundamental distributed systems concepts:

  • Fan-out problem: How do you deliver a tweet to millions of followers?
  • Read vs. write optimization: Timeline reads vastly outnumber tweet writes
  • Real-time delivery: Users expect tweets to appear instantly
  • Scaling social graphs: Handling follows between billions of users

This guide walks through a complete answer, from requirements clarification to deep dives on the trickiest components.


Step 1: Clarify Requirements

Never jump into design. Start by scoping the problem.

Functional Requirements

Ask: "What features should we support?"

Core features (must have):

  • Post a tweet (280 characters, optionally with media)
  • Follow/unfollow users
  • View home timeline (tweets from people you follow)
  • View user profile/timeline

Extended features (nice to have, ask if in scope):

  • Likes and retweets
  • Direct messages
  • Hashtags and search
  • Trending topics
  • Notifications

For this design, focus on: Posting tweets, following, and home timeline. These cover the core algorithmic challenges.

Non-Functional Requirements

Ask: "What scale are we designing for?"

Typical answers:

  • 500 million total users
  • 200 million daily active users (DAU)
  • 500 million tweets per day
  • Average user follows 200 people
  • Average user has 200 followers
  • Celebrity accounts: some users have 50+ million followers
  • Timeline reads: 10 billion per day

Derived metrics:

  • Read:Write ratio: 20:1 (timelines read 20x more than tweets posted)
  • Tweets per second: 500M / 86,400 ≈ 5,800 tweets/sec
  • Timeline reads per second: 10B / 86,400 ≈ 116,000 reads/sec
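These derived figures are simple back-of-envelope arithmetic from the estimates above, and worth being able to reproduce on the spot:

```python
SECONDS_PER_DAY = 86_400

tweets_per_day = 500_000_000
timeline_reads_per_day = 10_000_000_000

tweets_per_sec = tweets_per_day / SECONDS_PER_DAY              # ~5,800
reads_per_sec = timeline_reads_per_day / SECONDS_PER_DAY       # ~116,000
read_write_ratio = timeline_reads_per_day / tweets_per_day     # 20:1

print(f"{tweets_per_sec:,.0f} tweets/sec, {reads_per_sec:,.0f} reads/sec, "
      f"read:write = {read_write_ratio:.0f}:1")
```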

Latency requirements:

  • Tweet posting: < 500ms
  • Timeline load: < 200ms
  • Follow action: < 200ms

Step 2: High-Level Design

Let's sketch the major components:

┌─────────────────────────────────────────────────────────────────┐
│                         Clients                                  │
│              (Mobile Apps, Web, Third-party)                     │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                        API Gateway                               │
│           (Authentication, Rate Limiting, Routing)               │
└─────────────────────────────────────────────────────────────────┘
                              │
            ┌─────────────────┼─────────────────┐
            ▼                 ▼                 ▼
     ┌────────────┐    ┌────────────┐    ┌────────────┐
     │   Tweet    │    │  Timeline  │    │   User     │
     │  Service   │    │  Service   │    │  Service   │
     └────────────┘    └────────────┘    └────────────┘
            │                 │                 │
            ▼                 ▼                 ▼
     ┌────────────┐    ┌────────────┐    ┌────────────┐
     │ Tweet Store│    │  Timeline  │    │ User/Graph │
     │(Cassandra) │    │Cache(Redis)│    │  Store     │
     └────────────┘    └────────────┘    └────────────┘
            │
            ▼
     ┌────────────┐
     │  Fan-out   │
     │  Service   │
     └────────────┘

Core Components

1. Tweet Service

  • Handles tweet creation, storage, and retrieval
  • Stores tweets in a distributed database
  • Triggers fan-out to followers' timelines

2. Timeline Service

  • Serves home timeline requests
  • Reads from precomputed timeline cache
  • Handles timeline generation for pull-based timelines

3. User Service

  • Manages user profiles
  • Handles follow/unfollow operations
  • Maintains the social graph

4. Fan-out Service

  • Distributes tweets to followers' timelines
  • Key component that determines system architecture

Step 3: The Fan-Out Problem (Core Challenge)

This is the heart of the Twitter design. When User A posts a tweet, how does it appear in all their followers' timelines?

Three Approaches

1. Fan-out on Write (Push Model)

When a user posts a tweet:

  1. Write tweet to Tweet Store
  2. Look up all followers
  3. Write tweet ID to each follower's timeline cache
User A posts tweet
       │
       ▼
┌─────────────────┐
│ Tweet Service   │
│ (stores tweet)  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐     ┌─────────────────────────────────┐
│ Fan-out Service │────▶│ Write to 1000 followers'        │
│                 │     │ timeline caches                  │
└─────────────────┘     └─────────────────────────────────┘

Pros:

  • Timeline reads are instant (just read from cache)
  • Scales well for reads

Cons:

  • Celebrity problem: 50M followers = 50M writes per tweet
  • Wasted work for inactive followers
  • Slow tweet posting for celebrities

2. Fan-out on Read (Pull Model)

When a user loads their timeline:

  1. Look up all users they follow
  2. Fetch recent tweets from each
  3. Merge and sort
User B loads timeline
       │
       ▼
┌─────────────────┐
│Timeline Service │
│                 │
└────────┬────────┘
         │
         ▼
┌─────────────────────────────────────────────────────┐
│ 1. Get 200 users that B follows                      │
│ 2. Fetch recent tweets from each (200 queries)       │
│ 3. Merge, sort by time                               │
│ 4. Return top 100                                    │
└─────────────────────────────────────────────────────┘

Pros:

  • No write amplification
  • Tweet posting is fast for everyone

Cons:

  • Timeline reads are slow (hundreds of queries)
  • Doesn't scale for active users

3. Hybrid Approach (What Twitter Actually Does)

Combine both approaches based on follower count:

IF poster has < 10,000 followers:
    Fan-out on write (push to all followers' caches)
ELSE:
    Don't fan-out (celebrity tweets pulled at read time)

ON timeline read:
    1. Read precomputed timeline from cache
    2. Fetch recent tweets from followed celebrities
    3. Merge and return

Why this works:

  • 99% of users have < 10,000 followers → fast push
  • 1% celebrities → avoid 50M writes per tweet
  • Timeline read adds a few extra queries for celebrity tweets (acceptable)
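The branch above can be sketched in a few lines. This is an in-memory illustration, not Twitter's actual code: `route_tweet` and the dicts standing in for the social graph, the timeline cache, and the celebrity tweet list are hypothetical names.

```python
CELEBRITY_THRESHOLD = 10_000  # follower count above which we skip fan-out

def route_tweet(tweet_id, author_id, followers, timelines, celebrity_posts):
    """Hybrid fan-out: push for normal users, defer to read time for celebrities.

    `followers` maps author_id -> list of follower IDs (stand-in for the
    social graph store); `timelines` and `celebrity_posts` stand in for the
    Redis timeline cache and the celebrity tweet store.
    """
    fans = followers.get(author_id, [])
    if len(fans) < CELEBRITY_THRESHOLD:
        # Push model: write the tweet ID into every follower's timeline cache.
        for follower_id in fans:
            timelines.setdefault(follower_id, []).append(tweet_id)
    else:
        # Pull model: store once; readers fetch and merge it at timeline time.
        celebrity_posts.setdefault(author_id, []).append(tweet_id)
```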

Hybrid Architecture

┌─────────────────────────────────────────────────────────────────┐
│                      Tweet Posted                                │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
                    ┌─────────────────┐
                    │ Follower count  │
                    │    < 10,000?    │
                    └────────┬────────┘
                             │
              ┌──────────────┴──────────────┐
              │ YES                         │ NO
              ▼                             ▼
    ┌─────────────────┐           ┌─────────────────┐
    │ Fan-out to all  │           │ Store in Tweet  │
    │ followers'      │           │ Store only      │
    │ timeline caches │           │ (marked as      │
    └─────────────────┘           │  celebrity)     │
                                  └─────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│                    Timeline Request                              │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│ 1. Read precomputed timeline from cache (pushed tweets)          │
│ 2. Get list of followed celebrities                              │
│ 3. Fetch recent tweets from each celebrity (pull)                │
│ 4. Merge pushed + pulled tweets by timestamp                     │
│ 5. Return top N tweets                                           │
└─────────────────────────────────────────────────────────────────┘

Step 4: Data Models

Tweet Schema

-- Tweet Store (Cassandra or DynamoDB)
CREATE TABLE tweets (
    tweet_id BIGINT PRIMARY KEY,      -- Snowflake ID (time-sortable)
    user_id BIGINT,
    content TEXT,
    media_urls LIST<TEXT>,
    created_at TIMESTAMP,
    like_count INT,
    retweet_count INT,
    reply_count INT
);

-- Secondary index for user timeline
CREATE TABLE user_tweets (
    user_id BIGINT,
    tweet_id BIGINT,
    created_at TIMESTAMP,
    PRIMARY KEY (user_id, tweet_id)
) WITH CLUSTERING ORDER BY (tweet_id DESC);

Why Cassandra?

  • High write throughput (500M tweets/day)
  • Horizontal scaling
  • Time-series friendly (tweets sorted by time)

Timeline Cache

-- Redis structure for home timeline
Key: timeline:{user_id}
Value: Sorted Set of tweet_ids, scored by timestamp

Example:
timeline:12345 = {
    tweet_98765: 1767614400000,  // 2026-01-05 12:00:00 UTC
    tweet_98764: 1767614399000,  // 2026-01-05 11:59:59 UTC
    ...
}
}

Why Redis?

  • Sub-millisecond reads
  • Sorted sets perfect for timeline ordering
  • Can trim to last N tweets automatically
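The three Redis commands the timeline cache relies on are ZADD (insert), ZREVRANGE (read newest-first), and ZREMRANGEBYRANK (trim). A minimal in-memory stand-in makes their behavior concrete; a real deployment would issue these against a Redis client instead.

```python
MAX_TIMELINE = 800

def zadd(timeline, tweet_id, ts):
    # ZADD timeline:{user} ts tweet_id
    timeline[tweet_id] = ts

def zrevrange(timeline, limit):
    # ZREVRANGE timeline:{user} 0 limit-1 -- newest first
    return sorted(timeline, key=timeline.get, reverse=True)[:limit]

def trim(timeline):
    # ZREMRANGEBYRANK timeline:{user} 0 -(MAX_TIMELINE+1)
    # Deletes everything except the MAX_TIMELINE highest-scored entries.
    for tid in sorted(timeline, key=timeline.get)[:-MAX_TIMELINE]:
        del timeline[tid]
```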

Social Graph

-- Follow relationships
CREATE TABLE follows (
    follower_id BIGINT,
    followee_id BIGINT,
    created_at TIMESTAMP,
    PRIMARY KEY (follower_id, followee_id)
);

-- Reverse index for "who follows me"
CREATE TABLE followers (
    followee_id BIGINT,
    follower_id BIGINT,
    created_at TIMESTAMP,
    PRIMARY KEY (followee_id, follower_id)
);

Why two tables?

  • follows: "Who do I follow?" (used for timeline generation)
  • followers: "Who follows me?" (used for fan-out)
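A follow operation writes both edges. A sketch, assuming a Cassandra-style session object with an `execute(query, params)` method; the helper name `follow` is illustrative:

```python
def follow(session, follower_id, followee_id, now):
    """Record a follow edge in both directions (sketch)."""
    # Forward edge: used when building this user's timeline.
    session.execute(
        "INSERT INTO follows (follower_id, followee_id, created_at)"
        " VALUES (%s, %s, %s)",
        (follower_id, followee_id, now))
    # Reverse edge: used by the fan-out service to find whom to push to.
    session.execute(
        "INSERT INTO followers (followee_id, follower_id, created_at)"
        " VALUES (%s, %s, %s)",
        (followee_id, follower_id, now))
```

Since the two inserts aren't atomic across tables, a real system would pair this with a repair job (or use a logged batch) to catch half-written edges.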

Step 5: API Design

Post Tweet

POST /api/v1/tweets
Authorization: Bearer {token}

Request:
{
    "content": "Hello, world!",
    "media_ids": ["abc123"],  // optional, pre-uploaded
    "reply_to": "tweet_98765"  // optional
}

Response (201 Created):
{
    "tweet_id": "tweet_98766",
    "content": "Hello, world!",
    "created_at": "2026-01-05T12:00:00Z",
    "user": {
        "id": "12345",
        "username": "johndoe"
    }
}

Get Home Timeline

GET /api/v1/timeline/home?cursor={cursor}&limit=20
Authorization: Bearer {token}

Response:
{
    "tweets": [
        {
            "tweet_id": "tweet_98766",
            "content": "Hello, world!",
            "created_at": "2026-01-05T12:00:00Z",
            "user": {...},
            "like_count": 42,
            "retweet_count": 5
        },
        ...
    ],
    "next_cursor": "tweet_98746"
}

Follow User

POST /api/v1/users/{user_id}/follow
Authorization: Bearer {token}

Response (200 OK):
{
    "following": true
}

Step 6: Tweet Posting Flow (Deep Dive)

Let's trace what happens when a user posts a tweet:

Client posts tweet
        │
        ▼
┌───────────────────┐
│    API Gateway    │
│ (auth, rate limit)│
└─────────┬─────────┘
          │
          ▼
┌───────────────────┐
│   Tweet Service   │
│                   │
│1. Validate content│
│ 2. Generate ID    │
│ 3. Store tweet    │
│ 4. Send to Kafka  │
└─────────┬─────────┘
          │
          ▼
┌───────────────────┐
│      Kafka        │
│  (tweet_posted    │
│   topic)          │
└─────────┬─────────┘
          │
          ▼
┌───────────────────┐
│  Fan-out Worker   │
│                   │
│ 1. Get follower   │
│    list           │
│ 2. If < 10K,      │
│    push to caches │
│ 3. Else, mark     │
│    as celebrity   │
└───────────────────┘

ID Generation: Snowflake IDs

Twitter invented Snowflake IDs for unique, time-sortable IDs:

┌─────────────────────────────────────────────────────────────┐
│  Snowflake ID (64 bits)                                      │
├──────────────┬────────────┬────────────────┬────────────────┤
│   1 bit      │  41 bits   │   10 bits      │   12 bits      │
│   (unused)   │(timestamp) │ (machine ID)   │ (sequence)     │
└──────────────┴────────────┴────────────────┴────────────────┘

Why Snowflake?

  • Unique without coordination (machine ID + sequence)
  • Time-sortable (can order by ID instead of timestamp)
  • 64-bit fits in a long integer
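The bit layout above translates directly into shifts and masks. A minimal single-machine sketch (the class and method names are illustrative; 1288834974657 is the commonly cited Twitter epoch of Nov 4, 2010):

```python
import threading
import time

EPOCH_MS = 1288834974657  # Twitter's custom epoch (Nov 4, 2010)

class Snowflake:
    """Sketch of a Snowflake generator: 41-bit ms timestamp,
    10-bit machine ID, 12-bit per-millisecond sequence."""

    def __init__(self, machine_id):
        assert 0 <= machine_id < 1024          # must fit in 10 bits
        self.machine_id = machine_id
        self.last_ms = -1
        self.seq = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = int(time.time() * 1000) - EPOCH_MS
            if now == self.last_ms:
                self.seq = (self.seq + 1) & 0xFFF  # 12-bit sequence
                if self.seq == 0:                  # exhausted: spin to next ms
                    while now <= self.last_ms:
                        now = int(time.time() * 1000) - EPOCH_MS
            else:
                self.seq = 0
            self.last_ms = now
            return (now << 22) | (self.machine_id << 12) | self.seq
```

Because the timestamp occupies the high bits, sorting IDs numerically sorts tweets by creation time, which is exactly why the timeline tables can cluster on `tweet_id DESC`.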

Fan-out Worker Detail

import time
from itertools import batched  # Python 3.12+

CELEBRITY_THRESHOLD = 10_000

def fan_out_tweet(tweet_id, user_id):
    # Get follower list (assumes get_followers, mark_celebrity_tweet,
    # and a global `redis` client are provided elsewhere)
    followers = get_followers(user_id)

    if len(followers) > CELEBRITY_THRESHOLD:
        # Mark as celebrity tweet, don't fan out
        mark_celebrity_tweet(tweet_id, user_id)
        return

    timestamp = int(time.time() * 1000)  # sorted-set score: ms since epoch

    # Fan out to all followers
    for batch in batched(followers, 1000):
        # Pipelined writes to Redis: one round trip per batch
        pipeline = redis.pipeline()
        for follower_id in batch:
            pipeline.zadd(f"timeline:{follower_id}", {tweet_id: timestamp})
            # Trim timeline to the most recent 800 tweets
            pipeline.zremrangebyrank(f"timeline:{follower_id}", 0, -801)
        pipeline.execute()

Step 7: Timeline Read Flow (Deep Dive)

Client requests timeline
        │
        ▼
┌───────────────────┐
│    API Gateway    │
└─────────┬─────────┘
          │
          ▼
┌───────────────────┐
│ Timeline Service  │
│                   │
│ 1. Read from cache│
│ 2. Get celebrity  │
│    tweets         │
│ 3. Merge & sort   │
│ 4. Hydrate tweets │
│ 5. Return         │
└─────────┬─────────┘
          │
          ▼
┌───────────────────────────────────────────────────────────────┐
│                        Response                                │
└───────────────────────────────────────────────────────────────┘

Implementation

def get_home_timeline(user_id, cursor=None, limit=20):
    # Step 1: Get precomputed timeline from cache. The "(" prefix makes
    # the bound exclusive, so the cursor tweet isn't returned twice.
    cached_tweet_ids = redis.zrevrangebyscore(
        f"timeline:{user_id}",
        max=f"({cursor}" if cursor else "+inf",
        min="-inf",
        start=0,
        num=limit
    )

    # Step 2: Get celebrity tweets (the pull side of the hybrid)
    celebrity_tweets = []
    for celebrity_id in get_followed_celebrities(user_id):
        celebrity_tweets.extend(get_recent_tweets(celebrity_id, limit=5))

    # Step 3: Merge and sort. Snowflake IDs encode creation time,
    # so get_timestamp can derive it from the ID alone.
    all_tweet_ids = cached_tweet_ids + [t.id for t in celebrity_tweets]
    all_tweet_ids.sort(key=get_timestamp, reverse=True)
    all_tweet_ids = all_tweet_ids[:limit]

    # Step 4: Hydrate (fetch full tweet data in one batched read)
    tweets = batch_get_tweets(all_tweet_ids)

    # Step 5: Return a timestamp cursor for pagination (scores are timestamps)
    next_cursor = get_timestamp(tweets[-1].id) if len(tweets) == limit else None
    return {"tweets": tweets, "next_cursor": next_cursor}

Caching Strategy

┌─────────────────────────────────────────────────────────────────┐
│                     Cache Layers                                 │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  Layer 1: CDN (static assets, not timelines)                    │
│                                                                  │
│  Layer 2: Timeline Cache (Redis)                                 │
│    - Precomputed timeline (pushed tweet IDs)                    │
│    - TTL: Forever (updated on tweet post)                       │
│    - Eviction: Keep last 800 tweets per user                    │
│                                                                  │
│  Layer 3: Tweet Cache (Redis)                                    │
│    - Full tweet objects                                          │
│    - TTL: 24 hours                                               │
│    - Eviction: LRU                                               │
│                                                                  │
│  Layer 4: User Cache (Redis)                                     │
│    - User profile data                                           │
│    - TTL: 1 hour                                                 │
│    - Eviction: LRU                                               │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Step 8: Scaling Considerations

Database Sharding

Tweet Store:

  • Shard by tweet_id (even distribution)
  • Alternatively, shard by user_id (keeps user's tweets together)

User Timeline Table:

  • Shard by user_id (efficient for "get my tweets")

Follows Table:

  • Shard by follower_id (efficient for "who do I follow")

Timeline Cache Scaling

Problem: Hot users (celebrities) have many followers reading their tweets.

Solutions:

  1. Cache replication: Multiple Redis replicas, route reads randomly
  2. Local caching: API servers cache hot tweets in memory (30-second TTL)

Fan-out Worker Scaling

Problem: A celebrity with 50M followers posts.

The hybrid approach sidesteps the worst case, but accounts just under the 10K threshold still trigger thousands of cache writes per tweet.

Solutions:

  1. Async processing: Don't block tweet post on fan-out completion
  2. Batched writes: Write to Redis in batches of 1000
  3. Rate limiting on writes: Spread fan-out over time
  4. Priority queues: Process high-engagement users first

Step 9: Additional Features (If Asked)

Search

┌─────────────────────────────────────────────────────────────────┐
│                    Search Architecture                           │
└─────────────────────────────────────────────────────────────────┘

        Tweet posted
             │
             ▼
┌────────────────────┐
│  Elasticsearch     │◀── Index tweet: content, hashtags, user
│  Cluster           │
└────────────────────┘
             │
             ▼
┌────────────────────┐
│  Search Service    │◀── Query: "system design"
│                    │     → Returns ranked tweet IDs
└────────────────────┘

Trending Topics

from collections import Counter

def compute_trending(windowed_tags, baseline, top_n=10):
    """Sketch of the trending computation. In production this runs as a
    streaming job (Kafka + Flink); here `windowed_tags` holds hashtag
    counts for the current 5-minute window and `baseline` the normal
    per-window volume for each tag."""
    # Rank by velocity (growth vs. baseline), not raw count, so that a
    # perennially popular tag doesn't crowd out a genuinely spiking one.
    velocity = {
        tag: count / max(baseline.get(tag, 1), 1)
        for tag, count in windowed_tags.items()
    }
    # Bot/spam filtering and geographic segmentation would slot in here.
    return [tag for tag, _ in Counter(velocity).most_common(top_n)]

Notifications

┌─────────────────────────────────────────────────────────────────┐
│                 Notification Types                               │
├─────────────────────────────────────────────────────────────────┤
│ 1. Someone followed you                                          │
│ 2. Someone liked your tweet                                      │
│ 3. Someone replied to your tweet                                 │
│ 4. Someone mentioned you                                         │
└─────────────────────────────────────────────────────────────────┘

        Event occurs
             │
             ▼
┌────────────────────┐
│   Notification     │
│   Service          │
│                    │
│ 1. Check user prefs│
│ 2. Rate limit      │
│ 3. Write to inbox  │
│ 4. Push via FCM    │
└────────────────────┘

Step 10: Common Follow-Up Questions

"What happens when a user with 50M followers posts?"

With hybrid approach:

  1. Tweet stored in Tweet Store
  2. Marked as celebrity tweet (no fan-out)
  3. On timeline read, celebrity tweets are pulled and merged
  4. Celebrity tweet cache can be replicated for read scaling

"How do you handle the case where a user unfollows someone?"

Two options:

  1. Lazy removal: Tweet stays in cache, filter on read (simpler)
  2. Active removal: Send unfollow event to remove tweets from cache (more accurate)

Recommendation: Lazy removal. The cached tweet is eventually pushed out by newer tweets, and the cost of active removal doesn't justify the marginal gain in accuracy.
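Lazy removal amounts to a filter at read time. A sketch, where `tweet_authors` and `following` are stand-ins for a tweet-metadata lookup and the reader's current follow set:

```python
def filter_unfollowed(cached_tweet_ids, tweet_authors, following):
    """Drop cached tweets whose author the reader no longer follows.

    `tweet_authors` maps tweet_id -> author_id; `following` is the
    reader's current follow set. Order is preserved."""
    return [tid for tid in cached_tweet_ids
            if tweet_authors[tid] in following]
```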

"How do you ensure timeline consistency?"

Challenge: User posts tweet, immediately checks their own timeline, doesn't see it.

Solutions:

  1. Read-your-writes consistency: After posting, read from leader/cache-write, not replica
  2. Include recent self-tweets: Always merge last N self-tweets with timeline
  3. Client-side optimistic update: Show tweet immediately, sync in background
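The second option is a one-liner if IDs are time-sortable. A sketch (the function name is illustrative), assuming Snowflake IDs so that sorting by ID sorts by time:

```python
def merge_self_tweets(timeline_ids, own_recent_ids, limit):
    """Merge the reader's own most recent tweet IDs into their cached
    timeline so a just-posted tweet appears immediately, even if the
    fan-out hasn't reached their own cache yet."""
    merged = set(timeline_ids) | set(own_recent_ids)
    return sorted(merged, reverse=True)[:limit]
```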

"How do you handle tweet deletion?"

  1. Mark tweet as deleted in database (soft delete)
  2. Remove from timeline caches of author's followers
  3. For celebrity tweets, removal happens on next cache refresh
  4. Keep deleted tweets for compliance/audit, just don't display

"What about ranked timelines vs. chronological?"

Ranked timeline architecture:

  1. Fetch candidate tweets (same as chronological)
  2. Score each tweet with ML model (engagement prediction)
  3. Factors: author affinity, content relevance, recency, engagement
  4. Return top N by score

Trade-off: Latency vs. ranking quality. Can pre-compute scores or do real-time.


Summary: The Complete Answer

In a 45-minute interview, hit these points:

Time       | What to Cover
-----------|------------------------------------------------
0-5 min    | Requirements: features, scale, latency
5-10 min   | High-level design: services, data stores
10-25 min  | Fan-out problem: push vs. pull vs. hybrid
25-35 min  | Deep dives: posting flow, timeline read, IDs
35-45 min  | Scaling, follow-ups, trade-offs

Key differentiators:

  • Understand the fan-out problem deeply
  • Know the hybrid approach and why it works
  • Can explain Snowflake IDs
  • Discuss trade-offs at each decision
  • Handle celebrity edge cases
