Design Walkthrough
Problem Statement
The Question: Design a cloud file storage and synchronization system like Dropbox or Google Drive.
The system must support:
- File upload/download: Store files in the cloud, access from anywhere
- Multi-device sync: Changes on one device appear on all devices
- Offline support: Work without internet, sync when back online
- Sharing: Share files/folders with other users
- Version history: Recover previous versions of files
What to say first
Before I design, let me clarify the requirements. The sync functionality is the hard part here - not just storing files, but keeping them synchronized across devices efficiently. I want to understand the scale, consistency requirements, and how we handle conflicts.
Hidden requirements interviewers are testing:
- Do you understand why chunking is essential (not just uploading whole files)?
- Can you design efficient delta sync (only transfer what changed)?
- How do you handle conflicts when the same file is edited on multiple devices?
- Do you separate metadata from content (crucial for scalability)?
Clarifying Questions
Ask these questions to shape your architecture. Each answer has significant design implications.
Question 1: File Size Distribution
What is the typical file size? Are we handling mostly documents (KB-MB) or also large files like videos (GB)?
Why this matters: Determines chunking strategy and upload approach.
Typical answer: Mix of small docs and large media, up to 50GB files.
Architecture impact: Need chunked uploads, resumable transfers, streaming.
Question 2: Conflict Resolution
When the same file is edited on two devices while offline, how should we resolve the conflict?
Why this matters: Defines consistency model and user experience.
Typical answer: Create conflicted copies (like Dropbox) rather than silent overwrite.
Architecture impact: Need version vectors, conflict detection, branching.
Question 3: Sync Latency
How quickly should changes appear on other devices? Real-time or eventual?
Why this matters: Real-time needs push notifications/websockets, eventual can use polling.
Typical answer: Near real-time (within seconds) when online.
Architecture impact: Need notification service, long-polling or websockets.
Question 4: Deduplication Scope
Should we deduplicate across all users (global) or just within a single user account?
Why this matters: Global dedup saves more storage but has privacy implications.
Typical answer: Global dedup for storage efficiency (Dropbox does this).
Architecture impact: Content-addressable storage, hash-based chunk identification.
Stating assumptions
I will assume: 500M users, 50B files total (about 100 files per user), a mix of small docs and large media up to 50GB, near real-time sync when online, global deduplication, and conflicted copies for conflict resolution.
The Hard Part
Say this out loud
The hard part here is efficient synchronization. Not just storing files, but detecting what changed, transferring only the changes, and handling conflicts when the same file is edited on multiple devices.
Why sync is genuinely hard:
1. Delta Sync: A user edits 1 byte in a 1GB file. Uploading the entire file wastes bandwidth. We need to detect and transfer only the changed portion.
2. Conflict Detection: User A edits a file on a laptop, and the same file is edited on a phone (by User A or B), both while offline. When they come online, which version wins?
3. Ordering: File renamed on device A, content edited on device B. Order matters - we need causal consistency.
4. Offline Duration: A user is offline for a week making changes. We must sync hundreds of changes efficiently when they come back online.
Common mistake
Candidates often design a simple upload/download system and forget sync entirely. The interviewer wants to see how you handle the sync protocol, not just blob storage.
The chunking insight:
Chunking solves multiple problems at once:
Without chunking:
- 1GB file, 1 byte change = 1GB upload
- Upload fails at 99% = start over
- Same file uploaded by 1M users = 1PB storage
With chunking (4MB chunks):
- 1GB file, 1 byte change = 4MB upload (just the changed chunk)
- Upload fails at 99% = resume from last chunk
- Same file uploaded by 1M users = only unique chunks stored (dedup)

Why Chunking Matters
Scale & Access Patterns
Let me estimate the scale. This drives architecture decisions.
| Dimension | Value | Impact |
|---|---|---|
| Total Users | 500 million | Massive metadata scale |
| Files per User | ~100 average | 50 billion total files |
What to say
At this scale, we cannot store file metadata in the same system as file content. Metadata is small but needs strong consistency. Content is large and can be eventually consistent. This separation is key.
Storage:
- 500M users x 2GB average = 1 Exabyte raw storage
- With deduplication: ~30% reduction = ~700 PB

Access Pattern Analysis:
- Write pattern: Bursty (user saves a file, a batch of chunks is uploaded)
- Read pattern: Sequential (download file = download chunks in order)
- Metadata access: Random (check if a file changed, get its chunk list)
- Hot data: Recently modified files (80% of accesses hit the most recently modified 20% of files)
- Geographic: Users clustered by region, files accessed from the same region
High-Level Architecture
The architecture separates three concerns: client sync, metadata management, and content storage.
What to say
I will separate metadata from content storage. Metadata needs strong consistency and goes in a database. Content is immutable chunks stored in object storage. The sync service coordinates between them.
Dropbox/Drive Architecture
Component Responsibilities:
1. Desktop/Mobile Client
- Watches local file system for changes
- Splits files into chunks, computes hashes
- Maintains local database of file state
- Handles offline queue of pending syncs

2. Sync Service
- Orchestrates the sync protocol
- Determines what needs to upload/download
- Handles conflict detection and resolution
- Manages sync cursors (where each client is)

3. Metadata Service
- Stores file/folder hierarchy
- Tracks which chunks compose each file
- Manages versions and history
- Handles sharing permissions

4. Chunk Service
- Uploads chunks to object storage
- Handles deduplication (check if chunk exists)
- Generates pre-signed URLs for direct upload/download
- Manages chunk lifecycle (garbage collection)

5. Notification Service
- Pushes sync events to connected clients
- Long-polling or WebSocket connections
- Ensures clients know when to sync
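To make the Notification Service concrete, here is a minimal long-polling loop a client might run. The /longpoll endpoint, its query parameters, and the JSON response shape are assumptions for illustration, not a real API.

import json
import time
import urllib.request

def wait_for_changes(base_url: str, cursor: int, timeout_s: int = 60) -> bool:
    # Block until the server reports changes newer than `cursor`, or the poll times out.
    url = f"{base_url}/longpoll?cursor={cursor}&timeout={timeout_s}"
    with urllib.request.urlopen(url, timeout=timeout_s + 5) as resp:
        body = json.loads(resp.read())
    return bool(body.get("changes"))

def notification_loop(base_url: str, cursor: int, run_sync) -> None:
    # run_sync() performs a sync pass against the sync service and returns the new cursor.
    while True:
        try:
            if wait_for_changes(base_url, cursor):
                cursor = run_sync()
        except OSError:
            time.sleep(5)  # offline or server unreachable: back off, then keep polling

A WebSocket connection would replace the repeated polls with a single push channel, but the client-side reaction ("something changed, run a sync pass") stays the same.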
Real-world reference
Dropbox uses this exact separation. They call it the block server (chunks) and metadata server. Google Drive similarly separates Drive metadata from Google Cloud Storage for content.
Data Model & Storage
The data model has three main entities: Users, Files (metadata), and Chunks (content).
What to say
The key insight is content-addressable storage. Chunks are identified by their hash, not by file ID. This enables global deduplication - if two users upload identical chunks, we store it once.
-- Users table
CREATE TABLE users (
    user_id UUID PRIMARY KEY
    -- additional columns omitted
);
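The users table above is only a fragment. As a sketch of the other two entities in the data model (the field names are illustrative, not the actual schema), the file and chunk records might look like this:

from dataclasses import dataclass, field
from typing import List

@dataclass
class ChunkRef:
    sha256: str        # content hash doubles as the chunk's identity
    size_bytes: int
    offset: int        # where this chunk sits inside the file

@dataclass
class FileRecord:
    file_id: str
    owner_id: str
    path: str          # logical path in the user's namespace
    version: int       # bumped on every committed change
    chunks: List[ChunkRef] = field(default_factory=list)  # ordered chunk list

@dataclass
class ChunkRecord:
    sha256: str        # primary key of the global chunk index
    size_bytes: int
    ref_count: int     # number of file versions referencing this chunk
    storage_key: str   # object-store key derived from the hash

Because ChunkRecord is keyed by hash rather than by file, two users uploading identical content converge on the same records, which is exactly what enables global deduplication.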
Content-Addressable Storage:

Chunks are stored by their hash, not file ID:
Object Storage (S3) Structure:
/chunks/

Chunking Algorithm:
Two approaches for splitting files:
| Approach | How It Works | Pros | Cons |
|---|---|---|---|
| Fixed-size chunks | Split every 4MB | Simple, predictable | Poor dedup if content shifts |
| Content-defined chunks | Split at content boundaries (Rabin fingerprint) | Better dedup, survives insertions | More complex, variable sizes |
def chunk_file(file_path: str) -> List[Chunk]:
    # Content-defined chunking using a rolling hash.
    # Chunk boundaries are based on content, not position.
    ...

Why content-defined chunking?
If you insert data at the beginning of a file with fixed chunks, ALL chunks shift. With content-defined chunking, only the affected area changes. Dropbox uses a variant of Rabin fingerprinting for this reason.
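The chunk_file stub above only states the intent. Here is a self-contained illustration of content-defined chunking with a simple polynomial rolling hash; the window size, boundary mask, and chunk-size bounds are illustrative assumptions, not the parameters Dropbox actually uses.

import hashlib
from typing import Iterator, Tuple

WINDOW = 48                      # rolling-hash window (bytes)
MASK = (1 << 22) - 1             # boundary when hash & MASK == MASK -> ~4MB average chunks
MIN_CHUNK = 1 * 1024 * 1024
MAX_CHUNK = 8 * 1024 * 1024
PRIME = 31
MOD = 1 << 61
POW = pow(PRIME, WINDOW - 1, MOD)   # coefficient of the byte leaving the window

def chunk_bytes(data: bytes) -> Iterator[Tuple[str, bytes]]:
    # Yield (sha256_hex, chunk) pairs; boundaries depend on content, not position.
    start, h = 0, 0
    window = bytearray()
    for i, byte in enumerate(data):
        if len(window) == WINDOW:              # slide the window: drop the oldest byte
            h = (h - window.pop(0) * POW) % MOD
        window.append(byte)
        h = (h * PRIME + byte) % MOD
        size = i - start + 1
        if ((h & MASK) == MASK and size >= MIN_CHUNK) or size >= MAX_CHUNK:
            chunk = data[start:i + 1]
            yield hashlib.sha256(chunk).hexdigest(), chunk
            start, h, window = i + 1, 0, bytearray()
    if start < len(data):                      # final partial chunk
        chunk = data[start:]
        yield hashlib.sha256(chunk).hexdigest(), chunk

If a byte is inserted near the start of the file, the boundaries after the insertion tend to re-align at the same content positions, so only the nearby chunks change hashes - which is the property the explanation above relies on.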
Sync Protocol Deep Dive
The sync protocol is the heart of the system. It must handle:
- Initial sync (new device)
- Incremental sync (changes only)
- Conflict resolution
- Offline changes
What to say
The sync protocol uses a cursor-based approach. Each client maintains a cursor representing its last known state. On sync, it asks for all changes since its cursor.
Sync Protocol Flow
Cursor-Based Sync:
Each client maintains a cursor (like a bookmark) of its position in the change stream.
class SyncClient:
    def __init__(self, user_id: str):
        self.user_id = user_id
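Expanding the SyncClient stub into a runnable sketch: the client stores a cursor into the server's per-user change log, pulls batches of changes since that cursor, applies them, and only then advances the cursor. The SyncServer class here is an in-memory stand-in for the real metadata/sync services.

from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class Change:
    file_id: str
    operation: str   # "upsert" or "delete"
    version: int

class SyncServer:
    # In-memory stand-in: an append-only change log per user.
    def __init__(self):
        self.log: List[Change] = []

    def list_changes(self, cursor: int, limit: int = 100) -> Tuple[List[Change], int]:
        batch = self.log[cursor:cursor + limit]
        return batch, cursor + len(batch)

class SyncClient:
    def __init__(self, user_id: str, server: SyncServer):
        self.user_id = user_id
        self.server = server
        self.cursor = 0                          # last position we have fully applied
        self.local_state: Dict[str, int] = {}    # file_id -> latest known version

    def sync(self) -> None:
        # Pull batches of changes since our cursor until we are caught up.
        while True:
            changes, new_cursor = self.server.list_changes(self.cursor)
            if not changes:
                break
            for change in changes:
                if change.operation == "delete":
                    self.local_state.pop(change.file_id, None)
                else:
                    self.local_state[change.file_id] = change.version
            self.cursor = new_cursor             # advance only after the batch is applied

Advancing the cursor only after a batch is durably applied is what makes the protocol safe to interrupt: after a crash, the client simply replays from its last persisted cursor.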
Delta Sync (Efficient Updates):

When a file changes, only sync the changed chunks:
def sync_file_delta(local_file: str, remote_chunks: List[str]) -> SyncPlan:
    # Compare the local file to the remote version.
    # Determine the minimal set of operations.
    ...
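Filling in the idea behind sync_file_delta: re-chunk the local file, compare its hashes against the chunk list recorded for the remote version, and upload only the hashes the server has never seen. The SyncPlan shape and helper name are illustrative.

from dataclasses import dataclass
from typing import List

@dataclass
class SyncPlan:
    chunks_to_upload: List[str]   # hashes the server does not have yet
    new_chunk_list: List[str]     # ordered hashes describing the new file version

def plan_file_delta(local_order: List[str], remote_chunks: List[str]) -> SyncPlan:
    # local_order: chunk hashes of the re-chunked local file, in file order
    # remote_chunks: chunk hashes of the version the server currently has
    remote = set(remote_chunks)
    seen = set()
    missing = []
    for digest in local_order:
        if digest not in remote and digest not in seen:
            missing.append(digest)
            seen.add(digest)
    return SyncPlan(chunks_to_upload=missing, new_chunk_list=local_order)

For the 1-byte edit in a 1GB file from earlier, local_order and remote_chunks differ in a single hash, so chunks_to_upload contains one entry and only that ~4MB chunk crosses the network; committing new_chunk_list to the metadata service completes the sync.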
Conflict Resolution

System Invariant
Never silently lose user data. If there is a conflict, preserve both versions. User can decide which to keep.
When conflicts happen:
Conflicts occur when the same file is modified on multiple devices without syncing in between.
Conflict Scenario
Conflict Detection:
Use vector clocks or version vectors to detect conflicts:
class VersionVector:
    # Track version per device
    def __init__(self):
        ...
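Expanding the stub: a version vector keeps one counter per device, and two versions conflict exactly when neither vector dominates the other (the edits were concurrent). A minimal sketch:

from typing import Dict, Optional

class VersionVector:
    def __init__(self, versions: Optional[Dict[str, int]] = None):
        self.versions: Dict[str, int] = dict(versions or {})  # device_id -> counter

    def increment(self, device_id: str) -> None:
        # Called when this device commits a new version of the file.
        self.versions[device_id] = self.versions.get(device_id, 0) + 1

    def dominates(self, other: "VersionVector") -> bool:
        # True if this version has seen every update the other has seen.
        return all(self.versions.get(d, 0) >= c for d, c in other.versions.items())

    def conflicts_with(self, other: "VersionVector") -> bool:
        # Concurrent edits: neither side is an ancestor of the other.
        return not self.dominates(other) and not other.dominates(self)

# Same file edited on two devices while offline:
laptop, phone = VersionVector({"laptop": 1}), VersionVector({"phone": 1})
assert laptop.conflicts_with(phone)            # neither dominates -> create a conflicted copy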
Conflict Resolution Strategies:

| Strategy | How It Works | Used By |
|---|---|---|
| Last-Write-Wins | Latest timestamp wins | Simple apps, some S3 modes |
| Conflicted Copies | Keep both as separate files | Dropbox, Google Drive |
| Three-Way Merge | Automatically merge if possible | Git, Google Docs |
| User Resolution | Show diff, user picks | IDEs, advanced tools |
def resolve_conflict(file_id: str, local_version: FileVersion,
                     remote_version: FileVersion) -> Resolution:
    # Resolve the conflict by creating a conflicted copy.
    ...
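A sketch of the conflicted-copy resolution the stub refers to: the remote version keeps the original path, and the local version is preserved under a renamed path so no data is lost. The FileVersion fields and the naming pattern are illustrative assumptions.

from dataclasses import dataclass, replace
from datetime import date
from typing import List

@dataclass
class FileVersion:
    path: str
    device_name: str
    chunk_hashes: List[str]

def conflicted_copy_path(path: str, device_name: str, day: date) -> str:
    # e.g. "report.docx" -> "report (conflicted copy from <device> <date>).docx"
    stem, dot, ext = path.rpartition(".")
    base = stem if dot else path
    suffix = f" (conflicted copy from {device_name} {day.isoformat()})"
    return f"{base}{suffix}.{ext}" if dot else f"{base}{suffix}"

def resolve_with_conflicted_copy(local: FileVersion,
                                 remote: FileVersion) -> List[FileVersion]:
    # Keep both versions: remote stays at the original path, local is renamed.
    renamed = replace(local, path=conflicted_copy_path(local.path,
                                                       local.device_name,
                                                       date.today()))
    return [remote, renamed]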
Google Docs approach

Google Docs avoids conflicts entirely by using Operational Transform (OT). Every keystroke is an operation that can be transformed against concurrent operations. This enables real-time collaboration without conflicts, but it only works for structured documents, not arbitrary binary files.
Failure Modes & Resilience
Proactively discuss failures
Let me walk through failure scenarios. For a file storage system, data durability is paramount.
| Failure | Impact | Mitigation | Why It Works |
|---|---|---|---|
| Upload interrupted | Incomplete file | Chunked upload with resume | Client retries from last successful chunk |
| Metadata DB down | Cannot sync | Read replicas + async failover | Reads continue, writes queue locally |
Data Durability:
File storage has extreme durability requirements. Users trust us with their only copy of important files.
Multiple layers of protection:
1. Client-side:

Resumable Uploads:
class ResumableUpload:
    def __init__(self, file_path: str, upload_id: str = None):
        self.file_path = file_path
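Expanding the ResumableUpload stub: the client walks the file in fixed-size chunks, skips chunks the server already has (resume after a failure, or a dedup hit), and uploads the rest. put_chunk and chunk_exists stand in for real chunk-service calls and are assumptions, as is the 4MB chunk size.

import hashlib
from typing import Callable, List, Optional

CHUNK_SIZE = 4 * 1024 * 1024   # illustrative fixed chunk size

class ResumableUpload:
    def __init__(self, file_path: str, upload_id: Optional[str],
                 put_chunk: Callable[[str, bytes], None],
                 chunk_exists: Callable[[str], bool]):
        self.file_path = file_path
        self.upload_id = upload_id
        self.put_chunk = put_chunk        # e.g. PUT to a pre-signed URL for this hash
        self.chunk_exists = chunk_exists  # asks the chunk service whether the hash is stored

    def run(self) -> List[str]:
        # Upload every chunk the server is missing; return the ordered hash list
        # that will be committed to the metadata service as the new file version.
        ordered: List[str] = []
        with open(self.file_path, "rb") as f:
            while True:
                chunk = f.read(CHUNK_SIZE)
                if not chunk:
                    break
                digest = hashlib.sha256(chunk).hexdigest()
                ordered.append(digest)
                if not self.chunk_exists(digest):   # resume point and dedup check in one
                    self.put_chunk(digest, chunk)   # a real client retries with backoff
        return ordered

If the process dies at 99%, rerunning run() re-hashes the file but re-uploads only the chunks the server has not yet confirmed, which is the "resume from last chunk" behavior in the table above.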
Garbage Collection

Chunks are reference-counted: deleting a file version decrements the ref_count of each chunk it referenced. Chunks whose ref_count has stayed at 0 for more than 30 days are garbage collected. This delay handles edge cases like restoring from trash.
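A sketch of that garbage-collection pass, assuming a chunk index that records a ref_count and the time the count first hit zero (the field names and delete_object call are illustrative):

import time

GRACE_PERIOD = 30 * 24 * 3600   # 30 days, matching the policy above

def collect_garbage(chunk_index: dict, delete_object) -> int:
    # chunk_index: hash -> {"ref_count": int, "zero_since": float or None}
    # delete_object(hash) removes the chunk from object storage (hypothetical call).
    now = time.time()
    deleted = 0
    for digest, record in list(chunk_index.items()):
        if record["ref_count"] > 0:
            record["zero_since"] = None               # referenced again (e.g. restored from trash)
            continue
        if record["zero_since"] is None:
            record["zero_since"] = now                # start the 30-day grace period
        elif now - record["zero_since"] >= GRACE_PERIOD:
            delete_object(digest)
            del chunk_index[digest]
            deleted += 1
    return deleted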
Evolution & Scaling
What to say
This design works well up to hundreds of millions of users. Let me discuss how it evolves for global scale and what we would do differently for real-time collaboration.
Evolution Path:
Stage 1: Single Region (0-10M users)
- Single metadata database (sharded by user_id)
- Single object storage region
- Simple polling-based sync

Stage 2: Multi-Region (10-100M users)
- Metadata replicated across regions
- Object storage in multiple regions
- Smart routing (user to nearest region)
- Cross-region sync for shared folders

Stage 3: Global Scale (100M+ users)
- Edge caching for popular files
- Predictive prefetch
- Tiered storage (hot/warm/cold)
- Machine learning for deduplication optimization
Multi-Region Architecture
Real-Time Collaboration (Google Docs extension):
If we wanted to add Google Docs-style real-time collaboration:
| Feature | Current Design | Real-Time Collab |
|---|---|---|
| Sync granularity | File level | Character/operation level |
| Conflict handling | Conflicted copies | Operational Transform |
| Latency | Seconds | Milliseconds |
| Protocol | Request/response | WebSocket streams |
| State | File versions | Document operations |
Current design: File-based sync
- User A and B edit same file
- Both upload different versions

Alternative approaches
If storage cost were the primary concern, I would add tiered storage - move files not accessed in 90 days to cheaper cold storage (Glacier). If sync speed were critical, I would add predictive prefetch - pre-download files likely to be accessed based on user patterns.
What I would monitor:
- Sync latency (p50, p99): How long until a change appears on other devices
- Upload success rate: Percentage of uploads completing without retry
- Deduplication ratio: Percentage of chunks deduplicated (storage savings)
- Conflict rate: Percentage of syncs resulting in conflicts
- Chunk health: Periodic integrity verification of stored chunks