Design Google Drive's file storage system

Google Drive (and Dropbox) is a senior system design question that tests a domain others don't: distributed file synchronisation. The challenges — chunking, delta sync, deduplication, conflict resolution — are unique to this problem class and require design patterns that don't appear in social media or messaging questions.

What the interview is really asking

The Google Drive question tests whether you understand the specific engineering challenges of building a file sync system that works correctly across millions of devices that may be offline, editing simultaneously, or on unreliable networks.

Chunking and delta sync. Uploading an entire 1 GB file every time a user edits a spreadsheet is wasteful and slow. The fundamental insight behind Dropbox and Google Drive is chunking: divide files into fixed-size blocks, track which blocks changed, and upload only the deltas. The interviewer is checking whether you know this optimisation exists and can reason about the chunk-size tradeoff (smaller chunks = better delta efficiency, more metadata; larger chunks = worse delta efficiency, less metadata).

Content-addressable storage and deduplication. Identify chunks by their hash, not by location. If the same chunk exists anywhere in the system, it is stored once. This enables deduplication across users and across files — a common document template or a popular video shared by multiple users has only one physical copy. The interviewer is checking that you understand content-addressable storage and can derive its deduplication benefits.

Metadata vs blob separation. File names, folder hierarchy, permissions, version history — this is structured metadata that must support arbitrary queries. File bytes are unstructured blobs. These have completely different storage and access patterns. The interviewer is checking that you separate them rather than proposing a single database for both.

Conflict resolution. Two users editing the same file simultaneously is inevitable in a collaborative system. The interviewer is checking that you have a specific conflict resolution strategy: versioning, conflict copies, operational transformation for real-time collaboration, or last-write-wins (with its explicit data loss trade-off).

Back-of-envelope estimation

Users and storage. Google Drive has ~1 billion users. Average storage per user: ~15 GB free tier. Total storage: 1B × 15 GB = 15 exabytes. Active file uploads: assume 1% of users upload a file per day. 10 million uploads/day × 2 MB average file size = 20 TB/day of new data.

Upload rate. 10 million uploads/day / 86,400 = ~116 file uploads/second. With chunking at 4 MB/chunk, a 2 MB average file still occupies one chunk — chunks don't span files — so chunk uploads track file uploads: ~116 chunk uploads/second on average. At peak (start of business day), assume 10×: ~1,160 chunk uploads/second. Modest throughput — Google Drive is not a high-RPS system at the upload layer.

Download rate. Files are read far more often than they are uploaded. At a 100:1 read ratio: ~11,600 chunk reads/second. Serve from CDN: a popular shared document serves 99% of its reads from cache. Only freshly modified files miss the CDN.

Metadata operations. Every file operation (open, rename, move, share, view) generates a metadata query. At 1 billion users × 5 operations/day / 86,400 = ~57,900 metadata reads/second. This is the dominant operation type. Metadata must be cached aggressively and sharded across a cluster.

Deduplication ratio. Dropbox has reported very high deduplication ratios on its block store; assume 99% here. At 20 TB/day of uploads, effective new unique storage is ~200 GB/day after dedup. Storage costs are dominated by the initial corpus, not incremental uploads.
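
The estimates above are easy to sanity-check; here are the same assumptions restated as a few lines of Python:

    # Back-of-envelope figures from the assumptions above.
    users = 1_000_000_000
    storage_per_user_gb = 15
    uploads_per_day = users * 0.01            # 1% of users upload daily
    avg_file_mb = 2
    dedup_ratio = 0.99                        # assumed, per the hedge above

    total_storage_eb = users * storage_per_user_gb / 1e9
    new_data_tb_per_day = uploads_per_day * avg_file_mb / 1e6
    chunk_uploads_per_sec = uploads_per_day / 86_400   # one chunk per 2 MB file
    metadata_reads_per_sec = users * 5 / 86_400
    unique_gb_per_day = new_data_tb_per_day * 1000 * (1 - dedup_ratio)

    print(f"total storage:    {total_storage_eb:.0f} EB")       # 15 EB
    print(f"new data:         {new_data_tb_per_day:.0f} TB/day")# 20 TB/day
    print(f"chunk uploads:    {chunk_uploads_per_sec:.0f}/s")   # ~116/s average
    print(f"metadata reads:   {metadata_reads_per_sec:.0f}/s")  # ~57,870/s
    print(f"unique post-dedup:{unique_gb_per_day:.0f} GB/day")  # ~200 GB/day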

Architecture decisions and why

Chunking: 4 MB blocks with SHA-256 hashing. Each file is split into 4 MB chunks. Each chunk is identified by its SHA-256 hash. The file is represented in metadata as an ordered list of chunk hashes (the chunk manifest). To upload a file: compute chunk hashes locally → send the manifest to the server → server responds with which chunk hashes it doesn't have → client uploads only the missing chunks. This is the check-before-upload protocol that makes delta sync possible. 4 MB is the standard chunk size; smaller chunks (64 KB) increase metadata overhead; larger chunks (16 MB) reduce delta efficiency for small edits.
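
A minimal sketch of the client side of this protocol. The server object with missing_chunks, upload_chunk, and commit_manifest endpoints is hypothetical — the names are illustrative, not a real API:

    import hashlib

    CHUNK_SIZE = 4 * 1024 * 1024  # 4 MB, per the text

    def chunk_manifest(path: str) -> list[str]:
        """Split a file into 4 MB chunks; return the ordered SHA-256 hash list."""
        hashes = []
        with open(path, "rb") as f:
            while chunk := f.read(CHUNK_SIZE):
                hashes.append(hashlib.sha256(chunk).hexdigest())
        return hashes

    def upload_file(path: str, server) -> None:
        """Check-before-upload: send the manifest, upload only missing chunks."""
        manifest = chunk_manifest(path)
        missing = set(server.missing_chunks(manifest))      # hypothetical endpoint
        with open(path, "rb") as f:
            for chunk_hash in manifest:
                chunk = f.read(CHUNK_SIZE)
                if chunk_hash in missing:
                    server.upload_chunk(chunk_hash, chunk)  # hypothetical endpoint
        server.commit_manifest(path, manifest)  # creates the new file version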

Content-addressable chunk store. Chunks are stored in object storage keyed by their SHA-256 hash. Two files sharing a chunk reference the same physical object — one object, two references. Cross-user deduplication happens automatically: if a user uploads a file that another user already uploaded (same content, same chunk hashes), no bytes are transferred — the user's metadata is simply updated to reference the existing chunks. This is why dedup ratios can be so high: most files (photos, documents, videos) are shared or duplicated across users.
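
Deduplication falls out of the write path for free. A sketch, with an in-memory dict standing in for object storage:

    import hashlib

    class ChunkStore:
        """Content-addressable store: the key *is* the SHA-256 of the bytes."""
        def __init__(self):
            self.objects: dict[str, bytes] = {}  # stand-in for object storage

        def put(self, chunk: bytes) -> str:
            key = hashlib.sha256(chunk).hexdigest()
            if key not in self.objects:          # identical content stored once
                self.objects[key] = chunk
            return key

        def get(self, key: str) -> bytes:
            return self.objects[key]

    store = ChunkStore()
    a = store.put(b"shared template bytes")
    b = store.put(b"shared template bytes")      # second user, same content
    assert a == b and len(store.objects) == 1    # one physical copy, two refs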

Metadata database: PostgreSQL with MVCC. The metadata schema: files (file_id, owner_id, name, parent_folder_id, current_version_id), versions (version_id, file_id, chunk_manifest, created_at, device_id), chunks (chunk_hash, size, storage_path). PostgreSQL's MVCC (multi-version concurrency control) is critical here: concurrent reads of a file's metadata don't block concurrent writes (a user viewing a file while another is uploading a new version). Version history is append-only — each upload creates a new version row, enabling version restoration without additional infrastructure.
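
A sketch of the three tables, written migration-style as PostgreSQL DDL in a Python string. The column names follow the schema above; the types and the self-referencing folder hierarchy are assumptions:

    # Sketch of the metadata schema above; types are assumptions.
    SCHEMA_DDL = """
    CREATE TABLE files (
        file_id            BIGSERIAL PRIMARY KEY,
        owner_id           BIGINT NOT NULL,
        name               TEXT   NOT NULL,
        parent_folder_id   BIGINT REFERENCES files(file_id),  -- assumes folders are rows too
        current_version_id BIGINT
    );
    CREATE TABLE versions (                                   -- append-only
        version_id     BIGSERIAL PRIMARY KEY,
        file_id        BIGINT NOT NULL REFERENCES files(file_id),
        chunk_manifest TEXT[] NOT NULL,                       -- ordered chunk hashes
        created_at     TIMESTAMPTZ NOT NULL DEFAULT now(),
        device_id      BIGINT NOT NULL
    );
    CREATE TABLE chunks (
        chunk_hash   CHAR(64) PRIMARY KEY,                    -- hex SHA-256
        size         BIGINT NOT NULL,
        storage_path TEXT   NOT NULL
    );
    """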

Sync protocol: event-driven deltas. The sync client on each device maintains local state (file tree + version manifest). Changes are detected via filesystem watchers (inotify on Linux, FSEvents on macOS). On a change: compute chunk diffs → check which chunks are missing on the server → upload missing chunks → update metadata. The server pushes notifications to all connected devices when a file changes (via WebSocket, with long-polling as a fallback) so other devices can pull the delta. The server never pushes file bytes to clients; clients pull on notification.
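
The client loop, sketched with hypothetical watcher, server, and local_state interfaces (chunk_manifest is the helper from the chunking sketch above):

    def sync_loop(local_state, server, watcher):
        """Event-driven sync sketch; watcher wraps inotify/FSEvents, and
        local_state/server are hypothetical interfaces."""
        while True:
            event = watcher.next_event()
            if event.kind == "local_change":
                manifest = chunk_manifest(event.path)   # from the chunking sketch
                for h in server.missing_chunks(manifest):
                    server.upload_chunk(h, local_state.read_chunk(event.path, h))
                server.commit_manifest(event.path, manifest)
            elif event.kind == "remote_change":         # pushed notification
                # The server never pushes bytes; the client pulls the delta.
                remote = server.get_manifest(event.path)
                for h in remote:
                    if not local_state.has_chunk(h):
                        local_state.write_chunk(h, server.download_chunk(h))
                local_state.assemble(event.path, remote)  # rebuild file from chunks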

Conflict detection via base revisions. Each file has a server revision number that increments on every write. When a client uploads a new version, it includes the revision number its edit was based on. The server compares: if base_revision == current_server_revision, the write is clean (nothing else changed in between). If base_revision < current_server_revision, a conflict exists. Clean writes: commit and increment the revision. Conflicts: create a conflict copy alongside the original and notify the user. Last-write-wins is not safe for a storage system — conflict copies preserve both versions at the cost of user resolution work.
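
The server-side commit as a sketch — a single comparison against the file's current revision decides clean write versus conflict (the conflict-copy helper is hypothetical):

    def commit_version(file, base_revision: int, new_manifest) -> str:
        """Compare the client's base revision to the current server revision
        (a single per-file counter, per the text)."""
        if base_revision == file.revision:
            # Clean write: nothing else changed since the client's base.
            file.manifest = new_manifest
            file.revision += 1
            return "committed"
        # Someone else committed first: preserve both versions.
        file.create_conflict_copy(new_manifest)  # hypothetical helper
        return "conflict"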

Sharing and permissions. File sharing adds a permissions layer: each file has an ACL (access control list) mapping user_id or group_id to permission level (viewer, commenter, editor). Permissions checks occur on every metadata read and write. Caching permissions: store the ACL in Redis with a 60-second TTL. Permission changes propagate within one minute. For security-critical operations (deletion, share revocation), bypass the cache and read from the database directly.
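
A sketch of the cached check, assuming a redis-py client and a db.load_acl helper (hypothetical); the 60-second TTL and the cache bypass for security-critical operations follow the text:

    import json

    ACL_TTL_SECONDS = 60
    LEVELS = {"viewer": 1, "commenter": 2, "editor": 3}

    def check_permission(redis, db, file_id, user_id, needed, critical=False):
        """ACL check served from Redis; security-critical ops bypass the cache."""
        acl = None
        if not critical:                          # deletes/revocations skip the cache
            cached = redis.get(f"acl:{file_id}")
            acl = json.loads(cached) if cached else None
        if acl is None:
            acl = db.load_acl(file_id)            # hypothetical: {user_id: level}
            redis.set(f"acl:{file_id}", json.dumps(acl), ex=ACL_TTL_SECONDS)
        return LEVELS.get(acl.get(str(user_id)), 0) >= LEVELS[needed]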

Run it in the simulator

SysSimulator doesn't have a dedicated Google Drive blueprint, but the Distributed Systems architecture lets you model the key components. Configure a system with: API gateway, metadata service (backed by PostgreSQL), chunk upload service (with async queue), object storage, CDN, and a notification service (WebSocket).

Set upload traffic to 500 RPS (chunk uploads) and metadata traffic to 5,000 RPS (file listings, version history, permission checks). Observe: metadata service is the hot path; chunk upload service handles sustained but lower traffic.

Inject a metadata database partition. File uploads fail at the metadata commit step — chunks are uploaded to object storage but the manifest can't be saved. This is the dangerous partial-write scenario: chunks exist in storage but no file record points to them. They become orphaned objects. The mitigation: a garbage collection job runs daily to find chunks with no referencing metadata and deletes them, reclaiming storage.
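
A sketch of that GC job. The interfaces are hypothetical, and the minimum-age guard (skip very recent chunks whose manifest commit may still be in flight) is an added safeguard, not something stated above:

    def garbage_collect(chunk_store, metadata_db, min_age_hours: int = 24):
        """Daily sweep: delete chunks no version manifest references."""
        referenced = metadata_db.all_referenced_hashes()  # union of all manifests
        for chunk_hash in chunk_store.list_hashes():
            if chunk_hash in referenced:
                continue
            if chunk_store.age_hours(chunk_hash) < min_age_hours:
                continue                                  # possible in-flight upload
            chunk_store.delete(chunk_hash)                # reclaim orphaned storage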

Open SysSimulator →

Failure narration — word for word

"I'll inject a metadata service outage — the PostgreSQL cluster goes down while chunk uploads are still working."

"[inject] Chunks are still being accepted by the upload service and written to object storage — that path doesn't touch the metadata database. But when the upload completes and the client tries to commit the file manifest, it gets a 503. The file appears to upload but never becomes accessible to the user or other collaborators."

"The blast radius: new file uploads are stuck in a partially completed state. Existing files are readable from CDN/object storage — the metadata is cached in the application layer. Deletes, renames, and share operations fail because they require metadata writes."

"Recovery: when the metadata service comes back, clients retry the manifest commit (the chunk upload was idempotent — chunks already exist in object storage, only the manifest needs to be committed). Files that were partially uploaded become fully committed during the recovery window. Clients implement exponential backoff on metadata failures so recovery doesn't create a thundering herd on database restart."

The question behind the question

"What's the right chunk size?" Tradeoff: smaller chunks (64KB–1MB) = better delta sync efficiency (fewer bytes re-uploaded on small edits) but more chunk metadata, more IOPS per file, and higher dedup storage overhead per chunk. Larger chunks (8MB–16MB) = worse delta efficiency but simpler metadata and lower IOPS. 4MB is the empirical sweet spot, used by Dropbox. The interviewer is checking that you can reason through the tradeoff.

"How do you handle a 10GB file upload on a slow connection?" Resumable uploads: the client tracks which chunks have been successfully committed to the server. If the connection drops mid-upload, the client resumes from the last committed chunk, not from the beginning. The server maintains a "committed chunks" set per upload session. The client queries this set on reconnect and only re-uploads uncommitted chunks.

"How do you support Google Docs real-time collaboration?" This is a different system from file sync. Google Docs uses Operational Transformation (OT) or Conflict-free Replicated Data Types (CRDTs) to merge concurrent text edits in real time. OT maintains a transform function that converts concurrent edits into a canonical order. Changes are broadcast via WebSocket to all connected editors. The underlying file sync system is bypassed for real-time docs — the document is only persisted to Drive on save or periodically.

Frequently asked questions

Why does Google Drive chunk files?
Delta sync (re-upload only changed chunks, not the entire file), deduplication (identical chunks across users share one physical copy), and parallel upload (multiple chunks upload concurrently). The 4 MB chunk size is the empirical optimum.

How does Google Drive handle sync conflicts?
Server compares the client's base revision to the current server revision. Clean merge (only one side changed): automatic. Conflict (both sides changed): server creates a conflict copy, preserving both versions. Users resolve manually. Real-time Google Docs collaboration uses OT/CRDTs instead.

How does Google Drive achieve deduplication?
Content-addressable storage: chunks identified by SHA-256 hash. Before uploading, client sends hashes to server. Server responds with which hashes it already has. Client uploads only missing chunks. Identical content across users shares one physical object.

What is the difference between metadata and blob storage?
Metadata (names, hierarchy, permissions, version history) is small and structured — PostgreSQL. Blobs (file bytes, chunks) are large and unstructured — object storage. Different access patterns require different storage systems.

How does Google Drive handle offline editing?
Client stores local file copies, tracks changes via filesystem watchers. On reconnect, computes which chunks changed, performs check-before-upload protocol, and resolves any server-side conflicts. Only delta chunks transfer — no full re-uploads.

Explore in SysSimulator →   Browse all blueprints

Next: Design typeahead search →