Sharding comes up in every interview involving data at scale — but it's often introduced too early. The correct framing is: sharding is a last resort after you've exhausted vertical scaling, read replicas, caching, and partitioning. When you do shard, the choice of shard key is the most consequential decision in your entire data architecture. Get it wrong, and you've created a distributed database that's harder to operate than the monolith it replaced.
Before recommending sharding in an interview, demonstrate that you know the alternatives. A single PostgreSQL instance on modern hardware handles roughly 10,000–50,000 writes/second and 100,000+ reads/second. Read replicas scale reads horizontally. A caching layer (Redis) can absorb 90%+ of read traffic. Vertical scaling (a larger instance) can delay the sharding decision by years.
Shard when: write throughput exceeds what a single primary can handle (>50K writes/second sustained), total data volume makes a single instance impractical (>10TB with active write patterns), or geographical data isolation is required (regulatory requirement to store certain users' data in specific regions — a form of manual sharding).
In an interview, the signal that sharding is appropriate is a combination of: very high write QPS, very large data volume, and a clear access pattern that maps to a natural shard key. If those conditions aren't clearly present, state "I'd use read replicas and caching before considering sharding" — this shows maturity.
Hash sharding. Compute hash(shard_key) % num_shards to determine the shard. User with ID 12345 goes to shard hash(12345) % 10 = 5. Distribution is uniform for high-cardinality keys — no hotspots. The primary disadvantage: range queries on the shard key require a scatter-gather across all shards. SELECT * FROM users WHERE user_id BETWEEN 10000 AND 20000 must query all 10 shards and merge results. For most applications, range queries on the primary key are rare (you query by primary key exactly, or by secondary indexes) — hash sharding is safe.
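A minimal routing sketch, assuming a SHA-256 hash and 10 shards (both illustrative); it also shows why a range predicate on the shard key degenerates into a fan-out:

```python
import hashlib

NUM_SHARDS = 10

def shard_for(key: int) -> int:
    """Hash the shard key, then take the modulo to pick a shard."""
    digest = hashlib.sha256(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

# Point lookup by shard key: exactly one shard is queried.
print(shard_for(12345))

# Range query on the shard key: hashing destroys ordering, so rows with
# user_id BETWEEN 10000 AND 20000 are spread across the cluster and the
# router must fan out to every shard and merge the results.
shards_hit = {shard_for(uid) for uid in range(10_000, 20_001)}
print(sorted(shards_hit))  # expect all 10 shards
```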
Range sharding. Assign key ranges to shards: shard 1 holds user_id 1–10M, shard 2 holds 10M–20M. Range scans within a shard are efficient. The primary disadvantage: monotonically increasing keys create write hotspots. If new users always get the highest user_id, all writes land on the last shard. Mitigation: use non-monotonic keys (UUID v4) or salt the key so new writes spread across ranges. Range sharding is used by HBase and Google Bigtable, where sorted range scans are the primary access pattern.
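A sketch of range routing with hypothetical 10M-wide boundaries; bisect finds the owning shard, and a contiguous key range maps to a contiguous (usually small) set of shards:

```python
import bisect

# Exclusive upper bounds of each shard's key range: shard 0 holds [0, 10M),
# shard 1 holds [10M, 20M), and so on (boundaries are illustrative).
BOUNDARIES = [10_000_000, 20_000_000, 30_000_000, 40_000_000]

def shard_for(user_id: int) -> int:
    """Binary-search the sorted boundaries to find the owning shard."""
    return bisect.bisect_right(BOUNDARIES, user_id)

def shards_for_range(lo: int, hi: int) -> list[int]:
    """A key range touches only the shards whose ranges it overlaps."""
    return list(range(shard_for(lo), shard_for(hi) + 1))

print(shard_for(12_345))                        # shard 0
print(shards_for_range(9_500_000, 11_000_000))  # [0, 1]: two shards, not all

# The hotspot: if user_id is auto-incremented, every new user has the highest
# key, so all inserts route to the last shard while earlier shards sit idle.
```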
High cardinality. The shard key must have many distinct values so rows distribute across shards. A boolean, status enum, or day-level date is a poor shard key: only a handful of distinct values, so rows pile onto a few shards and create massive hotspots. user_id (millions of users), order_id, or device_id are good candidates.
Avoid monotonically increasing keys. Auto-increment IDs and timestamps are classic hotspot creators under range sharding: every new row carries the highest key, so all writes pile onto the last shard. UUID v4 distributes uniformly. Alternatively, keep the auto-increment ID but hash it before routing: shard = hash(id) % N spreads monotonic IDs uniformly.
Align with access locality. For a social application, sharding by user_id means all of a user's posts, follows, and activity land on the same shard. A query "get all posts for user 12345" hits one shard. A query "get all comments by user 12345" also hits one shard. This co-location eliminates cross-shard joins for the most common access patterns. The alternative — sharding posts by post_id and users by user_id — means every user's feed query is a cross-shard scatter-gather.
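A small illustration of the co-location argument, routing posts by their owner's user_id rather than by post_id (IDs and shard count are made up):

```python
import hashlib

NUM_SHARDS = 16

def route(key: int) -> int:
    """Shared hash router used by every table."""
    digest = hashlib.sha256(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

user_id = 12345
post_ids = [900001, 900002, 900003]  # three posts by this user

# Co-located: posts are routed by the owning user_id, so "get all posts for
# user 12345" touches exactly one shard.
co_located = {route(user_id) for _ in post_ids}

# Not co-located: posts routed by post_id scatter, so the same query becomes
# a cross-shard scatter-gather.
scattered = {route(pid) for pid in post_ids}

print(co_located)  # one shard
print(scattered)   # typically several shards
```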
Immutability. Once a row is assigned to a shard, the shard key should not change. If user_id changes (hypothetically), the row must be deleted from the old shard and reinserted on the new shard — a non-atomic distributed operation. Design entities so the natural shard key is stable for the lifetime of the row.
The hardest operational problem with sharding is adding shards. With modulo hash sharding (hash % N), increasing N from 4 to 5 means the shard assignment of most rows changes — approximately 80% of rows must migrate to different shards. This requires a background migration that reads every row and reinserts it on the new shard, with application code serving dual-mode (reading from both old and new locations) during the migration window. Extremely disruptive.
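The ~80% figure is easy to verify with a quick simulation over uniformly hashed keys (the key range and hash function here are arbitrary):

```python
import hashlib

def shard(key: int, n: int) -> int:
    digest = hashlib.sha256(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % n

# A key keeps its shard only when hash % 4 == hash % 5, i.e. hash % 20 < 4,
# which happens for roughly 20% of keys. The other ~80% must migrate.
keys = range(100_000)
moved = sum(1 for k in keys if shard(k, 4) != shard(k, 5))
print(f"{moved / len(keys):.0%} of keys change shards")  # ~80%
```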
Consistent hashing solves this. Each shard owns a range of a 0–2^32 hash ring. Adding a new shard means it takes over a contiguous range from an existing shard — only ~1/N of data migrates (the portion that falls in the new shard's range). This is the property that makes Cassandra, DynamoDB, and Redis Cluster operationally manageable at large shard counts. See the consistent hashing guide for the full mechanics.
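A minimal hash-ring sketch with virtual nodes, only to make the ~1/N property concrete; production implementations (and the consistent hashing guide) add replication and better balancing:

```python
import bisect
import hashlib

def position(value: str) -> int:
    """Point on a 0 .. 2^32 - 1 ring."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:4], "big")

class Ring:
    def __init__(self, shards, vnodes=100):
        # Each shard contributes many virtual points so load evens out.
        self.points = sorted((position(f"{s}#{i}"), s)
                             for s in shards for i in range(vnodes))
        self.positions = [p for p, _ in self.points]

    def owner(self, key: str) -> str:
        """The first virtual point clockwise from the key's hash owns it."""
        i = bisect.bisect_right(self.positions, position(key)) % len(self.points)
        return self.points[i][1]

before = Ring([f"shard-{i}" for i in range(4)])
after = Ring([f"shard-{i}" for i in range(5)])

keys = [f"user:{i}" for i in range(50_000)]
moved = sum(1 for k in keys if before.owner(k) != after.owner(k))
print(f"{moved / len(keys):.0%} of keys move")  # ~20%, versus ~80% with modulo
```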
The practical alternative for systems already using modulo sharding: logical shard routing. Use 1024 logical shards mapped to a smaller number of physical shards. Adding a physical shard means reassigning some logical shards — only the rows in those logical shards need to migrate. The application routes by logical shard; a configuration table maps logical shards to physical servers. Adding capacity is incremental and controlled.
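A sketch of that routing layer, assuming 1024 logical shards and a mapping table held in application config (server names are hypothetical):

```python
import hashlib

NUM_LOGICAL = 1024

def logical_shard(key: int) -> int:
    digest = hashlib.sha256(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_LOGICAL

# Logical shard -> physical server, normally stored in a config table so it
# can change without redeploying. Start with 4 physical servers.
mapping = {ls: f"db-{ls % 4}" for ls in range(NUM_LOGICAL)}

def physical_server(key: int) -> str:
    return mapping[logical_shard(key)]

print(physical_server(12345))

# Adding capacity: reassign roughly a fifth of the logical shards to a new
# server. Only rows in those ~205 logical shards migrate; the hash function
# and NUM_LOGICAL never change, only the mapping table does.
for ls in range(0, NUM_LOGICAL, 5):
    mapping[ls] = "db-4"
```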
Sharding optimizes single-shard operations. Cross-shard operations are expensive:
Scatter-gather queries. A query not filtered by the shard key must go to all shards and merge results. "Find all users who signed up in the last 24 hours" — if sharded by user_id, this query hits all shards. At 100 shards, you make 100 parallel queries and merge 100 result sets. This is slower than a single-instance query and creates load on all shards simultaneously.
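A sketch of the fan-out, with query_shard standing in for whatever per-shard database client the system actually uses:

```python
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 100

def query_shard(shard_id: int) -> list[dict]:
    """Stand-in for running the SQL against one shard's connection pool,
    e.g. SELECT * FROM users WHERE created_at > now() - interval '24 hours'."""
    return []  # placeholder result rows

def recent_signups() -> list[dict]:
    # The filter (created_at) is not the shard key, so every shard is queried.
    with ThreadPoolExecutor(max_workers=32) as pool:
        per_shard = pool.map(query_shard, range(NUM_SHARDS))
    # Merge; latency is governed by the slowest shard, and all 100 shards
    # take load at once.
    return [row for rows in per_shard for row in rows]

print(len(recent_signups()))
```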
Cross-shard joins. A join between two tables that are sharded differently (or not sharded) requires one of: (a) moving all data from one table to the shard of the other at query time (expensive), (b) denormalization (embed the data you'd join in the same record, accepting duplication), or (c) a separate analytics database (data warehouse, Redshift, BigQuery) for complex multi-table queries. In interview designs: state explicitly that you'd denormalize for co-located access and use a separate analytics pipeline for aggregations.
Distributed transactions. A transaction that spans two shards requires a distributed commit protocol (two-phase commit or saga pattern). Two-phase commit is slow and locks rows across shards during the prepare phase. Sagas are more complex but avoid distributed locks. For sharded systems, design your data model to avoid cross-shard transactions wherever possible — co-locate entities that are transactionally coupled on the same shard.
What is database sharding?
Splitting a table across multiple database instances (shards), each holding a subset of rows determined by a shard key. Used when write throughput or data volume exceeds single-instance limits. A last resort after vertical scaling, read replicas, and caching.
What is the difference between hash sharding and range sharding?
Hash: uniform distribution, no range scans on shard key. Range: efficient range scans, risk of write hotspots on monotonic keys. Hash sharding is the safer default for most OLTP workloads.
How do you choose a shard key?
High cardinality, non-monotonic, aligned with access locality, immutable per row. Sharding by user_id co-locates all user data on one shard, eliminating cross-shard joins for user-centric queries.
How do you handle resharding when adding new shards?
Consistent hashing: new shard takes only 1/N of existing data, not a full reshuffle. Or: logical shard routing with 1024 logical shards mapped to physical — adding a server reassigns some logical shards, migrating only their data.