A system design interview is 45–60 minutes of open-ended technical conversation. Without a framework, it's easy to spend the entire time on one component and never reach the depth the interviewer wanted to probe. This framework is used by candidates who pass system design rounds at Google, Meta, Amazon, and similar companies — not because it's magic, but because it creates structure that keeps you on track and signals to the interviewer that you know how to approach ambiguous engineering problems.
Every system design problem is underspecified. The interviewer does this intentionally — they want to see how you handle ambiguity. Asking the right clarifying questions is itself a signal.
Ask about scale. "What's the expected DAU? What's the read/write ratio?" Scale is the single most important input to your architecture. A system for 10,000 users and a system for 100 million users have completely different designs. Don't guess — ask.
Narrow the functional scope. "Should the design include authentication? Mobile clients? Real-time updates?" You have 45 minutes — you cannot design a complete production system from scratch. Agree on a scope that's narrow enough to design well within the time constraint. If the interviewer says "let's focus on the core features," take that as a signal and don't pad.
Clarify consistency and availability requirements. "Is eventual consistency acceptable, or does this require strong consistency?" "What's the acceptable latency for the critical path?" These answers directly drive technology choices and will justify your later trade-offs.
What not to ask. Don't ask "what technology should I use?" — that's your decision. Don't ask for permission to draw the design — just start. Don't spend more than 5 minutes on clarification. Interviewers want to see design, not just questioning.
Derive the numbers that constrain your design. You should do 2–3 calculations, not 10. The purpose is to identify: which components will be under the most load, what storage scale you're dealing with, and whether any off-the-shelf component is near its capacity limit.
Work through: DAU → QPS (daily active users × actions per user per day ÷ 86,400 seconds). Storage: entities per day × bytes per entity × retention period. These two numbers tell you whether you need sharding, a CDN, or a caching layer, or whether a single-server approach is adequate.
State your assumptions out loud: "I'm assuming 10% of users are active at any given time" or "I'm rounding up to 100,000 QPS for headroom." Interviewers want to see the reasoning process, not just the final number. Approximations are expected — you're estimating, not calculating. See the back-of-envelope estimation guide for the key numbers to memorize.
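The arithmetic above can be sketched in a few lines. Every input here is a hypothetical assumption chosen for illustration — in an interview, these would come from your clarifying questions:

```python
# Back-of-envelope estimation sketch. All inputs are illustrative assumptions,
# not numbers from any particular prompt.
dau = 10_000_000           # daily active users
writes_per_user = 10       # write actions per user per day
read_write_ratio = 100     # reads per write

avg_write_qps = dau * writes_per_user / 86_400   # 86,400 seconds in a day
peak_write_qps = avg_write_qps * 2               # assume peak ≈ 2× average
avg_read_qps = avg_write_qps * read_write_ratio

bytes_per_record = 1_000           # ~1 KB per stored entity
retention_days = 5 * 365           # keep data for 5 years
storage_bytes = dau * writes_per_user * bytes_per_record * retention_days

print(f"avg write QPS ≈ {avg_write_qps:,.0f}")   # ≈ 1,157
print(f"avg read QPS  ≈ {avg_read_qps:,.0f}")    # ≈ 115,741
print(f"storage ≈ {storage_bytes / 1e12:.1f} TB")  # ≈ 182.5 TB
```

Note how the reads dominate QPS (suggesting a cache) while the storage total tells you whether a single database can hold the data or sharding is needed.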
Draw the major components and the data flow between them. This is the "architecture diagram" phase. You're designing the happy path — what happens when everything works correctly.
Start with the client and end with storage. Client → Load Balancer → API Server → [business logic components] → Database/Cache. Trace a single user request through the system, explaining each component's role and the data flow. This forces you to think about the full end-to-end path, not just individual components in isolation.
Justify key choices as you draw. "I'll use PostgreSQL here because we need ACID transactions for payment records." "I'll put Redis in front of the database because 90% of these reads are for the same small set of hot data." Don't just draw boxes — explain the reasoning. The interviewer is evaluating your judgment, not your diagram.
Cover the data model early. What are the key entities? How are they related? What are the primary access patterns? The data model often drives the architecture (a time-series access pattern suggests Cassandra; a relational access pattern suggests PostgreSQL). Naming your main tables and their primary keys signals experience.
Don't over-engineer at this stage. The high-level design is the foundation — resist adding complexity (sharding, multi-region, circuit breakers) before establishing the baseline. Add complexity in the deep dive when it's justified.
The interviewer will probe 1–2 components. Often they'll ask: "How does [component X] work in detail?" or "What happens when [component X] fails?" This is where technical depth separates strong candidates from average ones.
Go deep on the component the interviewer asks about. If they ask about the database layer: discuss indexing strategy, sharding approach, read replica lag handling. If they ask about the cache layer: discuss cache invalidation strategy, cache stampede prevention, eviction policy. Match the depth to what the interviewer is probing — don't give a shallow overview when they're asking for depth.
If the interviewer doesn't direct the deep dive. Choose the most technically interesting or challenging component of your design — the part that has the most constraints or where your earlier decisions have the most consequences. Walk through it in detail: the data structures involved, the failure modes, the trade-offs of your approach vs. alternatives.
Bring up trade-offs proactively. "I chose Redis for caching here, which means I need to handle cache invalidation — here's how I'd approach that..." Proactively addressing the downsides of your design choices signals confidence and technical maturity. Interviewers know every design has trade-offs — the question is whether you know them too.
Average candidates design the happy path. Strong candidates design for failure. Use the last 5 minutes to walk through how your system handles failures.
What happens when the database goes down? If it's a primary-replica setup: auto-failover to a replica, with ~30–60 seconds of downtime during promotion. Reads continue from replicas during primary failure. Writes queue or fail until the new primary is promoted.
What happens when a cache node fails? Cache misses increase — traffic that was hitting the cache now hits the database. If the database can't absorb this traffic spike, you have a thundering herd. Mitigation: probabilistic early expiration, staggered TTLs, circuit breaker on the database.
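One of those mitigations, probabilistic early expiration, fits in a dozen lines. This is a sketch of the "XFetch" idea — recompute a value *before* its TTL lapses, with probability rising as expiry approaches — using a plain dict as a stand-in for Redis; the `beta` parameter and cache layout are illustrative assumptions:

```python
import math
import random
import time

def cached_fetch(cache: dict, key, ttl: float, recompute, beta: float = 1.0):
    """Probabilistic early expiration: under load, one request refreshes the
    entry early instead of a thundering herd hitting the DB at expiry.
    `cache` is a plain dict standing in for Redis; `beta` > 1 refreshes
    more eagerly."""
    now = time.time()
    entry = cache.get(key)
    if entry is not None:
        value, delta, expiry = entry
        # log(random()) is negative, so the left side adds noise that scales
        # with the recompute cost (delta); near expiry the check fails more
        # often, triggering an early refresh.
        if now - delta * beta * math.log(random.random()) < expiry:
            return value
    start = time.time()
    value = recompute()                   # e.g. a database query
    delta = time.time() - start           # how long the recompute took
    cache[key] = (value, delta, start + ttl)
    return value
```

Expensive-to-recompute values (large `delta`) refresh earlier, which is exactly the behavior you want: the slowest queries are the ones you least want stampeding the database.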
How do you monitor this system? Key metrics: request latency (p50, p99), error rate, cache hit rate, queue depth (for async pipelines), database connection pool utilization. Alerts on: error rate >1%, p99 latency >500ms, queue depth growing without draining.
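Those thresholds translate directly into alert rules. A minimal sketch, with metric names and limits taken from the text above (not tied to any real monitoring system's API):

```python
# Alert thresholds from the text above; names are illustrative.
THRESHOLDS = {
    "error_rate": 0.01,       # alert when error rate exceeds 1%
    "p99_latency_ms": 500.0,  # alert when p99 latency exceeds 500 ms
}

def firing_alerts(metrics: dict) -> list[str]:
    """Return the names of metrics currently breaching their threshold."""
    return [name for name, limit in THRESHOLDS.items()
            if metrics.get(name, 0.0) > limit]
```

For example, `firing_alerts({"error_rate": 0.02, "p99_latency_ms": 120})` returns `["error_rate"]`. Trend-based alerts like "queue depth growing without draining" need a window of samples rather than a point threshold, which is why they're usually expressed as a rate-of-change rule in the monitoring system itself.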
Ending with a thoughtful failure discussion — even if brief — creates a strong closing impression. It signals that you think about production operations, not just initial design.
Spending too long on requirements. 15 minutes of clarification and no design is a failed interview. Set a mental timer — 5 minutes max on clarification, then start drawing.
Jumping to solutions before understanding scale. "I'll use Kafka" — why? What's the QPS? Does this use case require Kafka's throughput, or is a simpler queue sufficient? Always derive scale first, then justify your technology choices against the scale you've established.
Designing in silence. Talk continuously. The interviewer cannot evaluate reasoning they can't hear. Say "I'm choosing X over Y because Z" every time you make a decision. Interviewers who can't follow your reasoning will assume you don't have any.
Not asking for feedback. "Does this cover what you're looking for, or should I go deeper on any component?" Checking in with the interviewer midway through shows awareness that this is a collaborative conversation, not a monologue. It also lets you redirect if you've missed the interviewer's focus area.
How long should I spend on each phase of a system design interview?
Requirements: 5 min. Estimation: 5 min. High-level design: 15 min. Deep dive: 15 min. Failure/wrap-up: 5 min. The most common error is spending 20+ minutes on requirements and never reaching the deep dive.
What questions should I ask at the start of a system design interview?
Ask about scale (DAU, QPS, read/write ratio), functional scope (what features to include/exclude), and consistency requirements. 3–5 questions max. Don't ask what technology to use — that's your decision.
How do you handle not knowing something in a system design interview?
State what you do know, reason from first principles, and name the gap explicitly. Pivot to a practical alternative you do know. Interviewers test reasoning process, not encyclopedic knowledge.
What makes a system design answer stand out?
Specific numbers derived from requirements. Trade-off awareness for every decision. Failure scenario coverage beyond the happy path. Prioritization that matches the system's actual requirements (consistency vs. availability, latency vs. throughput).