API design questions appear in system design interviews as either the primary focus ("design the API for this service") or a component decision ("what API protocol would you use between these services?"). The wrong answer is picking a protocol based on hype. The right answer maps the protocol's trade-offs to the specific requirements: client type, performance, schema flexibility, streaming needs, and caching requirements.
REST (Representational State Transfer) is HTTP with conventions: resources identified by URLs, operations expressed via HTTP methods (GET, POST, PUT, PATCH, DELETE), stateless requests, and response caching via HTTP cache headers. REST has no formal specification — it's a style, not a protocol — which means "REST API" can mean many things. In practice: JSON over HTTPS with meaningful URLs and HTTP status codes.
When REST is right. Public-facing APIs (Stripe, GitHub, Twilio are all REST). Browser-accessible APIs (no build step, works from any HTTP client). APIs with CRUD semantics that map naturally to resource URLs. APIs where HTTP caching (ETags, Cache-Control) can reduce server load. APIs consumed by third parties who need readable, self-documenting endpoints.
REST's weaknesses. Over-fetching: a REST endpoint returns a fixed response shape — clients that need only a subset of fields still receive and parse the full payload. Under-fetching: getting related data requires multiple round trips (get user, then get user's orders, then get each order's items). This is the primary motivation for GraphQL. REST versioning is explicit (URL versioning) but forces the server to maintain parallel code paths. REST is text-based (JSON) — not the most efficient encoding for high-frequency service calls.
GraphQL is a query language and runtime where the client specifies exactly what fields it wants in the response. A single /graphql endpoint handles all queries and mutations. The schema is strongly typed — every field, argument, and return type is defined in the schema. Introspection allows clients to discover available queries and types at runtime.
When GraphQL is right. Multiple client types with different data needs (mobile needs a compact response, desktop needs the full response — one GraphQL query per client, one server endpoint). Rapidly evolving frontends where adding new fields to a query is immediate, no server change needed. Complex relational data where avoiding N+1 queries via DataLoader gives performance benefits. Developer experience: GraphiQL/GraphQL Playground gives interactive query exploration.
GraphQL's weaknesses. The N+1 query problem (resolved with DataLoader, but requires implementation effort). Caching is harder — GET requests with query in URL are cacheable; POST with query in body (common practice) is not cacheable by CDNs. Rate limiting is harder — one query might resolve 1 field or 10,000 nested fields. Query complexity limits and depth limits must be implemented to prevent abuse. File uploads require multipart workarounds. GraphQL's flexibility can be a security surface — malformed queries, deeply nested queries, and introspection-based discovery of your schema.
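The depth limit mentioned above can be sketched with a recursive walk over the selection tree. This is a hedged illustration: the nested dict stands in for a parsed GraphQL AST, and the limit of 5 is an arbitrary choice, not a recommended value.

```python
# Query depth limiting, one of the abuse protections a GraphQL server needs.
# A selection set is modeled as a nested dict; an empty dict is a leaf field.
def depth(selections: dict) -> int:
    if not selections:
        return 0
    return 1 + max(depth(sub) for sub in selections.values())

def check_depth(selections: dict, limit: int = 5) -> None:
    d = depth(selections)
    if d > limit:
        raise ValueError(f"query depth {d} exceeds limit {limit}")

# user { posts { comments { author { name } } } } -> depth 5
query = {"user": {"posts": {"comments": {"author": {"name": {}}}}}}
```

A production server would run this check (plus a complexity-cost estimate) before executing any resolvers, so abusive queries are rejected cheaply.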
The DataLoader requirement. Every production GraphQL server must implement DataLoader (or an equivalent batching layer) for any resolver that fetches from a database or external service. Without batching, a query that returns 100 posts with their authors triggers 101 database queries (1 for posts + 100 for authors). With DataLoader: 2 queries (1 for posts, 1 batched author lookup). This is non-optional for acceptable performance at scale.
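The 101-versus-2 arithmetic can be demonstrated directly. This is a sketch of the batching idea only — DataLoader itself is a JavaScript library that defers loads within an event-loop tick; here an in-memory dict stands in for the database and a counter stands in for query count.

```python
# N+1 problem vs. batched lookup: 100 posts, each with an author.
AUTHORS = {1: "Ada", 2: "Grace"}
POSTS = [{"id": i, "author_id": 1 + i % 2} for i in range(100)]

def resolve_naive():
    queries = 1                                  # SELECT * FROM posts
    result = []
    for post in POSTS:
        queries += 1                             # SELECT ... WHERE id = ?  (x100)
        result.append({**post, "author": AUTHORS[post["author_id"]]})
    return result, queries                       # 101 queries

def resolve_batched():
    queries = 1                                  # SELECT * FROM posts
    ids = {post["author_id"] for post in POSTS}
    queries += 1                                 # SELECT ... WHERE id IN (...)
    authors = {i: AUTHORS[i] for i in ids}       # one batched lookup
    result = [{**post, "author": authors[post["author_id"]]} for post in POSTS]
    return result, queries                       # 2 queries
```

Both paths return identical data; only the query count differs, which is exactly why the problem hides until load testing.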
gRPC is a high-performance RPC framework using Protocol Buffers (protobuf) for serialization and HTTP/2 as the transport. Protobuf is a binary encoding: ~3–5× smaller payload than JSON and ~5–10× faster to encode/decode. HTTP/2 multiplexes multiple requests over a single TCP connection and enables server streaming, client streaming, and bidirectional streaming natively.
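The size gap between text and binary encodings can be illustrated with the standard library. To be clear, this is not the protobuf wire format — `struct` packs a fixed layout with no field tags — but it shows why a binary encoding that omits repeated field names is several times smaller than JSON for the same record.

```python
# Same record as JSON text vs. a fixed binary layout (illustrative only).
import json
import struct

record = {"id": 12345, "quantity": 7, "price_cents": 1999}
json_bytes = json.dumps(record).encode()   # field names repeated in every message
bin_bytes = struct.pack("<IHI",            # uint32 + uint16 + uint32 = 10 bytes
                        record["id"], record["quantity"], record["price_cents"])

# The binary form is 10 bytes; the JSON form is roughly 5x larger.
```

Real protobuf adds small per-field tags and varint encoding, so the exact ratio differs, but the 3–5× payload reduction cited above comes from this same principle.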
When gRPC is right. Internal microservice communication where latency and throughput matter (100K+ calls/second between services). Streaming: real-time feeds, bidirectional chat, server push events. Polyglot systems: generate type-safe client and server code from a .proto schema for Go, Java, Python, Node.js, Rust simultaneously. Strong API contracts between teams — protobuf schema is the interface contract, violations cause compile-time errors in generated code.
gRPC's weaknesses. Browser support: gRPC uses HTTP/2 trailers which browsers don't expose — requires a grpc-web proxy layer (Envoy or grpc-web library). Not human-readable: binary protobuf is not debuggable with curl/Postman without special tooling. Tooling investment: every team needs protobuf setup and protoc code generation. For simple internal CRUD services, REST is less overhead. gRPC is the right choice when the performance benefit is measurable and the team is willing to invest in protobuf tooling.
URL versioning (/v1/users, /v2/users) is the industry standard for public APIs. Explicit, cacheable, easy to route. Stripe introduced a /v2/ namespace in 2024 while continuing to support /v1/ — deprecating a public API version means maintaining both for years. Commit to at least 12 months of parallel support when deprecating a version.
Header versioning (Accept: application/vnd.company.api+json; version=2) keeps URLs clean but is invisible in browser URLs and requires Vary: Accept for correct CDN caching. Used by some enterprise APIs: GitHub historically selected versions via Accept: application/vnd.github.v3+json and now uses a dedicated X-GitHub-Api-Version header.
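Extracting the version from such a header is a small parsing step on every request. A minimal sketch, assuming the `version=N` media-type parameter shown above (the function name and default are illustrative):

```python
# Pull the version parameter out of a vendor media type in the Accept header.
import re

def api_version(accept_header: str, default: int = 1) -> int:
    match = re.search(r"version=(\d+)", accept_header)
    return int(match.group(1)) if match else default
```

Because the version lives in a header rather than the URL, this function's output must also feed into the cache key — hence the Vary: Accept requirement.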
Additive-only (no versioning): only add fields, never remove or rename. Old clients ignore unknown fields. Only viable for internal APIs with controlled consumers and when you can coordinate deployments across all consumers. Not appropriate for public APIs.
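The contract that makes additive-only evolution safe is that old clients copy only the fields they know. A sketch, where `plan` stands for a hypothetical field a newer server version added:

```python
# An old client's parser: unknown fields from a newer server are ignored.
KNOWN_FIELDS = {"id", "name"}

def parse_user(payload: dict) -> dict:
    return {k: v for k, v in payload.items() if k in KNOWN_FIELDS}
```

This only works if every consumer follows the ignore-unknown-fields rule and no field is ever removed or renamed — which is why the strategy is restricted to internal APIs with controlled consumers.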
Offset pagination (?page=3&limit=20): simple, allows jumping to arbitrary pages. Hidden problem: page 100 at 20 results per page becomes OFFSET 1980 LIMIT 20, which requires the database to scan and discard 1,980 rows before returning 20 — O(N) cost in page depth. Also: if rows are inserted or deleted between page fetches, items are skipped or duplicated. Not suitable for high-traffic APIs or large datasets.
Cursor-based pagination (?after=cursor_token&limit=20): the cursor encodes the position (e.g., base64 of the last seen primary key). Query: WHERE id > decoded_cursor LIMIT 20 — a B-tree seek plus a short range scan, with cost independent of page depth and stable regardless of concurrent inserts/deletes. No skipped items. Trade-off: cannot jump to page N arbitrarily — only previous/next navigation. This is acceptable for most feed-style APIs (Twitter timeline, GitHub issue list, Stripe event history). Use cursor-based pagination for any API that will be called at scale.
When should you use gRPC instead of REST?
Internal service-to-service at high throughput, streaming, polyglot services with shared protobuf schema. Not for public APIs (browser support requires proxy) or simple CRUD where JSON readability matters more than performance.
What is the N+1 query problem in GraphQL?
Each resolver fetching one related entity per item causes N+1 database queries. Fix: DataLoader batches all lookups made within one event-loop tick of a request into a single IN query. Non-optional for production GraphQL servers with relational data.
What are the REST API versioning strategies?
URL versioning (/v1/, /v2/) — standard for public APIs. Header versioning — clean URLs, harder to cache. Additive-only — internal APIs only. Commit to 12+ months of parallel version support before deprecation.
What is cursor-based pagination and why is it better than offset pagination?
Cursor encodes last-seen position; next page uses WHERE id > cursor. B-tree seek with cost independent of page depth, vs an O(N) offset scan. No skipped items on concurrent writes. Trade-off: no random page access. Use cursor pagination for all high-traffic feed APIs.