APIs

REST Architectural Constraints

Name	Description
Client-Server (Decoupling)	Separation of concerns between the client and server, allowing them to evolve independently and improve scalability, reliability, and portability.
Stateless	Each request from the client to the server must contain all the information necessary to understand and fulfill the request, meaning no client context is stored on the server between requests.
Cacheability	Responses must define themselves as cacheable or non-cacheable, improving efficiency, scalability, and user-perceived performance through the use of caching.
Uniform Interface	A uniform and standardized way of interacting with resources through well-defined operations (HTTP methods) and resource representations (media types).
Layered System	A hierarchical system where intermediaries (proxies, gateways, etc.) can be used to improve scalability, security, and encapsulation by providing additional layers of abstraction.
Code-On-Demand (Optional)	Servers can temporarily extend the functionality of a client by transferring executable code (e.g., JavaScript) within a response, which simplifies clients by reducing the number of features they must implement up front.

Note

REST = REpresentational State Transfer

REST Methods

Method	CRUD Operation
POST	Create
GET	Read
PUT	Update (full replacement)
PATCH	Partial update
DELETE	Delete

GraphQL

Name	Description
Schema	Defines the structure of the data in the GraphQL API, including types, fields, and relationships.
Query	Defines how clients can fetch data from the GraphQL server. Queries are used to retrieve data from the server.
Mutation	Defines how clients can modify data on the GraphQL server. Mutations are used to create, update, or delete data.
Subscription	Defines how clients can subscribe to real-time data updates from the GraphQL server. Subscriptions allow clients to receive data as it changes.
Resolver	Functions that define how GraphQL fields are resolved. Resolvers are responsible for fetching the data associated with each field.
Scalar	Primitive data types in GraphQL that represent single values, such as integers, strings, booleans, and floats.
Type	Composite data types in GraphQL that represent complex objects with multiple fields. Object types define the structure of the data returned by queries and mutations.
Input	Similar to object types, input types represent complex input data for mutations. Input types define the structure of the data that clients can provide when executing mutations.
Enum	An enumeration type in GraphQL that represents a predefined set of possible values. Enums are used to define a specific domain of valid options for a field.
Union	A type in GraphQL that represents a combination of one or more object types. Unions allow for flexibility in query responses by allowing a field to return different types of objects.
Interface	A type in GraphQL that defines a common set of fields that multiple object types can implement. Interfaces enable polymorphism and ensure consistent field structures across related types.

REST vs GraphQL

Criterion	REST	GraphQL
Shape of response	Fixed per endpoint	Client-specified — only the fields asked for
Over / under-fetching	Common — endpoints return more or less than the UI needs	Minimized — client asks for exactly what it renders
Caching	Free via HTTP (ETag, Cache-Control, CDN)	Requires app-level or persisted-query caching
N+1 on the server	Per-endpoint control — rare	Common pitfall — mitigate with DataLoader / batching
Learning curve	Low — just HTTP verbs	Higher — schema, resolvers, batching
Best for	Public APIs, resource-centric CRUD, CDN-cacheable reads	Complex UIs with varied data shapes; mobile over slow networks

Rule of thumb: REST by default. Reach for GraphQL when multiple clients need differently-shaped views of the same data and the team can own the batching / caching complexity.
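The N+1 row in the table above is easier to see with a counter on the data access. The following is a minimal sketch, assuming a hypothetical in-memory `AUTHORS` table and hypothetical resolver functions; it shows only the core batching idea (collect keys, load once), not the request-scoped deferral and caching that a real DataLoader implementation adds.

```python
# Hypothetical in-memory data; stands in for a database.
AUTHORS = {1: "Ada", 2: "Grace", 3: "Edsger"}
POSTS = [
    {"id": 10, "author_id": 1},
    {"id": 11, "author_id": 2},
    {"id": 12, "author_id": 1},
]

query_log = []  # records every simulated database round trip


def fetch_author(author_id):
    """One round trip per call: the N+1 shape."""
    query_log.append(("single", author_id))
    return AUTHORS[author_id]


def fetch_authors(author_ids):
    """One round trip for many ids: the batched shape."""
    query_log.append(("batch", tuple(sorted(author_ids))))
    return {author_id: AUTHORS[author_id] for author_id in author_ids}


def resolve_naive(posts):
    # Resolves the author field post by post, so N posts cost N lookups.
    return [fetch_author(post["author_id"]) for post in posts]


def resolve_batched(posts):
    # Collect the distinct keys first, then load them in a single call.
    by_id = fetch_authors({post["author_id"] for post in posts})
    return [by_id[post["author_id"]] for post in posts]
```

Both resolvers return the same author names; the difference is the number of entries they append to `query_log`.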

Safety & idempotency by method

Safe = the request has read-only semantics: the client does not ask for any server-side state change. Idempotent = sending the same request twice has the same effect on server state as sending it once. These two properties drive caching, retry logic, and CDN behavior; get them wrong and clients silently double-charge, duplicate orders, or lose updates.

Method	Safe?	Idempotent?	Note
GET	Yes	Yes	Must not mutate state. Caches rely on this
HEAD	Yes	Yes	Metadata only — same semantics as GET minus body
OPTIONS	Yes	Yes	Pre-flight / capability discovery
PUT	No	Yes	Replace at a known URL — PUT-ing twice yields the same state
DELETE	No	Yes	After the first delete, subsequent deletes are no-ops. They may return 404, which is still idempotent: the property concerns server state, not the response code
POST	No	No	Two POSTs = two resources. Use an Idempotency-Key header to fix this for payment-like endpoints
PATCH	No	Not guaranteed	Can be idempotent (patches that set fields to fixed values) or not (counter increments, list appends). State which in your API docs

Idempotency keys for POST

Client generates a UUID → sends it as `Idempotency-Key: <uuid>` header → server stores (key → response) with a TTL → duplicate requests return the cached response.

Invariants

Key TTL must be at least as long as the longest plausible client retry window (hours, not seconds)
Key space must be scoped to (tenant, endpoint) to prevent collisions across customers
Response cache must include the HTTP status code, not just the body
Payload-hash check: if the same key arrives with a DIFFERENT body, return 422 — the client has a bug
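The flow and the four invariants above can be sketched as a small in-memory store. This is an illustration under stated assumptions, not a production design: `IdempotencyStore`, `handle`, and the 24-hour default TTL are hypothetical names and values, a real deployment would use a shared store rather than a dict, and concurrent in-flight requests with the same key are not handled here.

```python
import hashlib
import json
import time
from dataclasses import dataclass


@dataclass
class StoredResponse:
    payload_hash: str
    status: int       # cached status code, not just the body
    body: dict
    expires_at: float


class IdempotencyStore:
    """In-memory sketch of (key -> response) with a TTL."""

    def __init__(self, ttl_seconds=24 * 3600, clock=time.time):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}

    def handle(self, tenant, endpoint, key, payload, handler):
        # Scope the key space to (tenant, endpoint) to avoid cross-customer collisions.
        scoped_key = (tenant, endpoint, key)
        payload_hash = hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode("utf-8")
        ).hexdigest()

        entry = self._entries.get(scoped_key)
        if entry is not None and entry.expires_at <= self.clock():
            del self._entries[scoped_key]  # TTL elapsed: treat as a new request
            entry = None

        if entry is not None:
            if entry.payload_hash != payload_hash:
                # Same key, different body: the client has a bug.
                return 422, {"error": "idempotency key reused with a different payload"}
            return entry.status, entry.body  # duplicate: replay the cached response

        status, body = handler(payload)
        self._entries[scoped_key] = StoredResponse(
            payload_hash, status, body, self.clock() + self.ttl_seconds
        )
        return status, body
```

A handler is any callable that takes the payload and returns `(status, body)`. Calling `handle` twice with the same tenant, endpoint, key, and payload runs the handler once and returns the stored response the second time.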

Why this matters

Every payment, booking, and "send" button on the internet has this requirement. Talking through the TTL choice, the payload-hash check, and the cached status-code is what separates the senior answer from the textbook one.

OAuth 2.0 grant types

Pick the grant that fits the client. OAuth 2.1 consolidates on PKCE-backed flows and retires the legacy ones.

Grant	Use case	Flow	Status
Authorization Code + PKCE	Web apps, mobile apps, SPAs — the modern default	User → authorize → code → exchange for token with PKCE verifier	Recommended
Client Credentials	Machine-to-machine — no user involved	Client authenticates with its own credentials, gets a token	Recommended
Device Code	Input-constrained devices (TVs, CLI tools)	Device shows a code → user enters it on another device → device polls for token	Recommended
Refresh Token	Renew access tokens without re-prompting the user	Token response includes a refresh_token; exchange it for a new access_token	Recommended (with rotation)
Implicit	Historically SPAs; supplanted by Authorization Code + PKCE	Redirects return the token directly in the URL fragment	Discouraged (omitted from OAuth 2.1)
Resource Owner Password	Legacy: first-party clients exchanging username/password for a token	Client posts credentials directly to the token endpoint	Deprecated (omitted from OAuth 2.1)
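The "PKCE verifier" in the Authorization Code row is a random string the client keeps secret, paired with a hash of it that is sent on the authorize request. A minimal sketch of the S256 method from RFC 7636 follows, using only the standard library; the function names are hypothetical.

```python
import base64
import hashlib
import secrets


def make_code_verifier(num_bytes=32):
    # 32 random bytes encode to 43 URL-safe characters,
    # the minimum length RFC 7636 allows (the range is 43 to 128).
    raw = secrets.token_bytes(num_bytes)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_code_challenge(code_verifier):
    # S256: BASE64URL(SHA256(ASCII(code_verifier))), with padding removed.
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The client sends the challenge (with `code_challenge_method=S256`) when it starts the flow and the verifier when it exchanges the code, so an intercepted code is useless without the verifier.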

Rate limiting: where to put it

The algorithm (token bucket, sliding window, etc.) is a separate decision — placement is where interviews probe design sense. In practice you layer: coarse at the edge, finer as you move inward.

Where	Good at	Weakness
Edge / CDN	Cheapest layer to drop abusive traffic; protects origin bandwidth	Coarse — typically per-IP; cannot see app-level identity
API gateway	Can key on API key / tenant; central config; protects all services uniformly	Single point of policy — subtle bugs affect everything
Application	Full request context; per-endpoint or per-user limits	Each service reimplements the same logic; hot-path overhead if not cached
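What changes between the layers in the table is mainly the key each one can see. The sketch below assumes a token bucket purely as a placeholder algorithm (the choice of algorithm is a separate decision) and uses hypothetical names (`KeyedLimiter`, `admit`, the request fields) to show coarse-to-fine layering: per-IP at the edge, per-tenant at the gateway, per-user-and-endpoint in the application.

```python
import time


class TokenBucket:
    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated_at = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated_at) * self.rate)
        self.updated_at = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class KeyedLimiter:
    """One bucket per key; the key is what distinguishes the layers."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate_per_sec = rate_per_sec
        self.capacity = capacity
        self.clock = clock
        self._buckets = {}

    def allow(self, key):
        if key not in self._buckets:
            self._buckets[key] = TokenBucket(self.rate_per_sec, self.capacity, self.clock)
        return self._buckets[key].allow()


def admit(request, edge, gateway, app):
    # Coarse to fine: `and` short-circuits, so a request dropped at an
    # outer layer never consumes tokens from an inner one.
    return (
        edge.allow(request["ip"])
        and gateway.allow(request["tenant"])
        and app.allow((request["user"], request["endpoint"]))
    )
```

In a real system the three limiters run in different processes (CDN, gateway, service); they are combined in one function here only to make the ordering visible.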

See the Patterns page

The rate-limit algorithm trade-offs (token bucket vs leaky bucket vs sliding window) will live on the System Design page — algorithms are a separate concern from placement.

Sessions vs JWT

The question is not "is JWT cool" — it is "do I need stateless verification across services, or do I need instant revocation and tiny payloads?" Those point to different answers.

Aspect	Session cookie	JWT token
Storage	Session id in an HttpOnly cookie; state on server.	Signed claims on the client; server is stateless.
Revocation	Instant — delete server-side session.	Hard — wait for expiry or maintain a blocklist.
Horizontal scale	Needs shared session store (Redis) or sticky sessions.	Any node can verify independently — no shared state.
Cross-origin / mobile	Cookie semantics can be painful across domains; CSRF risk.	Simple: add Authorization: Bearer header.
CSRF	Needs explicit defense (SameSite, CSRF token).	Not applicable if sent via Authorization header (not a cookie).
Payload size	Tiny — just a session id.	Grows with claims; often 500 B–2 KB per request.

Rule of thumb

Default to HttpOnly session cookies for first-party web apps. Reach for JWT when you need stateless verification across many services, mobile clients, or federated auth.

JWT pitfalls

Several of these have CVEs behind them, alg=none and algorithm confusion in particular. Interviewers who have been burned test for them.

Pitfall	Risk	Mitigation
alg=none	Server accepts tokens with `"alg": "none"` — any attacker can forge an admin token.	Allowlist accepted algorithms explicitly. Reject `none`. Pin the alg server-side.
HS256 vs RS256 confusion	Public key treated as HMAC secret; attacker signs HS256 tokens with the public key.	Bind the verification key to the expected algorithm. Never accept a JWT whose alg does not match your config.
No expiry or long expiry	Stolen token is valid for days; no way to revoke it short of rotating the signing key, which invalidates every token.	Short-lived access tokens (5–15 min). Pair with refresh tokens + rotation.
No revocation story	User logs out / is disabled, but the token still validates until expiry.	Short expiry + refresh rotation; or token version in claim checked against DB on sensitive ops.
Claims put on the client	Role claim in the JWT trusted blindly; attacker replays an old, privileged token.	Treat JWT as authn (who), not authz (what). Re-check permissions server-side for sensitive ops.
Big payloads in JWT	Every request carries 4 KB of claims; headers blow up, edge caches refuse them.	Keep JWT to essential identity claims. Store the rest server-side, keyed by user id.
JWT in localStorage	Any XSS = full token theft, because localStorage is readable from JS.	HttpOnly + Secure + SameSite cookie. Or server-side session with short JWT-like claim.

Pagination

Four modes. Most teams pick offset because it is easy, then get surprised at page 10,000 or when concurrent inserts scramble results.

Mode	How it works	Best for	Failure mode
Offset / LIMIT+OFFSET	`?page=5&size=20` → `LIMIT 20 OFFSET 80` with 1-based pages (offset = (page - 1) × size).	Small, mostly-static datasets. Admin tables.	Slow at large offsets (DB scans all skipped rows). Duplicates / skips on concurrent inserts.
Keyset / cursor	`?after=<last_id>&size=20` → `WHERE id > ? ORDER BY id LIMIT 20`.	Large feeds; infinite scroll; append-heavy data.	Cannot jump to page N; cursor must be derived from a unique sorted column.
Seek (composite key)	`WHERE (created_at, id) > (?, ?) ORDER BY created_at, id`.	Sorted by a non-unique column (timestamp). Stable across ties.	Index must cover all sort columns, in order.
Opaque cursor	Server returns a signed/encoded cursor; client treats it as opaque.	Public APIs — lets you change pagination internals without breaking clients.	Clients cannot jump or inspect; caching becomes harder.
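The seek and opaque-cursor rows combine naturally: the server seeks on `(created_at, id)` and hands the client an encoded cursor. The sketch below assumes a hypothetical `events` table in an in-memory SQLite database and hypothetical helper names. The cursor is only base64-encoded here, not signed, and the `WHERE` clause uses the expanded form of `(created_at, id) > (?, ?)` so it also runs on databases without row-value comparison.

```python
import base64
import json
import sqlite3


def encode_cursor(created_at, row_id):
    raw = json.dumps([created_at, row_id]).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(cursor):
    created_at, row_id = json.loads(base64.urlsafe_b64decode(cursor))
    return created_at, row_id


def fetch_page(conn, size, cursor=None):
    """Return (rows, next_cursor); next_cursor is None on the last page."""
    if cursor is None:
        sql = "SELECT id, created_at FROM events ORDER BY created_at, id LIMIT ?"
        params = (size + 1,)
    else:
        created_at, row_id = decode_cursor(cursor)
        sql = (
            "SELECT id, created_at FROM events "
            "WHERE created_at > ? OR (created_at = ? AND id > ?) "
            "ORDER BY created_at, id LIMIT ?"
        )
        params = (created_at, created_at, row_id, size + 1)
    # Fetch one extra row to learn whether another page exists.
    rows = conn.execute(sql, params).fetchall()
    has_more = len(rows) > size
    rows = rows[:size]
    next_cursor = encode_cursor(rows[-1][1], rows[-1][0]) if has_more else None
    return rows, next_cursor


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_at INTEGER NOT NULL)")
# The index covers the sort columns, in order.
conn.execute("CREATE INDEX events_created_id ON events (created_at, id)")
conn.executemany(
    "INSERT INTO events (id, created_at) VALUES (?, ?)",
    [(1, 100), (2, 100), (3, 100), (4, 200), (5, 200)],
)
```

Because ties on `created_at` are broken by `id`, a page boundary that falls in the middle of rows sharing a timestamp neither skips nor repeats any of them.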

Webhooks — delivery practices

Sign every webhook with HMAC; include a timestamp.
Subscribers must be able to verify origin and reject replays beyond a small skew window.

Deliver at-least-once; document idempotency.
Subscribers will get duplicates. They need a key to dedupe on, and you need to document the retry policy.

Retry with exponential backoff + jitter.
Failed endpoints recover; backoff avoids hammering. Jitter avoids thundering herd at retry time.

Deliver asynchronously: never block the business transaction on webhook delivery.
A slow or down subscriber should never delay the user flow. Use an outbox + worker.

Expose a dashboard: last delivery, status, payload, retry button.
Subscribers debug their side without filing a ticket. Essential for integrations at scale.

Allow subscribers to configure which events they care about.
Reduces load on both sides and avoids leaking unrelated event types.
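The signing practice at the top of this list can be sketched as follows. The message layout (`<timestamp>.<body>`), the function names, and the 300-second skew window are assumptions for illustration; the point is that the timestamp is covered by the signature and that comparison is constant-time.

```python
import hashlib
import hmac
import time


def sign_webhook(secret, body, timestamp):
    # Sign the timestamp together with the body so that neither can be
    # changed without invalidating the signature.
    message = str(timestamp).encode("ascii") + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_webhook(secret, body, timestamp, signature, now=None, max_skew_seconds=300):
    current = time.time() if now is None else now
    if abs(current - timestamp) > max_skew_seconds:
        return False  # outside the skew window: treat as a replay
    expected = sign_webhook(secret, body, timestamp)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

The subscriber must verify against the raw request bytes; re-serializing parsed JSON can change whitespace or key order and break the signature.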

Resilience knobs

Timeouts, retries, circuit breakers, bulkheads. Every synchronous RPC call in a high-traffic path needs most of these — missing any one is a well-documented outage pattern.

Knob	Purpose	Typical value	Common mistake
Timeout	Bound how long a call may take.	1–5s for synchronous RPC; 30–60s for rare heavy calls.	No timeout → one slow dep exhausts every thread/connection in the caller.
Retry	Recover from transient failure.	2–3 attempts, exponential backoff, jitter, cap on total time.	Retrying non-idempotent writes without an idempotency key — creates duplicates.
Circuit breaker	Stop calling a failing dependency; fail fast until it recovers.	Open after 50% error rate over N requests; half-open after 30s.	Too eager opening → blip becomes outage. Too slow → dep takes you down.
Bulkhead	Partition resources per downstream so one failing dep cannot exhaust shared pools.	Separate connection pool / thread pool / semaphore per dep.	One global pool → slow dep A takes down calls to healthy dep B.
Rate limit (client-side)	Do not overwhelm a dep that is recovering or has its own limits.	Token bucket at 80% of dep's declared limit.	Ignoring 429 responses; retrying immediately amplifies the overload.
Hedged request	Fire a second request after p95 latency; take whichever responds first.	Only for idempotent reads with tight latency SLO.	Doubles load on the slow path; combine with tight timeouts.
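The circuit breaker row packs three states into one line (closed, open, half-open). A minimal single-threaded sketch follows, assuming a count-based window and hypothetical class and parameter names; it is not thread-safe and lets exactly one trial call through per half-open check only because it runs on one thread.

```python
import time


class CircuitOpenError(Exception):
    pass


class CircuitBreaker:
    def __init__(self, failure_threshold=0.5, window=10, reset_after=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.window = window
        self.reset_after = reset_after
        self.clock = clock
        self.results = []      # most recent outcomes, True = success
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("failing fast")  # open: skip the dependency
            # Half-open: let one trial call through.
            try:
                result = fn()
            except Exception:
                self.opened_at = self.clock()  # trial failed: stay open
                raise
            self.opened_at = None              # trial succeeded: close again
            self.results = []
            return result

        try:
            result = fn()
        except Exception:
            self._record(False)
            raise
        self._record(True)
        return result

    def _record(self, ok):
        self.results = (self.results + [ok])[-self.window:]
        failures = self.results.count(False)
        # Only judge a full window, so a single early failure cannot trip it.
        if len(self.results) == self.window and failures / self.window >= self.failure_threshold:
            self.opened_at = self.clock()
```

Requiring a full window before opening is one guard against the "too eager" mistake in the table; the window size and threshold still need tuning per dependency.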

The retry rule

Retry is only safe on idempotent calls OR calls that carry an idempotency key. Before adding a retry, ask: if this fires twice, is the outcome the same? If not, fix that first.
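The retry rule can be made hard to forget by putting it in the function signature. The sketch below assumes hypothetical names (`call_with_retry`, `backoff_delays`), retries only on `ConnectionError` as a stand-in for "transient failure", and uses full-jitter exponential backoff. It omits the cap on total elapsed time that the table recommends.

```python
import random
import time


def backoff_delays(attempts, base=0.1, cap=5.0, rng=random.random):
    """Yield full-jitter delays: rng() scaled by min(cap, base * 2**n)."""
    for n in range(attempts):
        yield rng() * min(cap, base * (2 ** n))


def call_with_retry(fn, *, idempotent, max_attempts=3, sleep=time.sleep,
                    rng=random.random):
    # The caller must state whether the call is idempotent (or carries an
    # idempotency key). If not, a second attempt could duplicate the effect.
    if not idempotent:
        max_attempts = 1
    delays = backoff_delays(max_attempts - 1, rng=rng)
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure
            sleep(next(delays))
```

Making `idempotent` a required keyword argument forces the question from the rule above to be answered at every call site.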