System Design Lite

The 30-minute whiteboard scenarios that appear in nearly every senior SWE loop. Each one names the invariants, not a vendor product.

How to use this page

Cover the Approach section and see if you can derive it from the Problem statement and the Clarifying Questions. The Follow-ups are what interviewers probe once you have an initial answer — be ready for all of them before you consider a scenario "done".

LRU cache

Time O(1) get / put  ·  Space O(capacity)

Problem

Cache with a fixed capacity that evicts the least-recently-used entry on insert when full. `get` and `put` must be O(1).

Clarify first

  • Is capacity fixed at construction or dynamic?
  • Is access (get) thread-safe? If yes, coarse lock or lock-free?
  • Are we optimizing for hit latency (in-memory) or hit rate (tiered cache)?

Approach

Hash map (key → node) gives O(1) lookup. Doubly-linked list keeps entries in recency order — move-to-head on access, drop-from-tail on eviction. The two structures share the same node objects so both pointers move in O(1).

Reference implementation

class Node:
    __slots__ = ('key', 'val', 'prev', 'next')

    def __init__(self, k, v):
        self.key, self.val, self.prev, self.next = k, v, None, None


class LRU:
    def __init__(self, cap: int):
        self.cap = cap
        self.map: dict[int, Node] = {}
        # Sentinel head/tail — removes null-checks
        self.head, self.tail = Node(0, 0), Node(0, 0)
        self.head.next, self.tail.prev = self.tail, self.head

    def _remove(self, n: Node) -> None:
        n.prev.next, n.next.prev = n.next, n.prev

    def _add_to_front(self, n: Node) -> None:
        n.prev, n.next = self.head, self.head.next
        self.head.next.prev = n
        self.head.next = n

    def get(self, key: int) -> int:
        n = self.map.get(key)
        if n is None:
            return -1
        self._remove(n); self._add_to_front(n)
        return n.val

    def put(self, key: int, val: int) -> None:
        if key in self.map:
            self._remove(self.map[key])
        n = Node(key, val)
        self.map[key] = n
        self._add_to_front(n)
        if len(self.map) > self.cap:
            lru = self.tail.prev
            self._remove(lru)
            del self.map[lru.key]

Follow-up questions

  • Make it thread-safe. What concurrency primitive is cheapest — a single lock, striped locks, or lock-free?
  • Implement LFU instead. Why is it harder to get O(1)?
  • How would you extend this to a distributed cache?

LFU cache (vs LRU)

Time O(1) get / put  ·  Space O(capacity)

Problem

Evict the least-frequently-used entry. On tie, break by recency.

Clarify first

  • Is this a pure interview question or are we benchmarking? LFU hit-rate is often worse than LRU on realistic workloads
  • How often do items age out? A pure frequency count never decays, so old popular items can stick forever

Approach

Hash map key → node, plus a hash map from frequency → doubly-linked list of nodes at that frequency. Track the current minimum frequency; on access, bump the node to the next frequency list. Eviction pops from the head of the min-frequency list.
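The bucket structure above can be sketched in Python. Names like `buckets` and `min_freq`, and the use of `OrderedDict` to break frequency ties by recency, are choices of this sketch, not a canonical API:

```python
from collections import defaultdict, OrderedDict

class LFU:
    """O(1) LFU sketch: freq -> ordered set of keys at that frequency."""
    def __init__(self, cap: int):
        self.cap = cap
        self.vals = {}                            # key -> value
        self.freq = {}                            # key -> use count
        self.buckets = defaultdict(OrderedDict)   # count -> keys, oldest first
        self.min_freq = 0

    def _touch(self, key):
        # Move key from its current frequency bucket to the next one up.
        f = self.freq[key]
        del self.buckets[f][key]
        if not self.buckets[f]:
            del self.buckets[f]
            if self.min_freq == f:
                self.min_freq = f + 1
        self.freq[key] = f + 1
        self.buckets[f + 1][key] = None

    def get(self, key):
        if key not in self.vals:
            return -1
        self._touch(key)
        return self.vals[key]

    def put(self, key, val):
        if self.cap <= 0:
            return
        if key in self.vals:
            self.vals[key] = val
            self._touch(key)
            return
        if len(self.vals) >= self.cap:
            # Evict the oldest key in the minimum-frequency bucket (recency tie-break).
            evict, _ = self.buckets[self.min_freq].popitem(last=False)
            if not self.buckets[self.min_freq]:
                del self.buckets[self.min_freq]
            del self.vals[evict], self.freq[evict]
        self.vals[key] = val
        self.freq[key] = 1
        self.buckets[1][key] = None
        self.min_freq = 1          # a fresh key always starts at frequency 1
```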

Follow-up questions

  • What is "frequency decay" and why do production LFU variants (TinyLFU, W-TinyLFU) use it?
  • Why do many systems use LRU even when LFU hit rates are technically higher?

Rate limiter

Time O(1) per request  ·  Space O(keys × window_granularity)

Problem

Reject requests that exceed a quota (e.g., 100 req / minute / user). Should work across many server instances.

Clarify first

  • Per-user, per-IP, per-API-key, or per-tenant?
  • Soft (429 + retry-after) or hard (drop the TCP connection)?
  • Is approximate OK, or do we need exact? Distributed exact-counting is expensive

Approach

Pick an algorithm (see table below). Store counters in a shared fast store (Redis with atomic INCR + EXPIRE is the canonical implementation). For distributed accuracy, run a token bucket as an atomic Lua script; when per-key memory matters most, use a sliding-window counter.

Rate limiter algorithms

Algorithm · Mechanism · Behavior

  • Fixed window · counter per window (e.g., per minute) · burst at the window boundary can reach 2× the intended limit
  • Sliding log · one timestamp per request in a sorted set · exact, but memory scales with request volume — expensive at scale
  • Sliding window counter · two adjacent fixed-window counts, weighted by time-in-current-window · good accuracy with O(1) memory per key — the pragmatic default
  • Token bucket · bucket refills at a fixed rate; each request consumes a token · allows short bursts up to bucket size — user-friendly for interactive APIs
  • Leaky bucket · requests queue and drain at a fixed rate · smooths output but adds latency — fits traffic shaping more than rate limiting

Rule of thumb: Start with sliding-window counter for general API rate limiting. Use token bucket when short bursts are a feature. Avoid fixed window for anything public-facing.

Follow-up questions

  • What happens when Redis is unreachable? Fail-open or fail-closed?
  • How do you handle clock skew across instances?
  • What if one user generates 99% of traffic? (Hot-key problem)

Bloom filter

Time O(k) per op  ·  Space O(m) bits — typically ~9.6 bits per element for 1% FPR

Problem

Probabilistic set membership — "is X probably in the set?" — with O(1) queries and roughly 10 bits per element at a 1% false-positive rate. Never false-negatives; may false-positive at a configurable rate.

Clarify first

  • What false-positive rate is acceptable? 1%? 0.01%?
  • Do we need to remove elements? Standard Bloom filters cannot — you need a Counting Bloom Filter

Approach

Bit array of size m. On insert, hash the element with k independent hash functions and set those k bits. On query, if any of the k bits is 0 → definitely not in set; if all are 1 → probably in set. For n elements and target false-positive rate p, size m = -n·ln(p) / (ln 2)² and use k = (m/n)·ln 2 hash functions.
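A self-contained sketch sized from n and p. Double hashing over SHA-256 to derive the k indices is a common trick to simulate k independent hash functions, not a requirement of the structure:

```python
import math
import hashlib

class BloomFilter:
    def __init__(self, n: int, p: float):
        # m = -n ln(p) / (ln 2)^2 bits; k = (m/n) ln 2 hash functions
        self.m = max(1, int(-n * math.log(p) / (math.log(2) ** 2)))
        self.k = max(1, round((self.m / n) * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: k indices from two 64-bit halves of one SHA-256 digest.
        h = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(h[:8], "big")
        h2 = int.from_bytes(h[8:16], "big")
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # Any unset bit proves absence; all set bits only suggest presence.
        return all(self.bits[pos // 8] >> (pos % 8) & 1 for pos in self._positions(item))
```

For n = 1000 and p = 0.01 this sizes the filter at 9585 bits with 7 hash functions, matching the ~9.6 bits/element rule of thumb.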

Follow-up questions

  • Where is this used in real systems?
  • Counting Bloom Filter vs Cuckoo Filter — when does each win?
  • Why do LSM-tree databases use Bloom filters per SSTable?

Consistent hashing

Time O(log N) lookup with a sorted ring + binary search  ·  Space O(N × virtual_count)

Problem

Distribute keys across N nodes. When a node is added or removed, move as few keys as possible — ideally ~1/N of them.

Clarify first

  • How many nodes and how many keys? (Affects load balance)
  • Is node membership static or changing frequently?
  • Do we need replication? Usually yes — store at the next K nodes on the ring

Approach

Hash each node onto a ring (0 to 2^32 - 1). Hash each key onto the same ring; the key belongs to the first node clockwise. Adding / removing a node only moves the keys in that node's ring segment. Use virtual nodes (many ring positions per real node, e.g., 150×) to smooth out load imbalance.
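A compact sketch of the ring with virtual nodes; `vnodes=150` and MD5 truncated to 32 bits are arbitrary choices for this sketch, matching the 0..2³²-1 ring above:

```python
import bisect
import hashlib

class HashRing:
    def __init__(self, nodes, vnodes: int = 150):
        self.vnodes = vnodes
        self.ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key: str) -> int:
        # First 4 bytes of MD5 -> a position in 0..2**32 - 1.
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:4], "big")

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            bisect.insort(self.ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self.ring = [(pos, n) for pos, n in self.ring if n != node]

    def lookup(self, key: str) -> str:
        pos = self._hash(key)
        i = bisect.bisect(self.ring, (pos, ""))   # first vnode clockwise
        return self.ring[i % len(self.ring)][1]   # wrap past the end of the ring
```

The test to run in your head: removing a node must only remap the keys that node owned, and nothing else.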

Follow-up questions

  • What if the hash ring has one node with 10× the keys? (Virtual nodes fix this)
  • Why does mod-N hashing (`hash(key) % N`) fall apart on rescale?
  • Rendezvous hashing as an alternative — when does it win?

Distributed ID generation

Time O(1) per ID  ·  Space O(1) state per worker

Problem

Generate unique IDs across many machines without coordination. Which ID shape fits the use case?

Clarify first

  • Must IDs be sortable by creation time?
  • Is 128-bit OK, or do we need 64-bit for DB index efficiency?
  • Must we avoid leaking cardinality (guessable IDs are an anti-pattern for public resources)?

Approach

Choose by shape, not by product name:

  • **Random 128-bit (UUID v4)** — trivial to generate, no coordination, not sortable. Bad for B-tree primary keys (random inserts shred the index).
  • **Time-ordered 128-bit (UUID v7 / ULID)** — first 48 bits are a millisecond timestamp, rest is random. Sortable, index-friendly, no coordination.
  • **Time-ordered 64-bit with machine id (Snowflake-shape)** — timestamp + datacenter/worker id + sequence. Requires a unique worker-id assignment process. Fits in a bigint PK.
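A sketch of the Snowflake shape with a 41-bit millisecond timestamp, 10-bit worker id, and 12-bit sequence. The custom epoch and the naive clock-backwards handling are placeholders for this sketch, not production choices:

```python
import time
import threading

class SnowflakeGen:
    EPOCH = 1_600_000_000_000  # custom epoch in ms; arbitrary for this sketch

    def __init__(self, worker_id: int):
        assert 0 <= worker_id < 1024          # must fit in 10 bits
        self.worker_id = worker_id
        self.last_ms = -1
        self.seq = 0
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000)
            if now < self.last_ms:
                # Real systems wait out the skew or raise an alert instead.
                raise RuntimeError("clock moved backwards")
            if now == self.last_ms:
                self.seq = (self.seq + 1) & 0xFFF
                if self.seq == 0:             # 4096 ids this ms: spin to the next ms
                    while now <= self.last_ms:
                        now = int(time.time() * 1000)
            else:
                self.seq = 0
            self.last_ms = now
            # 41-bit timestamp | 10-bit worker | 12-bit sequence
            return ((now - self.EPOCH) << 22) | (self.worker_id << 12) | self.seq
```

Worker-id assignment (the hard part in practice) is taken as given here; see the follow-ups.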

Follow-up questions

  • How do you assign worker IDs at scale without a coordinator?
  • What happens to Snowflake-shape IDs when the clock goes backwards (NTP correction, leap smear)?
  • Why would you NOT use a random-UUID as the primary key of a large InnoDB table?

Messaging primitives — pick the right shape

Time Producer O(1) + broker round-trip  ·  Space Retention-dependent

Problem

An async workflow needs to hand off work between services. Which messaging shape fits?

Clarify first

  • Exactly once? At least once? At most once? (Exactly-once is almost always a lie — aim for at-least-once + idempotent consumers)
  • Do consumers need to replay old messages?
  • One consumer per message, or fan-out?

Approach

Three shapes, anchored on semantics (products in parens are examples, not prescriptions):

  • **Append-only log** — ordered, replayable, consumer-tracks-offset. Multiple independent consumers read at their own pace. Good for event sourcing, analytics fan-out (Kafka, Kinesis, Pulsar).
  • **Work queue** — each message goes to one consumer; ack / nack on completion; visibility timeout protects against dead workers. Good for tasks (RabbitMQ work queues, SQS, Beanstalkd).
  • **Pub/sub broadcast** — every subscriber gets every message; no durable offset. Good for live notifications (Redis Pub/Sub, SNS).

Do not assume the named product behaves exactly like the shape — features drift. Read the docs of whichever one you actually pick.
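The "at-least-once plus idempotent consumers" answer from the clarifying questions can be made concrete in a few lines. The in-memory `seen` set stands in for a durable dedup store; in production it would be written in the same transaction as the side effect:

```python
class IdempotentConsumer:
    """Skip messages whose id was already processed, making redelivery harmless."""
    def __init__(self, handler):
        self.handler = handler
        self.seen = set()

    def consume(self, msg_id: str, payload) -> bool:
        if msg_id in self.seen:
            return False            # duplicate redelivery: side effect already done
        self.handler(payload)
        self.seen.add(msg_id)       # record only after the effect succeeds
        return True
```

The broker may now deliver a message twice (crash between handler and ack) without the effect happening twice.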

Follow-up questions

  • How do you achieve idempotent consumers on top of at-least-once delivery?
  • Why is "exactly-once" usually misleading marketing?
  • When do you need a dead-letter queue, and how do you configure it?

Cache strategies

Time O(1) cached read; O(db) on miss  ·  Space Bounded by cache capacity

Problem

A hot read path is hitting the database too hard. Where and how do you cache?

Clarify first

  • Is the underlying data changing, and how stale can the cache be?
  • Is write volume high? (Write-through has a write tax; cache-aside does not)
  • Can two consumers read the same key concurrently? (Cache stampede)

Approach

  • **Cache-aside** (most common) — app checks cache → miss → reads DB → writes cache. Simple, but subject to stampede and needs TTLs.
  • **Write-through** — writes go to cache AND DB atomically; reads always hit cache. No staleness, but the write path pays the cache cost.
  • **Write-back** — writes go to cache; the DB is updated asynchronously. Fastest writes; worst durability.
  • **Write-around** — writes skip the cache and go straight to the DB; only reads populate the cache. Good when writes are seldom re-read soon after.
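A single-process sketch of cache-aside with per-key single-flight, so a miss storm on one key issues only one backing read. `load` stands in for the database, and the lock table is never pruned, which a real implementation would fix:

```python
import threading

class CacheAside:
    def __init__(self, load):
        self.load = load            # backing read, e.g. a DB query
        self.cache = {}
        self.locks = {}             # key -> per-key single-flight lock
        self.guard = threading.Lock()

    def get(self, key):
        val = self.cache.get(key)
        if val is not None:
            return val              # fast path: cache hit, no locking
        with self.guard:
            lock = self.locks.setdefault(key, threading.Lock())
        with lock:                  # only one caller loads a given key
            val = self.cache.get(key)   # re-check: a racer may have filled it
            if val is None:
                val = self.load(key)
                self.cache[key] = val
        return val
```

The double-check inside the lock is what turns "N concurrent misses" into one DB read plus N-1 cache hits.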

Follow-up questions

  • What is a cache stampede and how do single-flight / probabilistic early expiration fix it?
  • Thundering herd after a cache flush — how do you warm it safely?
  • Why can TTL-based eviction cause cache-miss spikes? How do you smooth them?

Sharding strategies

Time O(1) with known shard key  ·  Space Proportional to shard

Problem

A single database cannot hold the data or handle the load. How do you split it across many?

Clarify first

  • What is the query pattern — single-entity lookups, range scans, or analytics?
  • What is the natural partition key (user_id, tenant_id, region)?
  • How often do you need cross-shard queries? (Expensive — design to avoid)

Approach

  • **Range sharding** — partition by key range (ids 0–1M on shard A, 1M–2M on B). Good for range scans; creates hotspots if ranges are unbalanced or monotonic (e.g., timestamp → last shard gets all writes).
  • **Hash sharding** — partition by hash(key) mod N. Balances load evenly; destroys range-scan locality. Rescaling is painful unless you use consistent hashing.
  • **Directory sharding** — a lookup service maps key → shard. Maximum flexibility; the directory becomes a bottleneck and must itself be scaled.
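The first two routing strategies fit in a few lines; the boundary values and the MD5 choice here are illustrative only:

```python
import bisect
import hashlib

def hash_shard(key: str, n_shards: int) -> int:
    """Hash sharding: even spread, no range locality."""
    h = int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")
    return h % n_shards

def range_shard(key: int, boundaries: list[int]) -> int:
    """Range sharding: boundaries are upper bounds, e.g. [1_000_000, 2_000_000]
    routes ids below 1M to shard 0, below 2M to shard 1, the rest to shard 2."""
    return bisect.bisect(boundaries, key)
```

Note what `hash_shard` loses: a scan over ids 100..200 now touches every shard, which is exactly the range-scan-locality trade-off named above.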

Follow-up questions

  • How do you reshard without downtime?
  • What is a "hot shard" and what do you do when one user generates 20% of traffic?
  • Cross-shard JOIN: when to denormalize vs fan-out query vs move to a warehouse?

Leader election

Time O(n) messages per election (candidate contacts all peers); O(n) per commit for majority replication  ·  Space O(log length) persisted per node

Problem

A set of processes must elect exactly one leader. The leader coordinates; followers replicate. What goes wrong, and how do you make it safe?

Clarify first

  • Is strong consistency required, or is eventual convergence OK?
  • What happens if the network splits? (Split-brain is the classic failure mode)
  • How quickly must a new leader take over after a failure?

Approach

**Use a proven consensus protocol** (Raft, Paxos, Zab) or a coordinator (etcd, ZooKeeper, Consul). Do not hand-roll. Key guarantee: at most one leader per term, and any decision a leader makes is replicated to a majority quorum before commit. For a quick interview answer, describe Raft: (1) each process has a term counter; (2) followers promote to candidate on timeout and request votes; (3) a candidate with a majority becomes leader; (4) the leader heartbeats; (5) partitioned leaders step down when they cannot reach a quorum. The "bully algorithm" is the textbook simple version but does not handle partitions safely.
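The quorum arithmetic behind the majority guarantee is worth having cold; these two helpers show why a 4-node cluster tolerates no more failures than a 3-node one:

```python
def quorum(n: int) -> int:
    """Majority quorum size for an n-node cluster."""
    return n // 2 + 1

def tolerated_failures(n: int) -> int:
    """Nodes that can fail while a majority is still reachable."""
    return n - quorum(n)   # equals (n - 1) // 2
```

Because two quorums of size n//2 + 1 must overlap in at least one node, two leaders can never both reach a majority in the same term, which is what rules out split-brain.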

Follow-up questions

  • Why does split-brain lead to two writers? How does a quorum prevent it?
  • Lease-based leader election — what does the lease protect against?
  • Why does "odd number of nodes" matter for quorum (3, 5, 7)?