Caching

Where to cache, which strategy to pick, and the standard pitfalls that show up the instant a cache goes to production.

How interviewers grade this

Everyone says "add a cache." The signal interviewers want is: which tier, which strategy, what happens on stampede, and what the invalidation story is. Lead with the access pattern, not the product name.

Cache strategies

| Strategy | Read flow | Write flow | When to use | Pitfall |
|---|---|---|---|---|
| Cache-aside (lazy) | app → cache; hit? return : read DB, write cache, return | app → write DB, then invalidate or update the cache key | Default. Simple, resilient to cache failure. Most production systems start here. | Stale reads between DB write and cache invalidation. Thundering herd on hot-key expiry. |
| Write-through | app → cache (always a hit after warm-up) | app → cache AND DB synchronously in one path | Read-heavy + no tolerance for staleness. Config, feature flags, permission lookups. | Every write pays cache latency. A cache outage becomes a write outage unless you fail open. |
| Write-back (write-behind) | app → cache | app → cache; cache flushes to DB asynchronously in batches | High write volume, tolerable durability window. Analytics counters, page-view tallies. | Cache crash loses unflushed writes. Hardest consistency story; avoid for money / identity. |
| Write-around | cache populated only by reads (as in cache-aside) | app → DB only; cache untouched | Writes that are rarely re-read soon (log ingest, audit trails). | First read after a write is always a miss; bad if a user writes and immediately rereads. |
| Refresh-ahead | cache returns the current value; pre-fetches the next value before TTL expiry | Orthogonal; pairs with any write strategy | Predictable hot keys where a cold-miss spike is unacceptable (trending feeds). | Wasted work refreshing keys nobody asks for again; needs an access-prediction signal. |
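
The default, cache-aside, is small enough to sketch in full. A minimal Python sketch (a plain dict stands in for Redis; `db_read` and `db_write` are hypothetical stand-ins for real queries):

```python
import time

cache = {}        # stands in for Redis: key -> (value, expires_at)
TTL_SECONDS = 60

def db_read(user_id):
    # hypothetical database read
    return {"id": user_id, "name": f"user-{user_id}"}

def db_write(user_id, record):
    # hypothetical database write
    pass

def get_user(user_id):
    """Cache-aside read: try the cache, fall back to the DB, populate the cache."""
    entry = cache.get(user_id)
    if entry is not None and entry[1] > time.monotonic():
        return entry[0]                       # cache hit
    value = db_read(user_id)                  # miss: read through to the DB
    cache[user_id] = (value, time.monotonic() + TTL_SECONDS)
    return value

def update_user(user_id, record):
    """Cache-aside write: write the DB first, then delete (not update) the key."""
    db_write(user_id, record)
    cache.pop(user_id, None)                  # next read repopulates from the DB
```

Note the delete in `update_user`: updating the cached value instead would open a lost-update race between two concurrent writers, which is why the invalidation rule below prefers delete-on-write.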

Where the cache lives (tiers)

Latency + capacity shape the decision. A 50 ms CDN hit beats a 1 ms Redis lookup when reaching that Redis means a 100 ms round trip to your origin.

| Tier | Latency | Capacity | Example | Note |
|---|---|---|---|---|
| Browser / client | ~0 ms | Tens of MB | HTTP `Cache-Control`, service worker, `localStorage` | Free. Biggest wins but hardest to invalidate; use short TTL + ETag. |
| CDN / edge | 10–50 ms | Tens of GB per POP | CloudFront, Fastly, Cloudflare | Static assets + public GETs. Cache key = URL + `Vary` headers. |
| Reverse proxy | 1–5 ms | GBs | Varnish, nginx `proxy_cache` | Good for server-side HTML fragments and API responses. |
| In-process | <100 µs | Hundreds of MB | Python dict + LRU, `functools.lru_cache`, Caffeine (JVM) | Fastest: no network hop, but cold per instance and inconsistent across pods. |
| Distributed cache | 0.5–2 ms | TBs | Redis, Memcached | Shared across instances. Often the canonical "the cache." |
| Materialized views | DB latency | Unbounded | Postgres matviews, ClickHouse projections | A cache inside the database; sync vs. async refresh is the key knob. |

Pitfalls

| Pitfall | Symptom | Fix |
|---|---|---|
| Cache stampede (dogpile) | Hot key expires; N concurrent requests all miss and hit the DB at once. | Single-flight (one loader at a time), probabilistic early expiration, or stale-while-revalidate. |
| Thundering herd on cold start | Cache flush or new deploy: all traffic goes to the DB, and the DB falls over. | Warm the cache before routing traffic; add jitter to TTLs so keys expire at staggered times. |
| Stale invalidation | DB updated but cache not invalidated; the wrong value is served indefinitely. | The write path must invalidate or update the cache; prefer delete over update to avoid lost-update races. |
| No negative caching | Cache stores only hits; misses re-query the DB every time. | Cache "not found" with a short TTL to absorb missing-key storms (e.g., scrapers hitting /user/xyz). |
| Unbounded keys | Cache size explodes; eviction thrashes; hit rate plummets. | Bound cardinality (no per-user query keys without a capacity plan); size limit + LRU eviction + monitoring. |
| Serialization cost | Cache hit, but deserializing the blob is slower than re-querying a small DB. | Measure. Use compact formats (msgpack, protobuf). For tiny objects an in-process cache wins. |
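
Single-flight, the first stampede fix, is worth being able to write on a whiteboard. A minimal thread-based sketch (a real system would use your cache client's lock or a library; names here are illustrative):

```python
import threading

class SingleFlight:
    """Collapse concurrent loads of the same key into one loader call."""

    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}  # key -> (done_event, result_holder)

    def do(self, key, loader):
        with self._lock:
            entry = self._inflight.get(key)
            if entry is None:
                # first caller becomes the leader and runs the loader
                event, holder = threading.Event(), {}
                self._inflight[key] = (event, holder)
                leader = True
            else:
                # everyone else waits for the leader's result
                event, holder = entry
                leader = False
        if leader:
            try:
                holder["value"] = loader()
            finally:
                with self._lock:
                    del self._inflight[key]
                event.set()
            return holder["value"]
        event.wait()
        return holder["value"]  # raises KeyError if the leader's loader failed
```

With this in front of the DB, a hot-key expiry costs one DB query instead of N; followers that arrive after the leader finishes simply start a new flight, so results are never served indefinitely stale.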

The invalidation rule

There are only two hard things in computer science: cache invalidation, naming things, and off-by-one errors. For invalidation specifically: prefer delete-on-write over update-on-write; prefer a short TTL over an eternal cache; and never trust the write path to invalidate if it can crash mid-flight without compensating logic.
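
The short-TTL safety net pairs naturally with the jitter fix from the pitfalls table. A sketch of a cache-write helper that enforces both (the dict-based cache and parameter values are illustrative):

```python
import random
import time

def set_with_ttl(cache: dict, key, value, base_ttl: float = 300, jitter: float = 0.2):
    """Cap every entry with a TTL so that even if delete-on-write is missed
    (e.g., the write path crashes mid-flight), staleness is bounded by base_ttl.
    Jitter of +/-20% staggers expiries so correlated keys don't all miss at once."""
    ttl = base_ttl * (1 + random.uniform(-jitter, jitter))
    cache[key] = (value, time.monotonic() + ttl)
```

The point is that TTL is the backstop, not the primary mechanism: delete-on-write keeps the cache fresh in the common case, and the bounded TTL limits the blast radius of the failure cases.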