Caching

Where to cache, which strategy to pick, and the standard pitfalls that show up the instant a cache goes to production.

How interviewers grade this

Everyone says "add a cache." The signal interviewers want is: which tier, which strategy, what happens on stampede, and what the invalidation story is. Lead with the access pattern, not the product name.

Cache strategies

| Strategy | Read flow | Write flow | When to use | Pitfall |
|---|---|---|---|---|
| Cache-aside (lazy) | app → cache; hit? return : read DB, write cache, return | app → write DB, then invalidate or update the cache key | Default. Simple, resilient to cache failure. Most production systems start here. | Stale reads between DB write and cache invalidation. Thundering herd on hot-key expiry. |
| Write-through | app → cache (always a hit after warm-up) | app → cache AND DB synchronously in one path | Read-heavy + no tolerance for staleness. Config, feature flags, permission lookups. | Every write pays cache latency. A cache outage becomes a write outage unless you fail open. |
| Write-back (write-behind) | app → cache | app → cache; cache flushes to DB asynchronously in batches | High write volume, tolerable durability window. Analytics counters, page-view tallies. | Cache crash loses unflushed writes. Hardest consistency story; avoid for money / identity. |
| Write-around | cache populated only by reads (as in cache-aside) | app → DB only; cache untouched | Writes that are rarely re-read soon (log ingest, audit trails). | First read after a write is always a miss; bad if a user writes and immediately rereads. |
| Refresh-ahead | cache returns the current value; pre-fetches the next value before TTL expiry | Orthogonal; pairs with any write strategy | Predictable hot keys where a cold-miss spike is unacceptable (trending feeds). | Wasted work refreshing keys nobody asks for again; needs an access-prediction signal. |
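
The default, cache-aside, is small enough to sketch in full. A minimal Python sketch (a plain dict stands in for Redis; `db_read` and `db_write` are hypothetical stand-ins for real queries):

```python
import time

cache = {}        # stands in for Redis: key -> (value, expires_at)
TTL_SECONDS = 60

def db_read(user_id):
    # hypothetical database read
    return {"id": user_id, "name": f"user-{user_id}"}

def db_write(user_id, record):
    # hypothetical database write
    pass

def get_user(user_id):
    """Cache-aside read: try the cache, fall back to the DB, populate the cache."""
    entry = cache.get(user_id)
    if entry is not None and entry[1] > time.monotonic():
        return entry[0]                       # cache hit
    value = db_read(user_id)                  # miss: read through to the DB
    cache[user_id] = (value, time.monotonic() + TTL_SECONDS)
    return value

def update_user(user_id, record):
    """Cache-aside write: write the DB first, then delete (not update) the key."""
    db_write(user_id, record)
    cache.pop(user_id, None)                  # next read repopulates from the DB
```

Note the delete in `update_user`: updating the cached value instead would open a lost-update race between two concurrent writers, which is why the invalidation rule below prefers delete-on-write.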

Where the cache lives (tiers)

Latency + capacity shape the decision. A 50 ms CDN hit beats a 1 ms Redis lookup when reaching that Redis means a 100 ms round trip to your origin.

| Tier | Latency | Capacity | Example | Note |
|---|---|---|---|---|
| Browser / client | ~0 ms | Tens of MB | HTTP `Cache-Control`, service worker, `localStorage` | Free. Biggest wins but hardest to invalidate; use short TTL + ETag. |
| CDN / edge | 10–50 ms | Tens of GB per POP | CloudFront, Fastly, Cloudflare | Static assets + public GETs. Cache key = URL + `Vary` headers. |
| Reverse proxy | 1–5 ms | GBs | Varnish, nginx `proxy_cache` | Good for server-side HTML fragments and API responses. |
| In-process | <100 µs | Hundreds of MB | Python dict + LRU, `functools.lru_cache`, Caffeine (JVM) | Fastest: no network hop, but cold per instance and inconsistent across pods. |
| Distributed cache | 0.5–2 ms | TBs | Redis, Memcached | Shared across instances. Often the canonical "the cache." |
| Materialized views | DB latency | Unbounded | Postgres matviews, ClickHouse projections | A cache inside the database; sync vs. async refresh is the key knob. |

Pitfalls

| Pitfall | Symptom | Fix |
|---|---|---|
| Cache stampede (dogpile) | Hot key expires; N concurrent requests all miss and hit the DB at once. | Single-flight (one loader at a time), probabilistic early expiration, or stale-while-revalidate. |
| Thundering herd on cold start | Cache flush or new deploy: all traffic goes to the DB, and the DB falls over. | Warm the cache before routing traffic; add jitter to TTLs so keys expire at staggered times. |
| Stale invalidation | DB updated but cache not invalidated; the wrong value is served indefinitely. | The write path must invalidate or update the cache; prefer delete over update to avoid lost-update races. |
| No negative caching | Cache stores only hits; misses re-query the DB every time. | Cache "not found" with a short TTL to absorb missing-key storms (e.g., scrapers hitting /user/xyz). |
| Unbounded keys | Cache size explodes; eviction thrashes; hit rate plummets. | Bound cardinality (no per-user query keys without a capacity plan); size limit + LRU eviction + monitoring. |
| Serialization cost | Cache hit, but deserializing the blob is slower than re-querying a small DB. | Measure. Use compact formats (msgpack, protobuf). For tiny objects an in-process cache wins. |
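
Single-flight, the first stampede fix, is worth being able to write on a whiteboard. A minimal thread-based sketch (a real system would use your cache client's lock or a library; names here are illustrative):

```python
import threading

class SingleFlight:
    """Collapse concurrent loads of the same key into one loader call."""

    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}  # key -> (done_event, result_holder)

    def do(self, key, loader):
        with self._lock:
            entry = self._inflight.get(key)
            if entry is None:
                # first caller becomes the leader and runs the loader
                event, holder = threading.Event(), {}
                self._inflight[key] = (event, holder)
                leader = True
            else:
                # everyone else waits for the leader's result
                event, holder = entry
                leader = False
        if leader:
            try:
                holder["value"] = loader()
            finally:
                with self._lock:
                    del self._inflight[key]
                event.set()
            return holder["value"]
        event.wait()
        return holder["value"]  # raises KeyError if the leader's loader failed
```

With this in front of the DB, a hot-key expiry costs one DB query instead of N; followers that arrive after the leader finishes simply start a new flight, so results are never served indefinitely stale.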

The invalidation rule

There are only two hard things in computer science: cache invalidation, naming things, and off-by-one errors. For invalidation specifically: prefer delete-on-write over update-on-write; prefer a short TTL over an eternal cache; and never trust the write path to invalidate if it can crash mid-flight without compensating logic.
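
The short-TTL safety net pairs naturally with the jitter fix from the pitfalls table. A sketch of a cache-write helper that enforces both (the dict-based cache and parameter values are illustrative):

```python
import random
import time

def set_with_ttl(cache: dict, key, value, base_ttl: float = 300, jitter: float = 0.2):
    """Cap every entry with a TTL so that even if delete-on-write is missed
    (e.g., the write path crashes mid-flight), staleness is bounded by base_ttl.
    Jitter of +/-20% staggers expiries so correlated keys don't all miss at once."""
    ttl = base_ttl * (1 + random.uniform(-jitter, jitter))
    cache[key] = (value, time.monotonic() + ttl)
```

The point is that TTL is the backstop, not the primary mechanism: delete-on-write keeps the cache fresh in the common case, and the bounded TTL limits the blast radius of the failure cases.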