# Caching
Where to cache, which strategy to pick, and the standard pitfalls that show up the instant a cache goes to production.
## How interviewers grade this

Everyone says "add a cache." The signal interviewers want is: which tier, which strategy, what happens on stampede, and what the invalidation story is. Lead with the access pattern, not the product name.
## Cache strategies
| Strategy | Read flow | Write flow | When to use | Pitfall |
|---|---|---|---|---|
| Cache-aside (lazy) | app → cache hit? return : read DB, write cache, return | app → write DB, invalidate or update cache key | Default. Simple, resilient to cache failure. Most production systems start here. | Stale reads between DB write and cache invalidation. Thundering herd on hot-key expiry. |
| Write-through | app → cache (always hit after warm-up) | app → cache AND DB synchronously in one path | Read-heavy + no tolerance for staleness. Config, feature flags, permission lookups. | Every write pays cache latency. Cache outage becomes a write outage unless you fail open. |
| Write-back (write-behind) | app → cache | app → cache; cache flushes to DB asynchronously in batches | High write volume, tolerable durability window. Analytics counters, page-view tallies. | Cache crash loses unflushed writes. Hardest consistency story; avoid for money / identity. |
| Write-around | Cache populated only by reads (as in cache-aside) | app → DB only; cache untouched | Writes that are rarely re-read soon (log ingest, audit trails). | First read after write is always a miss — bad if user just wrote and immediately rereads. |
| Refresh-ahead | cache returns current value; pre-fetches next value before TTL expires | Orthogonal — pairs with any write strategy | Predictable hot keys where a cold-miss spike is unacceptable (trending feeds). | Wasted work refreshing keys nobody asks for again; needs access-prediction signal. |
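The cache-aside flow from the first row can be sketched in a few lines — plain dicts stand in for Redis and the database here, and the names (`get_user`, `update_user`, `USERS`) are illustrative, not from any real API:

```python
cache = {}                    # stands in for Redis / Memcached
USERS = {1: {"name": "Ada"}}  # stands in for a database table

def get_user(user_id):
    """Read path: check cache first, fall back to the DB, then populate."""
    if user_id in cache:
        return cache[user_id]        # cache hit
    row = USERS.get(user_id)         # cache miss -> read DB
    if row is not None:
        cache[user_id] = row         # populate for the next reader
    return row

def update_user(user_id, row):
    """Write path: write the DB first, then delete (not update) the cache key."""
    USERS[user_id] = row
    cache.pop(user_id, None)         # delete-on-write avoids lost-update races
```

Note the stale-read window the table warns about: between `USERS[user_id] = row` and `cache.pop`, a concurrent reader can still see the old cached value.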
## Where the cache lives (tiers)

Latency + capacity shape the decision. A 30 ms CDN hit beats a 1 ms Redis hit when the Redis call sits behind a 40 ms egress.
| Tier | Latency | Capacity | Example | Note |
|---|---|---|---|---|
| Browser / client | 0 ms | Tens of MB | HTTP Cache-Control, service worker, localStorage | Free. Biggest wins but hardest to invalidate — use short TTL + ETag. |
| CDN / edge | 10–50 ms | Tens of GB per POP | CloudFront, Fastly, Cloudflare | Static assets + public GETs. Cache key = URL + Vary headers. |
| Reverse proxy | 1–5 ms | GBs | Varnish, nginx proxy_cache | Good for server-side HTML fragments and API responses. |
| In-process | <100 µs | Hundreds of MB | Python dict + LRU, `functools.lru_cache`, Caffeine (JVM) | Fastest. No network hop, but cold per-instance and inconsistent across pods. |
| Distributed cache | 0.5–2 ms | TBs | Redis, Memcached | Shared across instances. Often the canonical "the cache." |
| Materialized views | DB latency | Unbounded | Postgres matviews, ClickHouse projections | Cache inside the database; refresh sync vs async is the key knob. |
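The in-process tier can be as small as the `functools.lru_cache` decorator the table mentions — bounded by `maxsize`, scoped to one process, no network hop. A minimal sketch (the lookup function and its return values are made up for illustration):

```python
from functools import lru_cache

CALLS = {"db": 0}  # counts how often we fall through to the "database"

@lru_cache(maxsize=1024)      # bound key cardinality so eviction stays predictable
def country_for_ip(ip):
    """Pretend geo lookup; the decorator memoizes results per process."""
    CALLS["db"] += 1          # simulated expensive backend call
    return "SE" if ip.startswith("192.") else "US"

country_for_ip("192.0.2.1")   # miss: hits the backend
country_for_ip("192.0.2.1")   # hit: served from the in-process cache
```

The per-instance coldness the table flags is visible here: every pod builds its own copy of this cache from scratch, and there is no way to invalidate it from outside the process short of a restart.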
## Pitfalls
| Pitfall | Symptom | Fix |
|---|---|---|
| Cache stampede (dogpile) | Hot key expires; N concurrent requests all miss and hit the DB at once. | Single-flight (one loader at a time), probabilistic early expiration, or stale-while-revalidate. |
| Thundering herd on cold start | Cache flush or new deploy → all traffic goes to DB → DB falls over. | Warm cache before routing traffic; add jitter to TTLs so keys expire at staggered times. |
| Stale invalidation | DB updated but cache not invalidated → wrong value served indefinitely. | Write path must invalidate or update cache; prefer delete over update to avoid lost-update races. |
| Missing negative caching | Cache only stores hits; misses re-query the DB every time. | Cache "not found" with a short TTL to absorb missing-key storms (e.g., scrapers hitting /user/xyz). |
| Unbounded keys | Cache size explodes; eviction thrashes; hit rate plummets. | Bound cardinality (no per-user query keys without capacity plan); size + LRU eviction + monitoring. |
| Serialization cost | Cache hit but deserializing the blob is slower than re-querying a small DB. | Measure. Use compact formats (msgpack, protobuf). For tiny objects an in-process cache wins. |
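The single-flight fix for stampedes can be sketched with a per-key `threading.Event`: the first misser becomes the loader, everyone else waits and reuses its result. This is a toy illustration of the idea, not a production library (it keeps results forever and doesn't handle a loader that raises):

```python
import threading

class SingleFlight:
    """At most one loader runs per key; concurrent missers wait for its result."""

    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}   # key -> Event set when the load finishes
        self._results = {}

    def get(self, key, loader):
        with self._lock:
            ev = self._inflight.get(key)
            leader = ev is None
            if leader:                       # first misser: become the loader
                ev = threading.Event()
                self._inflight[key] = ev
        if leader:
            try:
                self._results[key] = loader()  # single DB hit for the whole herd
            finally:
                ev.set()                       # wake the waiters
                with self._lock:
                    self._inflight.pop(key, None)
        else:
            ev.wait()                          # piggyback on the leader's load
        return self._results[key]
```

With N concurrent misses on a hot key, the database sees one query instead of N — which is the whole point of the dogpile fix.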
## The invalidation rule

There are only two hard things in computer science: cache invalidation, naming things, and off-by-one errors. For invalidation specifically: prefer delete-on-write over update-on-write; prefer a short TTL over an eternal cache; and never trust a write path to invalidate if it can crash mid-flight without compensating logic.
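Two of those rules combine naturally: delete-on-write handles the common case, and a jittered TTL is the backstop that self-heals any key a crashed write path failed to invalidate. A minimal sketch (TTL value and function names are illustrative):

```python
import random
import time

cache = {}   # key -> (value, expires_at)
TTL = 300    # base TTL in seconds; pick per workload

def cache_set(key, value):
    # Jitter the TTL so a set of hot keys doesn't expire in the same instant
    ttl = TTL * random.uniform(0.9, 1.1)
    cache[key] = (value, time.time() + ttl)

def cache_get(key):
    entry = cache.get(key)
    if entry is None:
        return None
    value, expires_at = entry
    if time.time() >= expires_at:   # TTL backstop: stale entries self-heal
        del cache[key]
        return None
    return value

def write(db, key, value):
    db[key] = value                 # 1. write the database
    cache.pop(key, None)            # 2. delete (not update) the cache key
```

If `write` crashes between steps 1 and 2, the stale cached value survives only until its jittered TTL runs out — which is exactly why eternal cache entries are the dangerous default.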