The test pyramid, the five test doubles everyone conflates, pytest idioms to know by heart, and the real causes of flakes.
What interviewers grade
| Level | Scope | Speed | Fragility | Role |
|---|---|---|---|---|
| Unit | One function / class; no I/O, no external services. | Milliseconds; run thousands per second. | Low — fail only when the code truly broke. | 80% of tests. Cover logic branches, edge cases, boundaries. |
| Integration | Multiple units together with real collaborators (DB, cache, queue). | Tens to hundreds of ms each. | Medium — schema drift, fixture data, timing. | 15%. Cover wiring: migration ran, ORM query works, txn boundaries correct. |
| End-to-end | Full system through the public entrypoint (HTTP, CLI). | Seconds to tens of seconds. | High — any cross-cutting change breaks them. | 5%. One happy path per critical user flow; more is a tax on velocity. |
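A minimal sketch of the bottom two layers, with hypothetical `discount` and `save_order` functions; an in-memory SQLite connection stands in for the real database at the integration level:

```python
import sqlite3

# Unit level: pure logic, no I/O -- runs in microseconds.
def discount(total: float, is_member: bool) -> float:
    """Members get 10% off; totals never go negative."""
    return max(total * (0.9 if is_member else 1.0), 0.0)

def test_member_discount():
    assert discount(100.0, is_member=True) == 90.0

# Integration level: a real collaborator (here an in-memory SQLite DB)
# exercised through the same code path production uses.
def save_order(conn: sqlite3.Connection, total: float) -> int:
    cur = conn.execute("INSERT INTO orders (total) VALUES (?)", (total,))
    conn.commit()
    return cur.lastrowid

def test_save_order_roundtrip():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
    order_id = save_order(conn, 42.0)
    (total,) = conn.execute(
        "SELECT total FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    assert total == 42.0
```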
Five distinct things that get called "mock" in casual conversation. Knowing the difference is a senior tell.
| Double | What it is | Use for | Avoid |
|---|---|---|---|
| Dummy | Placeholder that satisfies a signature; never called. | Filling a required constructor parameter that the test path does not exercise. | Using a dummy when the test actually invokes the collaborator — silent pass. |
| Stub | Returns canned responses to configured calls. No assertions. | Testing code that queries something — you control what comes back. | Re-implementing the real dependency; stubs should be dumb. |
| Spy | Records calls made to it; test later asserts what happened. | Verifying a logger or analytics call was emitted correctly. | Over-asserting on internal calls — couples tests to implementation. |
| Mock | Pre-programmed with expectations; verifies calls during or after the test. | Verifying behavior at a seam, e.g., "payment service was called once with these args." | Mocking objects you own. Mock at the boundary of your system, not inside it. |
| Fake | Working implementation, but simplified (in-memory DB, in-process queue). | Integration-ish tests without the real dependency — fast and deterministic. | Divergence from real behavior. Test the real thing at least once in CI. |
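Three of the doubles sketched as plain classes, using hypothetical `rates`/`logger`/`user store` collaborators:

```python
class StubRates:
    """Stub: canned answers to queries; no assertions of its own."""
    def get(self, currency):
        return {"EUR": 2.0, "GBP": 1.5}[currency]

class SpyLogger:
    """Spy: records calls; the test asserts on them afterwards."""
    def __init__(self):
        self.messages = []
    def info(self, msg):
        self.messages.append(msg)

class FakeUserStore:
    """Fake: a real, working implementation -- just in-memory."""
    def __init__(self):
        self._users = {}
    def add(self, user_id, name):
        self._users[user_id] = name
    def get(self, user_id):
        return self._users.get(user_id)

def convert(amount, currency, rates, logger):
    """Code under test: queries the stub, emits to the spy."""
    result = amount * rates.get(currency)
    logger.info(f"converted {amount} {currency}")
    return result

def test_convert_uses_rate_and_logs():
    rates, logger = StubRates(), SpyLogger()
    assert convert(10, "GBP", rates, logger) == 15.0
    assert logger.messages == ["converted 10 GBP"]
```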
pytest idioms to know by heart

```python
import pytest

# Parametrize -- run the same test body against a table of inputs.
@pytest.mark.parametrize("n,expected", [(0, 0), (1, 1), (5, 120)])
def test_factorial(n, expected):
    assert factorial(n) == expected

# Fixtures -- setup before the yield, teardown after it.
@pytest.fixture
def db():
    conn = make_conn()
    yield conn
    conn.close()

@pytest.fixture(scope="session")  # or module, class, function
def app():
    return create_app()

# monkeypatch -- patch env vars and attributes; undone automatically.
def test_uses_env(monkeypatch):
    monkeypatch.setenv("FEATURE_X", "on")
    assert feature_enabled("x")

# tmp_path -- a fresh per-test directory as a pathlib.Path.
def test_writes_file(tmp_path):
    p = tmp_path / "out.txt"
    write_report(p)
    assert p.read_text() == "ok"

# pytest.raises -- assert an exception, optionally matching the message.
def test_rejects_negative():
    with pytest.raises(ValueError, match="must be >= 0"):
        withdraw(-1)

# Markers -- tag tests, then select with -m.
@pytest.mark.slow
def test_big_scan(): ...

# Run fast tests only:
# pytest -m "not slow"
```

The real causes of flakes

| Cause | Symptom | Fix |
|---|---|---|
| Time-dependent assertions | Test passes locally, fails on slow CI. "Expected 1.00, got 1.02." | Freeze time with `freezegun` or inject a clock. Compare with tolerance for timings. |
| Shared state between tests | Works alone; fails when another test runs first. Order-dependent. | Reset global/singleton state in teardown. Prefer pure functions and fresh fixtures. |
| Network / real external service | Fails when the network hiccups or the service is down for maintenance. | Fake / record-replay (VCR) for most tests. Real network only in a tagged slow suite. |
| Concurrency / races | Passes 99 times, fails 1 time. Harder on more cores. | Deterministic scheduling; inject a fake event loop; kill sleeps with explicit sync points. |
| Unseeded randomness | A rare input pattern fails once per week in CI. | Fix the seed in `conftest.py`. Record the failing input on assertion failure. |
| Filesystem / tmp collisions | Parallel test runs trip over each other on disk. | Use `tmp_path` (per-test dir). Never hard-code `/tmp/foo`. |
| Leaky test data | Today's test fails because yesterday's test left a row behind. | Rollback transaction per test; or truncate tables in teardown. Factories over fixtures for volume. |
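The time-dependence fix from the table, sketched without `freezegun`: inject a clock callable instead of calling `time.time()` directly (the `Session` class is hypothetical):

```python
import time

class Session:
    """Expiry logic takes a clock function, so tests control time."""
    def __init__(self, ttl_seconds, clock=time.time):
        self._clock = clock
        self._expires_at = clock() + ttl_seconds

    def is_expired(self):
        return self._clock() >= self._expires_at

def test_session_expiry_is_deterministic():
    now = [1_000.0]                              # mutable "current time"
    session = Session(ttl_seconds=60, clock=lambda: now[0])
    assert not session.is_expired()
    now[0] += 61                                 # advance the fake clock
    assert session.is_expired()
```

Production code uses the default `time.time`; tests pass a lambda, so the assertion never depends on CI speed.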
Test behavior, not implementation.
Implementation-coupled tests break on every refactor without catching real bugs.
Mock at the boundary — not inside your domain.
Mocking internal collaborators produces tests that pass even when the code is broken.
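A boundary mock in stdlib `unittest.mock`, with a hypothetical `checkout` function where the payment gateway is the seam:

```python
from unittest.mock import Mock

def checkout(cart_total, payment_gateway):
    """Domain code; the gateway is the system boundary."""
    if cart_total <= 0:
        raise ValueError("empty cart")
    payment_gateway.charge(amount=cart_total)
    return "ok"

def test_checkout_charges_gateway_once():
    gateway = Mock()
    assert checkout(25.0, gateway) == "ok"
    # Verify behavior at the seam: called once, with these args.
    gateway.charge.assert_called_once_with(amount=25.0)
```

The mock stands in for a service you don't own; the domain logic inside `checkout` runs for real.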
Prefer fakes to mocks when you can build one.
Fakes let you write integration tests that exercise real interactions cheaply.
One logical assertion per test.
A failing test's name should answer "what broke" in one line.
Structure the test body as Given-When-Then.
Readable failures and reviewable PRs.
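The structure in miniature, with a hypothetical `Account` class:

```python
class Account:
    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        if amount < 0:
            raise ValueError("must be >= 0")
        self.balance -= amount

def test_withdraw_reduces_balance():
    # Given: an account with funds
    account = Account(balance=100)
    # When: a withdrawal within the balance
    account.withdraw(30)
    # Then: the balance reflects it
    assert account.balance == 70
```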
Flaky tests must be fixed or deleted, not retried.
Retry-until-green teaches the team to distrust the suite. Trust is the point.
Fast feedback > total coverage.
A 30s suite that runs on every save beats a 10-min suite that runs nightly.
TDD — when to reach for it