SQL
Statements
| Statement | Description | Example |
|---|---|---|
| WHERE | Filters rows based on a specified condition. | SELECT * FROM table_name WHERE condition_1; |
| AND | Combines multiple conditions in a WHERE clause, requiring all to be true. | SELECT * FROM table_name WHERE condition_1 AND condition_2; |
| OR | Combines multiple conditions in a WHERE clause, requiring at least one to be true. | SELECT * FROM table_name WHERE condition_1 OR condition_2; |
| LIMIT | Limits the number of rows returned by a query. | SELECT * FROM table_name LIMIT 10; |
| GROUP BY | Groups rows that have the same values into summary rows. | SELECT column_name, COUNT(*) FROM table_name GROUP BY column_name; |
| ORDER BY | Sorts the result set by specified columns (ASC or DESC). | SELECT * FROM table_name ORDER BY column_name ASC|DESC; |
| COUNT | Counts the number of rows in a specified column or all rows. | SELECT COUNT(*) FROM table_name; |
| AS | Aliases are used to rename a table or column in a SQL query result. | SELECT column_name AS alias_name FROM table_name; |
| INNER JOIN | Returns records that have matching values in both tables. | SELECT * FROM table1 INNER JOIN table2 ON table1.column_name = table2.column_name; |
| LEFT JOIN | Returns all records from the left table and the matched records from the right table. | SELECT * FROM table1 LEFT JOIN table2 ON table1.column_name = table2.column_name; |
| RIGHT JOIN | Returns all records from the right table and the matched records from the left table. | SELECT * FROM table1 RIGHT JOIN table2 ON table1.column_name = table2.column_name; |
| FULL JOIN | Returns all records when there is a match in either left or right table. | SELECT * FROM table1 FULL JOIN table2 ON table1.column_name = table2.column_name; |
Indexing
Indexes trade write speed and disk space for read speed. Every index makes INSERT / UPDATE slower and takes storage — so index for the queries you actually run, not hypothetical ones.
| Kind | Use for | Gotcha |
|---|---|---|
| B-tree (default) | Equality and range queries on ordered columns; most WHERE, ORDER BY, JOIN conditions | Leading-column rule: a composite index on (a, b, c) can serve queries on a, (a, b), or (a, b, c) — but NOT on b or c alone |
| Covering index | Queries whose SELECT list is fully included in the index — the DB never touches the table | Adding columns bloats the index; only cover when the query is hot |
| Partial index | "Only index active rows" / "only non-null rows" — shrinks the index and speeds writes | The WHERE predicate on the index must match the query predicate exactly |
| Hash index | Equality only, at O(1) — some engines (PG) have it, most default to B-tree | Useless for range queries or ORDER BY |
| GIN / inverted | Full-text search, JSONB, array containment | Write amplification — slower INSERTs; worth it only for search-heavy workloads |
| Unique index | Enforcing uniqueness + an index for lookups in one structure | On a nullable column, multiple NULLs are allowed in most engines (they compare unequal) |
Senior insight
Transaction isolation
Isolation levels trade consistency guarantees for concurrency. Higher levels prevent more anomalies but serialize more work.
| Level | Dirty read | Non-repeatable read | Phantom read | Note |
|---|---|---|---|---|
| Read Uncommitted | Possible | Possible | Possible | Rarely a good choice — you see uncommitted writes from other transactions |
| Read Committed | No | Possible | Possible | Default in most engines (PostgreSQL, Oracle). Fine for most OLTP workloads |
| Repeatable Read | No | No | Possible (in standard); PG implements snapshot isolation here, blocking phantoms too | Default in MySQL/InnoDB. Chosen when re-reading rows in a transaction must yield the same values |
| Serializable | No | No | No | Strongest. Cost: more aborts under contention (you must retry). Use for financial invariants |
The anomalies in one sentence each
- Dirty read: you see a value another transaction wrote and then rolled back.
- Non-repeatable read: you re-read a row and get a different value because someone committed between your reads.
- Phantom read: you re-run a range query and new rows appear that a concurrent transaction committed.
- Lost update: two transactions read, modify, and write the same row; one overwrite is silently lost. Fix with SELECT … FOR UPDATE or optimistic version columns.
The N+1 query problem
A parent query returns N rows, then code loops issuing one child query per row — 1 + N round-trips to the DB. Ruins latency on anything more than a handful of rows.
Signature: You see the same child query in your DB logs repeated with different parameters, once per parent row.
Fixes
- Join the child query into the parent (INNER / LEFT JOIN) when the result shape allows
- Use `WHERE id IN (…)` with the list of parent IDs, then group in code
- In ORMs, use eager-loading hints (`include`, `joinedload`, `preload`) — but verify with query logs; some ORMs still issue N queries under the hood
Why interviewers love this
Every backend engineer has shipped an N+1 at least once. Being able to describe the signature in DB logs (not just the abstract definition) shows you have actually debugged one, not just read about it.
Reading EXPLAIN
Signals to look for in a query plan, and the typical fix.
| Signal | Meaning | Fix |
|---|---|---|
| Seq Scan on a large table | The planner is reading every row — either no usable index or the optimizer chose a scan because the predicate hits most rows | Add an index on the predicate column; check statistics with `ANALYZE` |
| Rows estimate vs actual off by 10×+ | Stale statistics — the planner is making bad choices from outdated row counts | Run `ANALYZE` (PostgreSQL) or `ANALYZE TABLE` (MySQL); consider autovacuum tuning |
| Nested Loop with a large outer side | For each outer row, probing the inner — fine if outer is small, catastrophic otherwise | Hint or rewrite to encourage Hash Join or Merge Join; ensure join column is indexed on the inner side |
| Sort in memory or external merge | ORDER BY or DISTINCT forcing a sort that exceeds `work_mem` / equivalent | Add an index that already provides the order; raise memory limits per query only if needed |
| Filter removing most rows after the scan | The index (or lack of it) pulled in rows the WHERE clause then discards | Add a composite or partial index so the filter is applied by the index, not after |
Pagination strategies
| Strategy | Example | Pros | Cons |
|---|---|---|---|
| Offset / Limit | SELECT * FROM items ORDER BY id LIMIT 20 OFFSET 1000 | Trivial to implement; supports jumping to page N | O(offset) — page 1000 reads + discards 20 000 rows. Also non-stable under concurrent writes — items shift between pages |
| Cursor / Keyset | SELECT * FROM items WHERE id > :last_id ORDER BY id LIMIT 20 | O(limit) regardless of depth; stable under concurrent writes | No "jump to page N"; requires a strictly-ordered indexed column |
Rule of thumb