B-Trees & B+ Trees
debt(d7/e5/b5/t7)
Closest to 'only careful code review or runtime testing' (d7) — EXPLAIN plans (db_explain_advanced) reveal index misuse, but it requires a human reading query plans; no automated linter flags bad index design or column ordering.
Closest to 'touches multiple files / significant refactor' (e5) — fixing index design (reordering composite columns, switching from UUID v4 to ordered IDs) involves schema migrations, possibly application code changes, and reindexing large tables, not a one-line patch.
Closest to 'persistent productivity tax' (b5) — index strategy applies across web/cli contexts and shapes query performance everywhere; poor choices (UUID PKs, over-indexing write-heavy tables) impose ongoing costs on many work streams.
Closest to 'serious trap' (t7) — the misconception conflates BSTs with B-trees, and the leftmost-prefix rule plus range-before-equality gotcha contradicts intuition; developers reasonably expect indexes to 'just work' regardless of column order.
Also Known As
TL;DR
Explanation
A B-Tree stores sorted keys in nodes with multiple children. B+ Trees (the standard in databases) store all data in leaf nodes; internal nodes only store keys for routing. Leaf nodes are linked — enabling efficient range scans. Properties: balanced (all leaves at same depth), fan-out of hundreds (each node holds many keys), O(log n) search/insert/delete. InnoDB, PostgreSQL, and most databases use B+ Trees for indexes. The high fan-out (order 100+) means even billion-row tables have trees only 3-4 levels deep.
Common Misconception
Why It Matters
Common Mistakes
- Adding indexes on low-cardinality columns — a boolean column has only 2 values; the B-tree provides no benefit.
- Not understanding that a full-table scan can be faster than an index for large result sets.
- UUID v4 primary keys causing B-tree fragmentation — random inserts split nodes constantly.
- Too many indexes on a write-heavy table — every insert/update must update all B-trees.
Code Examples
-- B-tree fragmented by UUID v4 primary key:
CREATE TABLE events (
id CHAR(36) PRIMARY KEY DEFAULT (UUID()), -- Random UUID
data JSON
);
-- Every INSERT goes to a random leaf position
-- Constant node splits and page fills
-- Index fragmentation: 70% over time
-- B-tree stays ordered with UUID v7 (time-sorted):
CREATE TABLE events (
id BINARY(16) PRIMARY KEY, -- UUID v7: timestamp-prefixed
data JSON
);
-- Inserts always append near the rightmost leaf
-- Minimal fragmentation
-- Same O(log n) lookup, better write performance
-- Check fragmentation:
SHOW TABLE STATUS LIKE events; -- Data_free shows fragmentation