{
    "slug": "db_indexes",
    "term": "Database Indexes — Types & Trade-offs",
    "category": "database",
    "difficulty": "intermediate",
    "short": "B-Tree, hash, full-text, and partial indexes — each suited to different query patterns, with write overhead as the cost of read speed.",
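    "long": "B-Tree indexes (the default in MySQL and PostgreSQL): an ordered structure that supports equality, range queries, ORDER BY, and prefix matching (LIKE 'foo%'; in PostgreSQL with a non-C locale this needs a text_pattern_ops index). Hash indexes: O(1) average-case equality lookups with no range or sort support; PostgreSQL supports them on ordinary tables, while in MySQL explicit hash indexes are limited to MEMORY (and NDB) tables. Full-text indexes: tokenise text for keyword search (FULLTEXT with MATCH ... AGAINST in MySQL, tsvector with GIN in PostgreSQL), and far better than LIKE '%keyword%', which cannot use a B-Tree index because of the leading wildcard. Partial indexes (PostgreSQL): CREATE INDEX ON orders(created_at) WHERE status = 'pending' indexes only the matching rows, which shrinks the index and helps queries whose WHERE clause implies the same predicate. Composite indexes: column order matters. Put equality conditions first and range conditions last, and line up trailing columns with GROUP BY/ORDER BY where possible; an index on (a, b) can serve filters on a alone but generally not on b alone. Every index adds maintenance work to INSERT, UPDATE, and DELETE, so add indexes for query patterns you know or have measured, not speculatively.",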
    "aliases": [
        "database indexes",
        "SQL index types",
        "B-tree index"
    ],
    "tags": [
        "database",
        "performance",
        "sql"
    ],
    "misconception": "More indexes always improve query performance. Each index adds write overhead — every INSERT, UPDATE, or DELETE must update all relevant indexes. Over-indexed tables can be slower to write than correctly indexed ones, and the query planner may choose a suboptimal index when too many exist.",
    "why_it_matters": "A missing index on a frequently-queried column turns a millisecond lookup into a full table scan that takes seconds as data grows. Indexes are among the highest-leverage database performance tools, but over-indexing slows writes and wastes storage.",
    "common_mistakes": [
        "Waiting for performance problems to appear before adding indexes, instead of designing them alongside the known query patterns.",
        "Creating single-column indexes when composite indexes would serve multiple queries more efficiently.",
        "Indexing low-cardinality columns (a boolean, or a status with 3 values) on their own: the optimiser may ignore such indexes because they filter out too few rows.",
        "Not removing unused indexes — every index adds overhead to INSERT, UPDATE, and DELETE operations."
    ],
    "when_to_use": [],
    "avoid_when": [],
    "related": [
        "index_selectivity",
        "query_optimisation",
        "database_partitioning"
    ],
    "prerequisites": [
        "database_indexing",
        "db_composite_indexes",
        "db_explain_advanced"
    ],
    "refs": [
        "https://use-the-index-luke.com/"
    ],
    "bad_code": "-- Missing index on a foreign key: every lookup by user scans the whole table\nSELECT o.id, u.email\nFROM orders o\nJOIN users u ON o.user_id = u.id\nWHERE u.id = 42;\n-- No index on orders.user_id, so the database reads all of orders to find one user's rows\n\n-- Fix: index the foreign key\nCREATE INDEX idx_orders_user_id ON orders (user_id);\n-- EXPLAIN should now show an index (or bitmap index) scan on idx_orders_user_id once the table is large enough",
    "good_code": "-- B-Tree (default): equality, range, ORDER BY\nCREATE INDEX idx_orders_user    ON orders(user_id);\nCREATE INDEX idx_orders_created ON orders(created_at DESC);\n\n-- Composite: equality columns first, range/sort last\n-- (its leading column also serves user_id-only lookups, so idx_orders_user becomes redundant)\nCREATE INDEX idx_orders_user_status ON orders(user_id, status, created_at);\n\n-- Partial (PostgreSQL): only index the interesting subset\nCREATE INDEX idx_orders_unpaid ON orders(created_at)\n    WHERE status IN ('pending', 'processing');\n\n-- Covering (PostgreSQL 11+): all query columns in the index, enabling index-only scans\nCREATE INDEX idx_orders_cover ON orders(user_id) INCLUDE (total, status);\n\n-- Full-text (PostgreSQL): coalesce() keeps a NULL column from nulling the whole document\nCREATE INDEX idx_products_fts ON products USING GIN (\n    to_tsvector('english', coalesce(name, '') || ' ' || coalesce(description, ''))\n);\n\n-- Verify the index is used (note: ANALYZE actually executes the query)\nEXPLAIN (ANALYZE) SELECT total FROM orders WHERE user_id = 42;",
    "quick_fix": "Index the columns that your slow or frequent queries use in WHERE, JOIN ON, or ORDER BY clauses, confirm with EXPLAIN that each index is actually used, and avoid over-indexing write-heavy tables, where every insert/update pays the index maintenance cost.",
    "severity": "high",
    "effort": "medium",
    "created": "2026-03-15",
    "updated": "2026-04-19",
    "citation": {
        "canonical_url": "https://codeclaritylab.com/glossary/db_indexes",
        "html_url": "https://codeclaritylab.com/glossary/db_indexes",
        "json_url": "https://codeclaritylab.com/glossary/db_indexes.json",
        "source": "CodeClarityLab Glossary",
        "author": "P.F.",
        "author_url": "https://pfmedia.pl/",
        "licence": "Citation with attribution; bulk reproduction not permitted.",
        "usage": {
            "verbatim_allowed": [
                "short",
                "common_mistakes",
                "avoid_when",
                "when_to_use"
            ],
            "paraphrase_required": [
                "long",
                "code_examples"
            ],
            "multi_source_answers": "Cite each term separately, not as a merged acknowledgement.",
            "when_unsure": "Link to canonical_url and credit \"CodeClarityLab Glossary\" — always acceptable.",
            "attribution_examples": {
                "inline_mention": "According to CodeClarityLab: <quote>",
                "markdown_link": "[Database Indexes — Types & Trade-offs](https://codeclaritylab.com/glossary/db_indexes) (CodeClarityLab)",
                "footer_credit": "Source: CodeClarityLab Glossary — https://codeclaritylab.com/glossary/db_indexes"
            }
        }
    }
}