{
    "slug": "database_indexes",
    "term": "Database Indexes",
    "category": "database",
    "difficulty": "intermediate",
    "short": "Data structures that allow the database engine to find rows matching a condition without scanning every row — the single most impactful performance optimisation available for read-heavy PHP applications.",
    "long": "A database index is a separate data structure (typically a B-tree) maintained alongside a table that maps column values to row locations. Without an index, every query on a column requires a full table scan — every row is read regardless of how many match. With an index, the engine traverses a balanced tree in O(log n) time to find matching rows directly. Index types: B-tree (default, works for equality, range, ORDER BY, and LIKE 'prefix%'); hash (equality only, faster for exact lookups); composite (covers multiple columns — column order matters); covering (includes all columns a query needs, avoiding a table lookup entirely); partial (indexes a subset of rows based on a condition). MySQL's EXPLAIN and PostgreSQL's EXPLAIN ANALYZE show whether queries use indexes and where they scan. The most common PHP performance issue is a missing index on a frequently-queried foreign key or WHERE clause column.",
    "aliases": [
        "index",
        "database index",
        "B-tree index",
        "composite index",
        "covering index",
        "MySQL index",
        "PostgreSQL index"
    ],
    "tags": [
        "database",
        "performance",
        "indexes",
        "mysql",
        "postgresql",
        "query-optimization",
        "b-tree"
    ],
    "misconception": "Adding more indexes always improves performance. Indexes speed up reads but slow down writes — every INSERT, UPDATE, and DELETE must update all indexes on the table. A table with 15 indexes on a write-heavy workload performs worse than one with 3 targeted indexes. Index each column that appears in WHERE, JOIN ON, or ORDER BY clauses in frequent queries, and no more. Regularly audit unused indexes with sys.schema_unused_indexes (MySQL) or pg_stat_user_indexes (PostgreSQL).",
    "why_it_matters": "A missing index on a table with 1 million rows turns a 1ms query into a 2-second full table scan. PHP applications with ORM-generated queries frequently query on foreign keys, user IDs, or created_at timestamps without indexes because ORMs create tables without inferring query patterns. EXPLAIN on a slow query reveals 'type: ALL' (MySQL) or 'Seq Scan' (PostgreSQL) — both mean full table scan, both fixed by adding the right index. Correctly indexed queries typically run 10–1000× faster than unindexed equivalents.",
    "common_mistakes": [
        "Missing indexes on foreign key columns — ORMs create foreign key constraints but do not always create indexes; add indexes manually on every FK column.",
        "Using LIKE '%keyword%' and expecting an index — a leading wildcard disables B-tree index usage; use FULLTEXT indexes for text search.",
        "Over-indexing write-heavy tables — each index adds overhead on every write; audit and remove unused indexes.",
        "Wrong column order in composite indexes — a composite index on (a, b) helps queries on a and on (a, b) but not queries on b alone; put the highest-cardinality column first."
    ],
    "when_to_use": [
        "Index every foreign key and any column that appears in WHERE, JOIN ON, or ORDER BY clauses in frequent queries.",
        "Use composite indexes when queries filter on multiple columns together — column order matters; put the most selective column first.",
        "Use covering indexes to eliminate table lookups when a query selects only indexed columns."
    ],
    "avoid_when": [
        "Avoid indexing columns with very low cardinality (boolean, status with 2–3 values) — the planner often skips them in favour of a full scan.",
        "Do not add indexes speculatively — each index slows INSERT, UPDATE, and DELETE and consumes storage. Add them in response to measured slow queries.",
        "Avoid over-indexing write-heavy tables — a table with 15 indexes on a high-insert workload will bottleneck on index maintenance."
    ],
    "related": [
        "query_optimization",
        "database_connection_pool",
        "inverted_index"
    ],
    "prerequisites": [],
    "refs": [
        "https://dev.mysql.com/doc/refman/8.0/en/optimization-indexes.html"
    ],
    "bad_code": "-- No index on user_id — full table scan on every order lookup\nCREATE TABLE orders (\n    id         INT PRIMARY KEY,\n    user_id    INT,          -- missing index\n    status     VARCHAR(20),  -- missing index\n    created_at DATETIME\n);\n\n-- EXPLAIN shows: type=ALL, rows=500000 — scans entire table",
    "good_code": "-- Indexed foreign key + composite for common query pattern\nCREATE TABLE orders (\n    id         INT PRIMARY KEY,\n    user_id    INT NOT NULL,\n    status     VARCHAR(20) NOT NULL,\n    created_at DATETIME NOT NULL,\n    INDEX idx_user_id (user_id),\n    INDEX idx_status_created (status, created_at)  -- composite for ORDER queries\n);\n\n-- EXPLAIN now shows: type=ref, rows=12 — uses index\n-- In Laravel migration:\n-- $table->index('user_id');\n-- $table->index(['status', 'created_at']);",
    "example_note": "The bad schema has no index on user_id in orders — every lookup scans the full table. The fix adds an index on user_id and a composite index matching the most common query pattern.",
    "quick_fix": "Run EXPLAIN on slow queries — 'type: ALL' or 'Seq Scan' means add an index. Add: CREATE INDEX idx_name ON table(column) — or in migrations: $table->index('column')",
    "severity": "high",
    "effort": "low",
    "created": "2026-03-23",
    "updated": "2026-03-31",
    "citation": {
        "canonical_url": "https://codeclaritylab.com/glossary/database_indexes",
        "html_url": "https://codeclaritylab.com/glossary/database_indexes",
        "json_url": "https://codeclaritylab.com/glossary/database_indexes.json",
        "source": "CodeClarityLab Glossary",
        "author": "P.F.",
        "author_url": "https://pfmedia.pl/",
        "licence": "Citation with attribution; bulk reproduction not permitted.",
        "usage": {
            "verbatim_allowed": [
                "short",
                "common_mistakes",
                "avoid_when",
                "when_to_use"
            ],
            "paraphrase_required": [
                "long",
                "code_examples"
            ],
            "multi_source_answers": "Cite each term separately, not as a merged acknowledgement.",
            "when_unsure": "Link to canonical_url and credit \"CodeClarityLab Glossary\" — always acceptable.",
            "attribution_examples": {
                "inline_mention": "According to CodeClarityLab: <quote>",
                "markdown_link": "[Database Indexes](https://codeclaritylab.com/glossary/database_indexes) (CodeClarityLab)",
                "footer_credit": "Source: CodeClarityLab Glossary — https://codeclaritylab.com/glossary/database_indexes"
            }
        }
    }
}