← Back to glossary

Database Partitioning

Performance PHP 5.0+ Advanced

debt(d7/e7/b7/t5)

d7 Detectability Operational debt — how invisible misuse is to your safety net

Closest to 'only careful code review or runtime testing' (d7). While tools like mysql-explain, pg-partman, and pganalyze can help identify partitioning opportunities or issues, detecting whether partitioning is misapplied (wrong partition key, unnecessary partitioning, missing partition key in primary key) requires careful analysis of query patterns and table sizes — not something caught automatically.

e7 Effort Remediation debt — work required to fix once spotted

Closest to 'cross-cutting refactor across the codebase' (e7). Changing a partition key or adding partitioning to an existing large table requires restructuring the table, potentially migrating millions of rows, updating primary keys to include partition columns, and verifying all queries use the partition key. This is not architectural rework but significantly more than a single-component refactor.

b7 Burden Structural debt — long-term weight of choosing wrong

Closest to 'strong gravitational pull' (b7). Once a partitioning scheme is chosen, the partition key becomes a fundamental constraint — queries must filter on it for efficiency, primary keys must include it, and archival/retention policies depend on it. Every future schema change and query design must respect this choice. Applies to both web and cli contexts per applies_to.

t5 Trap Cognitive debt — how counter-intuitive correct behaviour is

Closest to 'notable trap' (t5). The misconception clearly states developers confuse partitioning with sharding — a documented gotcha that most devs eventually learn. Additionally, common_mistakes show predictable pitfalls: assuming partitioning replaces indexing, or partitioning tables too small to benefit. These are traps but learnable ones, not contradictions of how similar concepts work.

About DEBT scoring → scored by claude-opus-4-5-20251101 · 2026-05-07 · reviewed by human

Also Known As

table partitioning horizontal partitioning range partitioning

TL;DR

Splitting a large table into physical segments by range, list, or hash — enabling partition pruning, faster archival, and parallel scans.

Explanation

Partitioning divides a single logical table into multiple physical storage segments. MySQL and PostgreSQL support: Range partitioning (by date range — perfect for time-series data and rolling retention), List partitioning (by discrete values — region or status), Hash partitioning (distributes rows evenly), and Composite forms. Benefits: partition pruning (a query on the last 7 days scans only recent partitions), fast archival (DROP PARTITION is instant vs DELETE on millions of rows), and parallel scan on multi-core hardware. Unlike sharding, partitioning is transparent to the application — all partitions appear as one table with no application-layer changes required.

Diagram

flowchart TD
    TABLE[(orders table<br/>100M rows)] --> PART{Partition by}
    subgraph Range_Partitioning
        PART -->|created_at| P2024[(2024 partition)]
        PART -->|created_at| P2025[(2025 partition)]
        PART -->|created_at| P2026[(2026 partition)]
    end
    subgraph List_Partitioning
        PART -->|country| PUK[(UK partition)]
        PART -->|country| PUS[(US partition)]
    end
    QUERY[SELECT WHERE created_at > 2026] -->|partition pruning| P2026
    QUERY -.->|skips| P2024
style TABLE fill:#f85149,color:#fff
style P2026 fill:#238636,color:#fff
style P2024 fill:#6e40c9,color:#fff
style QUERY fill:#1f6feb,color:#fff

Common Misconception

✗ Partitioning and sharding are the same technique. Partitioning divides a table within a single database server — the database engine manages it transparently. Sharding distributes data across multiple servers, requiring application-level routing logic.

Why It Matters

Partitioning splits a large table into smaller physical segments — queries that filter on the partition key only scan relevant partitions, and old partitions can be archived or dropped instantly.

Common Mistakes

Partitioning tables that are not actually large enough to benefit — adds complexity with no gain.
Choosing a partition key that most queries don't filter on — all partitions are scanned anyway.
Not including the partition key in the primary key — required in most databases for partitioned tables.
Assuming partitioning replaces indexing — indexes within partitions are still needed for non-partition-key filters.

Code Examples

✗ Vulnerable

-- Without partitioning: DELETE of old rows is slow and locks the table:
DELETE FROM events WHERE created_at < '2023-01-01'; -- Deletes millions of rows slowly

-- With range partitioning by month:
-- DROP PARTITION p_2023_01; -- Instant, no row-by-row deletion

✓ Fixed

-- PostgreSQL range partitioning by date — each partition is a separate table
CREATE TABLE events (
    id         BIGSERIAL,
    created_at TIMESTAMPTZ NOT NULL,
    payload    JSONB
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2024_q1 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');
CREATE TABLE events_2024_q2 PARTITION OF events
    FOR VALUES FROM ('2024-04-01') TO ('2024-07-01');

-- Queries filtered on created_at only scan the relevant partition
-- Old partitions can be detached and archived without touching live data
ALTER TABLE events DETACH PARTITION events_2024_q1;

References

https://dev.mysql.com/doc/refman/8.0/en/partitioning.html