Database Partitioning
debt(d7/e7/b7/t5)
Closest to 'only careful code review or runtime testing' (d7). While tools like mysql-explain, pg-partman, and pganalyze can help identify partitioning opportunities or issues, detecting whether partitioning is misapplied (wrong partition key, unnecessary partitioning, missing partition key in primary key) requires careful analysis of query patterns and table sizes — not something caught automatically.
Closest to 'cross-cutting refactor across the codebase' (e7). Changing a partition key or adding partitioning to an existing large table requires restructuring the table, potentially migrating millions of rows, updating primary keys to include partition columns, and verifying all queries use the partition key. This is not architectural rework but significantly more than a single-component refactor.
Closest to 'strong gravitational pull' (b7). Once a partitioning scheme is chosen, the partition key becomes a fundamental constraint — queries must filter on it for efficiency, primary keys must include it, and archival/retention policies depend on it. Every future schema change and query design must respect this choice. Applies to both web and cli contexts per applies_to.
Closest to 'notable trap' (t5). The misconception clearly states developers confuse partitioning with sharding — a documented gotcha that most devs eventually learn. Additionally, common_mistakes show predictable pitfalls: assuming partitioning replaces indexing, or partitioning tables too small to benefit. These are traps but learnable ones, not contradictions of how similar concepts work.
Also Known As
TL;DR
Explanation
Partitioning divides a single logical table into multiple physical storage segments. MySQL and PostgreSQL both support range partitioning (for example by date range, which suits time-series data and rolling retention), list partitioning (by discrete values such as region or status), hash partitioning (spreads rows evenly across a fixed number of partitions), and composite forms that combine these. Benefits: partition pruning (a query on the last 7 days scans only the recent partitions), fast archival (dropping a partition is a near-instant metadata operation, whereas DELETE must remove millions of rows one by one), and, where the engine supports it, scanning partitions in parallel on multi-core hardware. Unlike sharding, which spreads data across separate database servers, partitioning keeps everything on one server and is largely transparent to the application: all partitions appear as one table, and queries need no application-layer changes.
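The list and hash strategies described above can be sketched in PostgreSQL as follows. This is a minimal illustration, not a recommended schema: the `customers` and `sessions` tables, their columns, and the choice of four hash partitions are all assumptions made for the example. Hash partitioning and the `DEFAULT` partition require PostgreSQL 11 or later.

```sql
-- Hypothetical tables for illustration only (PostgreSQL 11+ syntax).

-- List partitioning: each row is routed by a discrete value.
CREATE TABLE customers (
    id      BIGINT NOT NULL,
    country TEXT   NOT NULL,
    name    TEXT
) PARTITION BY LIST (country);

CREATE TABLE customers_uk PARTITION OF customers FOR VALUES IN ('UK');
CREATE TABLE customers_us PARTITION OF customers FOR VALUES IN ('US');
-- Catch-all for any country not listed above.
CREATE TABLE customers_other PARTITION OF customers DEFAULT;

-- Hash partitioning: rows are spread across a fixed number of partitions.
CREATE TABLE sessions (
    user_id BIGINT NOT NULL,
    data    JSONB
) PARTITION BY HASH (user_id);

CREATE TABLE sessions_p0 PARTITION OF sessions FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE sessions_p1 PARTITION OF sessions FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE sessions_p2 PARTITION OF sessions FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE sessions_p3 PARTITION OF sessions FOR VALUES WITH (MODULUS 4, REMAINDER 3);
```

List partitioning tends to fit when queries filter on the listed value; hash partitioning mainly balances storage and I/O and only prunes on equality matches against the hash key.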
Diagram
flowchart TD
TABLE[(orders table<br/>100M rows)] --> PART{Partition by}
subgraph Range_Partitioning
PART -->|created_at| P2024[(2024 partition)]
PART -->|created_at| P2025[(2025 partition)]
PART -->|created_at| P2026[(2026 partition)]
end
subgraph List_Partitioning
PART -->|country| PUK[(UK partition)]
PART -->|country| PUS[(US partition)]
end
QUERY[SELECT WHERE created_at > 2026] -->|partition pruning| P2026
QUERY -.->|skips| P2024
style TABLE fill:#f85149,color:#fff
style P2026 fill:#238636,color:#fff
style P2024 fill:#6e40c9,color:#fff
style QUERY fill:#1f6feb,color:#fff
Common Misconception
Why It Matters
Common Mistakes
- Partitioning tables that are not large enough to benefit: this adds complexity with no gain.
- Choosing a partition key that most queries don't filter on: the planner cannot prune, so all partitions are scanned anyway.
- Not including the partition key in the primary key: both MySQL and PostgreSQL require every primary key and unique constraint on a partitioned table to include the partition key columns.
- Assuming partitioning replaces indexing: indexes within partitions are still needed for filters on columns other than the partition key.
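The last two mistakes can be illustrated with a short PostgreSQL sketch. The `orders` table, its columns, and the index name are hypothetical; the point is only the shape of the primary key and the extra index.

```sql
-- Hypothetical table for illustration only (PostgreSQL 11+ syntax).
CREATE TABLE orders (
    id          BIGINT      NOT NULL,
    customer_id BIGINT      NOT NULL,
    created_at  TIMESTAMPTZ NOT NULL,
    total_cents BIGINT      NOT NULL,
    -- PRIMARY KEY (id) alone is rejected on a partitioned table:
    -- the constraint must include the partition key column.
    PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);

CREATE TABLE orders_2025 PARTITION OF orders
    FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');

-- Partitioning does not help a filter on customer_id, so index it.
-- An index created on the parent is created on every partition as well.
CREATE INDEX orders_customer_id_idx ON orders (customer_id);
```

One consequence worth noting: with a composite primary key, `id` by itself is no longer guaranteed unique by the database, so uniqueness of `id` has to come from how the values are generated.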
Code Examples
-- Without partitioning: deleting old rows is slow and holds locks while it runs:
DELETE FROM events WHERE created_at < '2023-01-01'; -- Deletes millions of rows one by one
-- With range partitioning by month (MySQL syntax; the partition name is illustrative):
-- ALTER TABLE events DROP PARTITION p_2023_01; -- Near-instant, no row-by-row deletion
-- PostgreSQL range partitioning by date — each partition is a separate table
CREATE TABLE events (
    id         BIGSERIAL,
    created_at TIMESTAMPTZ NOT NULL,
    payload    JSONB
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2024_q1 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');
CREATE TABLE events_2024_q2 PARTITION OF events
    FOR VALUES FROM ('2024-04-01') TO ('2024-07-01');
-- Queries filtered on created_at only scan the relevant partition
-- Old partitions can be detached and archived without touching live data
ALTER TABLE events DETACH PARTITION events_2024_q1;
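As a follow-up to the example above, one way to check that pruning is actually happening is to inspect the query plan. This is a sketch that assumes the `events` table and partitions defined above; the date range is arbitrary, and the exact plan text varies by PostgreSQL version.

```sql
-- Inspect the plan: with pruning, only events_2024_q2 should appear in it.
EXPLAIN (COSTS OFF)
SELECT count(*)
FROM events
WHERE created_at >= '2024-05-01'
  AND created_at <  '2024-06-01';

-- After DETACH, events_2024_q1 is an ordinary standalone table.
-- Once it has been archived (for example with pg_dump), it can be dropped:
DROP TABLE events_2024_q1;
```

If the plan lists every partition, the query is probably not filtering on the partition key, or is wrapping it in an expression the planner cannot match against the partition bounds.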