← CodeClarityLab Home
Browse by Category
+ added · updated 7d
← Back to glossary

Database Sharding

performance PHP 5.0+ Advanced

Also Known As

DB sharding horizontal scaling DB shard key

TL;DR

Horizontally partitioning data across multiple database instances by a shard key, scaling write throughput beyond a single server's limits.

Explanation

Sharding splits a dataset across multiple independent database nodes (shards) using a partition key (user_id % num_shards, geographic region, etc.). Each shard holds a subset of rows and handles reads/writes for its partition independently. Benefits: linear write scalability, smaller index sizes per shard, geographic distribution. Costs: cross-shard queries and joins become application-level operations, transactions span shard boundaries only with distributed protocols (2PC), rebalancing shards as data grows is operationally complex. For most PHP applications, read replicas, caching, and query optimisation should be exhausted before sharding.

Diagram

flowchart TD
    APP[Application] --> ROUTER[Shard Router<br/>which shard?]
    ROUTER -->|user_id 0-33pct| S1[(Shard 1<br/>users 0-9999)]
    ROUTER -->|user_id 34-66pct| S2[(Shard 2<br/>users 10k-19k)]
    ROUTER -->|user_id 67-100pct| S3[(Shard 3<br/>users 20k-30k)]
    subgraph Shard_Keys
        HASH[Hash sharding<br/>uniform distribution]
        RANGE[Range sharding<br/>range queries efficient]
        DIR[Directory sharding<br/>flexible lookup]
    end
    CROSS[Cross-shard queries<br/>expensive - avoid] -.->|design to prevent| ROUTER
style ROUTER fill:#6e40c9,color:#fff
style S1 fill:#1f6feb,color:#fff
style S2 fill:#238636,color:#fff
style S3 fill:#d29922,color:#fff
style CROSS fill:#f85149,color:#fff

Common Misconception

Sharding is the first step when a database becomes slow. Sharding is a last resort — it introduces enormous complexity around cross-shard queries, transactions, and rebalancing. Read replicas, caching, query optimisation, and partitioning should all be exhausted first.

Why It Matters

Sharding distributes data across multiple database servers — each shard holds a subset of data, enabling horizontal scaling beyond what a single server can handle.

Common Mistakes

  • Sharding before exhausting vertical scaling and read replicas — premature sharding adds enormous complexity.
  • Choosing a shard key that creates hot shards — all traffic to one shard defeats the purpose.
  • Cross-shard queries and JOINs — they require scatter-gather operations that negate sharding benefits.
  • Not planning for shard rebalancing — adding a new shard requires migrating data, which is operationally painful.

Code Examples

✗ Vulnerable
// Shard key on auto-increment ID — hot shard problem:
// All new writes go to the last shard (highest IDs)
// Shard 1: ids 1-1M (cold), Shard 2: ids 1M-2M (cold), Shard 3: ids 2M+ (all writes)
// Better shard key: hash(user_id) — distributes writes evenly
✓ Fixed
// Horizontal sharding — route by user_id modulo shard count
class ShardRouter {
    private array $shards; // array of PDO connections

    public function getConnection(int $userId): PDO {
        $shardIndex = $userId % count($this->shards);
        return $this->shards[$shardIndex];
    }
}

// Usage
$pdo = $router->getConnection($userId);
$pdo->prepare('SELECT * FROM orders WHERE user_id = ?')->execute([$userId]);

// Key design decisions:
// - Choose a shard key with high cardinality and even distribution
// - Avoid cross-shard JOINs — denormalise or use an aggregation layer
// - Consistent hashing handles shard rebalancing with minimal data movement

Added 15 Mar 2026
Edited 22 Mar 2026
Views 52
Rate this term
No ratings yet
🤖 AI Guestbook educational data only
| |
Last 30 days
0 pings W 0 pings T 0 pings F 0 pings S 0 pings S 0 pings M 0 pings T 0 pings W 0 pings T 0 pings F 1 ping S 0 pings S 0 pings M 0 pings T 0 pings W 1 ping T 2 pings F 4 pings S 0 pings S 0 pings M 0 pings T 1 ping W 0 pings T 1 ping F 1 ping S 0 pings S 1 ping M 0 pings T 0 pings W 0 pings T
No pings yet today
No pings yesterday
Amazonbot 16 Perplexity 11 Google 4 Ahrefs 3 SEMrush 3 Unknown AI 2 Meta AI 1 DuckDuckGo 1
crawler 41
DEV INTEL Tools & Severity
🔵 Info ⚙ Fix effort: High
⚡ Quick Fix
Exhaust vertical scaling and read replicas before sharding — sharding adds enormous complexity; when you do shard, choose a key that distributes writes evenly
📦 Applies To
PHP 5.0+ web cli
🔗 Prerequisites
🔍 Detection Hints
Single database write bottleneck after exhausting indexing caching replicas and vertical scaling
Auto-detectable: ✗ No vitess planetscale
⚠ Related Problems
🤖 AI Agent
Confidence: Low False Positives: Medium ✗ Manual fix Fix: High Context: File Tests: Update

✓ schema.org compliant