Database Sharding
Also Known As
DB sharding
horizontal scaling DB
shard key
TL;DR
Horizontally partitioning data across multiple database instances by a shard key, scaling write throughput beyond a single server's limits.
Explanation
Sharding splits a dataset across multiple independent database nodes (shards) using a partition key (user_id % num_shards, geographic region, etc.). Each shard holds a subset of rows and handles reads/writes for its partition independently. Benefits: linear write scalability, smaller index sizes per shard, geographic distribution. Costs: cross-shard queries and joins become application-level operations, transactions span shard boundaries only with distributed protocols (2PC), rebalancing shards as data grows is operationally complex. For most PHP applications, read replicas, caching, and query optimisation should be exhausted before sharding.
Diagram
flowchart TD
APP[Application] --> ROUTER[Shard Router<br/>which shard?]
ROUTER -->|user_id 0-33pct| S1[(Shard 1<br/>users 0-9999)]
ROUTER -->|user_id 34-66pct| S2[(Shard 2<br/>users 10k-19k)]
ROUTER -->|user_id 67-100pct| S3[(Shard 3<br/>users 20k-30k)]
subgraph Shard_Keys
HASH[Hash sharding<br/>uniform distribution]
RANGE[Range sharding<br/>range queries efficient]
DIR[Directory sharding<br/>flexible lookup]
end
CROSS[Cross-shard queries<br/>expensive - avoid] -.->|design to prevent| ROUTER
style ROUTER fill:#6e40c9,color:#fff
style S1 fill:#1f6feb,color:#fff
style S2 fill:#238636,color:#fff
style S3 fill:#d29922,color:#fff
style CROSS fill:#f85149,color:#fff
Common Misconception
✗ Sharding is the first step when a database becomes slow. Sharding is a last resort — it introduces enormous complexity around cross-shard queries, transactions, and rebalancing. Read replicas, caching, query optimisation, and partitioning should all be exhausted first.
Why It Matters
Sharding distributes data across multiple database servers — each shard holds a subset of data, enabling horizontal scaling beyond what a single server can handle.
Common Mistakes
- Sharding before exhausting vertical scaling and read replicas — premature sharding adds enormous complexity.
- Choosing a shard key that creates hot shards — all traffic to one shard defeats the purpose.
- Cross-shard queries and JOINs — they require scatter-gather operations that negate sharding benefits.
- Not planning for shard rebalancing — adding a new shard requires migrating data, which is operationally painful.
Code Examples
✗ Vulnerable
// Shard key on auto-increment ID — hot shard problem:
// All new writes go to the last shard (highest IDs)
// Shard 1: ids 1-1M (cold), Shard 2: ids 1M-2M (cold), Shard 3: ids 2M+ (all writes)
// Better shard key: hash(user_id) — distributes writes evenly
✓ Fixed
// Horizontal sharding — route by user_id modulo shard count
class ShardRouter {
private array $shards; // array of PDO connections
public function getConnection(int $userId): PDO {
$shardIndex = $userId % count($this->shards);
return $this->shards[$shardIndex];
}
}
// Usage
$pdo = $router->getConnection($userId);
$pdo->prepare('SELECT * FROM orders WHERE user_id = ?')->execute([$userId]);
// Key design decisions:
// - Choose a shard key with high cardinality and even distribution
// - Avoid cross-shard JOINs — denormalise or use an aggregation layer
// - Consistent hashing handles shard rebalancing with minimal data movement
Tags
🤝 Adopt this term
£79/year · your link shown here
Added
15 Mar 2026
Edited
22 Mar 2026
Views
52
🤖 AI Guestbook educational data only
|
|
Last 30 days
Agents 0
No pings yet today
No pings yesterday
Amazonbot 16
Perplexity 11
Google 4
Ahrefs 3
SEMrush 3
Unknown AI 2
Meta AI 1
DuckDuckGo 1
Also referenced
How they use it
crawler 41
Related categories
⚡
DEV INTEL
Tools & Severity
🔵 Info
⚙ Fix effort: High
⚡ Quick Fix
Exhaust vertical scaling and read replicas before sharding — sharding adds enormous complexity; when you do shard, choose a key that distributes writes evenly
📦 Applies To
PHP 5.0+
web
cli
🔗 Prerequisites
🔍 Detection Hints
Single database write bottleneck after exhausting indexing caching replicas and vertical scaling
Auto-detectable:
✗ No
vitess
planetscale
⚠ Related Problems
🤖 AI Agent
Confidence: Low
False Positives: Medium
✗ Manual fix
Fix: High
Context: File
Tests: Update