Consistent Hashing
debt(d7/e5/b7/t5)
Closest to 'only careful code review or runtime testing' (d7). Consistent hashing misconfiguration (too few vnodes, poor hash function, ring wrap-around bugs) won't be caught by compilers or standard linters. Issues manifest as uneven load distribution or hotspots that only appear under production traffic patterns or through careful performance testing and metrics analysis.
Closest to 'touches multiple files / significant refactor in one component' (e5). The quick_fix advises using an existing library like flexihash or Redis Cluster rather than rolling your own. Replacing a custom implementation with a battle-tested library requires updating the hashing logic, potentially changing how keys are distributed, and verifying cache behavior—a significant refactor within the caching/distribution layer but not necessarily cross-cutting.
Closest to 'strong gravitational pull' (b7). Consistent hashing is a foundational architectural choice for distributed caching and sharding. Once adopted, the ring structure, vnode configuration, and hash function choice influence how data is partitioned across the entire system. Adding/removing nodes, debugging hotspots, and capacity planning all revolve around this choice. It applies to web and CLI contexts and shapes distributed system design significantly.
Closest to 'notable trap' (t5). The misconception states that consistent hashing guarantees perfectly even load distribution, when in reality basic implementations produce hotspots with small server counts. This is a documented gotcha that developers eventually learn—vnodes (100-200 per server) are required for balanced distribution, which isn't obvious from the concept's name or a first reading.
Also Known As
Hash ring, ring hashing, consistent hash ring.
TL;DR
Consistent hashing places keys and servers on the same circular hash space so that adding or removing a server remaps only about 1/N of the keys, instead of nearly all of them as with naive modulo sharding.
Explanation
In a naive hash-based distribution, mapping a key to a server uses key % N, where N is the server count. Adding or removing a server changes N, invalidating almost every mapping and causing a cache stampede or mass data migration. Consistent hashing places both servers and keys on a circular ring (hash space 0 to 2³² − 1). Each key maps to the first server clockwise from its position on the ring. When a server is added, it takes over only the keys between itself and the previous server on the ring — on average roughly 1/N of all keys. Virtual nodes (vnodes) assign multiple ring positions to each server, smoothing out load balance when server counts are small.
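A quick worked example shows why modulo sharding is so destructive when a fourth server joins three existing ones. Under key % N, a key keeps its server only when hash % 3 == hash % 4, which holds only when hash % 12 is 0, 1, or 2: about 25% of keys stay put and roughly 75% move. On the ring, the new server claims only the arc between itself and its predecessor, about a quarter of the key space, so roughly 75% of keys stay put.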
Common Misconception
That consistent hashing guarantees perfectly even load distribution out of the box. Basic implementations produce hotspots at small server counts; 100–200 virtual nodes per server are typically needed to balance the ring.
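A minimal standalone sketch makes the imbalance visible; the server names and key count here are arbitrary, and the exact split depends on where CRC32 happens to place each vnode:
<?php
// Measuring load distribution with 1 vnode vs 150 vnodes per server
function buildRing(array $servers, int $vnodes): array
{
    $ring = [];
    foreach ($servers as $server) {
        for ($i = 0; $i < $vnodes; $i++) {
            $ring[crc32("$server:$i")] = $server;
        }
    }
    ksort($ring); // sort positions so lookups can walk clockwise
    return $ring;
}

function lookupNode(array $ring, string $key): string
{
    $hash = crc32($key);
    foreach ($ring as $position => $server) {
        if ($hash <= $position) {
            return $server;
        }
    }
    return reset($ring); // wrap around to the first node
}

foreach ([1, 150] as $vnodes) {
    $ring = buildRing(['a', 'b', 'c'], $vnodes);
    $counts = ['a' => 0, 'b' => 0, 'c' => 0];
    for ($i = 0; $i < 100_000; $i++) {
        $counts[lookupNode($ring, "key:$i")]++;
    }
    // With 1 vnode the split is usually badly lopsided;
    // with 150 it lands near an even third per server.
    echo "$vnodes vnode(s): ", json_encode($counts), "\n";
}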
Why It Matters
Consistent hashing is a foundational choice for distributed caches and sharded stores: the ring structure, vnode count, and hash function determine how data is partitioned, how much moves when nodes join or leave, and where hotspots appear under production traffic. Misconfigurations surface only through metrics and load testing, not through compilers or linters.
Common Mistakes
- Using too few virtual nodes — with 3 physical servers and 1 vnode each, load distribution is wildly uneven; 100–200 vnodes per server is typical.
- Not accounting for ring wrap-around — the ring is circular; keys with hashes larger than any node's hash map to the first node on the ring.
- Choosing a poor hash function — CRC32 is fast but distributes unevenly around the ring; MD5 or MurmurHash3 give a more uniform spread.
- Implementing consistent hashing yourself for production use — prefer a battle-tested library (see the sketch after this list); Redis Cluster handles this automatically.
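A minimal sketch of the library route, assuming the flexihash package (composer require flexihash/flexihash). The addTarget/lookup API and constructor defaults shown here are from memory, so verify against the library's README:
<?php
// ✅ Delegating the ring to a maintained library instead of hand-rolling it
require 'vendor/autoload.php';

use Flexihash\Flexihash;

$hash = new Flexihash(); // optional constructor args swap the hasher and replica (vnode) count

$hash->addTarget('cache-1');
$hash->addTarget('cache-2');
$hash->addTarget('cache-3');

echo $hash->lookup('user:42'), "\n";      // server responsible for this key
print_r($hash->lookupList('user:42', 2)); // primary plus a fallback target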
Code Examples
<?php
// ❌ Naive modulo sharding — adding a server remaps almost everything
function getServer(string $key, array $servers): string
{
    // If servers goes from 3 to 4: ~75% of keys change servers.
    // Result: cache miss storm, data rebalancing needed.
    return $servers[crc32($key) % count($servers)];
}
<?php
// ✅ Consistent hash ring — only ~1/N keys move when a server is added
class ConsistentHashRing
{
    /** @var array<int, string> ring position => node name */
    private array $ring = [];

    /** @var int[] ring positions, sorted ascending for lookup */
    private array $sortedKeys = [];

    public function addNode(string $node, int $vnodes = 150): void
    {
        // Each virtual node gives the server one more position on the ring,
        // smoothing out load distribution at small server counts.
        for ($i = 0; $i < $vnodes; $i++) {
            $hash = crc32($node . ':' . $i);
            $this->ring[$hash] = $node;
        }
        ksort($this->ring);
        $this->sortedKeys = array_keys($this->ring);
    }

    public function getNode(string $key): string
    {
        $hash = crc32($key);
        // Walk clockwise: the first ring position at or past the key's hash
        // owns the key. A binary search would make this O(log n).
        foreach ($this->sortedKeys as $ringKey) {
            if ($hash <= $ringKey) {
                return $this->ring[$ringKey];
            }
        }
        // Past the largest position: wrap around to the first node on the ring.
        return $this->ring[$this->sortedKeys[0]];
    }
}
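A quick way to check the ~1/N claim is to count how many keys actually move when a node joins. This harness is a sketch reusing the ConsistentHashRing class above; the node names and key count are arbitrary:
<?php
// Measure key movement when growing from 3 to 4 nodes
$before = new ConsistentHashRing();
foreach (['cache-1', 'cache-2', 'cache-3'] as $node) {
    $before->addNode($node);
}

$after = new ConsistentHashRing();
foreach (['cache-1', 'cache-2', 'cache-3', 'cache-4'] as $node) {
    $after->addNode($node);
}

$moved = 0;
$total = 100_000;
for ($i = 0; $i < $total; $i++) {
    $key = "user:$i";
    if ($before->getNode($key) !== $after->getNode($key)) {
        $moved++;
    }
}
// Expect roughly 25% moved (the new node's share),
// versus ~75% under naive modulo sharding.
printf("%.1f%% of keys moved\n", 100 * $moved / $total);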