Consistent Hashing
debt(d7/e5/b7/t5)
Closest to 'only careful code review or runtime testing' (d7). Consistent hashing misconfiguration (too few vnodes, poor hash function, ring wrap-around bugs) won't be caught by compilers or standard linters. Issues manifest as uneven load distribution or hotspots that only appear under production traffic patterns or through careful performance testing and metrics analysis.
Closest to 'touches multiple files / significant refactor in one component' (e5). The quick_fix advises using an existing library like flexihash or Redis Cluster rather than rolling your own. Replacing a custom implementation with a battle-tested library requires updating the hashing logic, potentially changing how keys are distributed, and verifying cache behavior—a significant refactor within the caching/distribution layer but not necessarily cross-cutting.
Closest to 'strong gravitational pull' (b7). Consistent hashing is a foundational architectural choice for distributed caching and sharding. Once adopted, the ring structure, vnode configuration, and hash function choice influence how data is partitioned across the entire system. Adding/removing nodes, debugging hotspots, and capacity planning all revolve around this choice. It applies to web and CLI contexts and shapes distributed system design significantly.
Closest to 'notable trap' (t5). The misconception states that consistent hashing guarantees perfectly even load distribution, when in reality basic implementations produce hotspots with small server counts. This is a documented gotcha that developers eventually learn—vnodes (100-200 per server) are required for balanced distribution, which isn't obvious from the concept's name or a first reading.
Also Known As
Hash ring, ring hashing, consistent hash ring.
TL;DR
Consistent hashing places keys and servers on the same circular hash space so that adding or removing a server remaps only about 1/N of the keys, instead of nearly all of them as with naive modulo sharding.
Explanation
In a naive hash-based distribution, mapping a key to a server uses key % N, where N is the server count. Adding or removing a server changes N, invalidating almost every mapping and causing a cache stampede or mass data migration. Consistent hashing places both servers and keys on a circular ring (hash space 0 to 2³² − 1). Each key maps to the first server clockwise from its position on the ring. When a server is added, it takes over only the keys between itself and the previous server on the ring — on average roughly 1/N of all keys. Virtual nodes (vnodes) assign multiple ring positions to each server, smoothing out load balance when server counts are small.
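A quick worked example shows why modulo sharding is so destructive when a fourth server joins three existing ones. Under key % N, a key keeps its server only when hash % 3 == hash % 4, which holds only when hash % 12 is 0, 1, or 2: about 25% of keys stay put and roughly 75% move. On the ring, the new server claims only the arc between itself and its predecessor, about a quarter of the key space, so roughly 75% of keys stay put.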
Common Misconception
That consistent hashing guarantees perfectly even load distribution out of the box. Basic implementations produce hotspots at small server counts; 100–200 virtual nodes per server are typically needed to balance the ring.
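A minimal standalone sketch makes the imbalance visible; the server names and key count here are arbitrary, and the exact split depends on where CRC32 happens to place each vnode:
<?php
// Measuring load distribution with 1 vnode vs 150 vnodes per server
function buildRing(array $servers, int $vnodes): array
{
    $ring = [];
    foreach ($servers as $server) {
        for ($i = 0; $i < $vnodes; $i++) {
            $ring[crc32("$server:$i")] = $server;
        }
    }
    ksort($ring); // sort positions so lookups can walk clockwise
    return $ring;
}

function lookupNode(array $ring, string $key): string
{
    $hash = crc32($key);
    foreach ($ring as $position => $server) {
        if ($hash <= $position) {
            return $server;
        }
    }
    return reset($ring); // wrap around to the first node
}

foreach ([1, 150] as $vnodes) {
    $ring = buildRing(['a', 'b', 'c'], $vnodes);
    $counts = ['a' => 0, 'b' => 0, 'c' => 0];
    for ($i = 0; $i < 100_000; $i++) {
        $counts[lookupNode($ring, "key:$i")]++;
    }
    // With 1 vnode the split is usually badly lopsided;
    // with 150 it lands near an even third per server.
    echo "$vnodes vnode(s): ", json_encode($counts), "\n";
}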
Why It Matters
Consistent hashing is a foundational choice for distributed caches and sharded stores: the ring structure, vnode count, and hash function determine how data is partitioned, how much moves when nodes join or leave, and where hotspots appear under production traffic. Misconfigurations surface only through metrics and load testing, not through compilers or linters.
Common Mistakes
- Using too few virtual nodes — with 3 physical servers and 1 vnode each, load distribution is wildly uneven; 100–200 vnodes per server is typical.
- Not accounting for ring wrap-around — the ring is circular; keys with hashes larger than any node's hash map to the first node on the ring.
- Choosing a poor hash function — CRC32 is fast but distributes unevenly around the ring; MD5 or MurmurHash3 give a more uniform spread.
- Implementing consistent hashing yourself for production use — prefer a battle-tested library (see the sketch after this list); Redis Cluster handles this automatically.
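A minimal sketch of the library route, assuming the flexihash package (composer require flexihash/flexihash). The addTarget/lookup API and constructor defaults shown here are from memory, so verify against the library's README:
<?php
// ✅ Delegating the ring to a maintained library instead of hand-rolling it
require 'vendor/autoload.php';

use Flexihash\Flexihash;

$hash = new Flexihash(); // optional constructor args swap the hasher and replica (vnode) count

$hash->addTarget('cache-1');
$hash->addTarget('cache-2');
$hash->addTarget('cache-3');

echo $hash->lookup('user:42'), "\n";      // server responsible for this key
print_r($hash->lookupList('user:42', 2)); // primary plus a fallback target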
Code Examples
<?php
// ❌ Naive modulo sharding — adding a server remaps almost everything
function getServer(string $key, array $servers): string
{
    // If servers goes from 3 to 4: ~75% of keys change servers.
    // Result: cache miss storm, data rebalancing needed.
    return $servers[crc32($key) % count($servers)];
}
<?php
// ✅ Consistent hash ring — only ~1/N keys move when a server is added
class ConsistentHashRing
{
    /** @var array<int, string> ring position => node name */
    private array $ring = [];

    /** @var int[] ring positions, sorted ascending for lookup */
    private array $sortedKeys = [];

    public function addNode(string $node, int $vnodes = 150): void
    {
        // Each virtual node gives the server one more position on the ring,
        // smoothing out load distribution at small server counts.
        for ($i = 0; $i < $vnodes; $i++) {
            $hash = crc32($node . ':' . $i);
            $this->ring[$hash] = $node;
        }
        ksort($this->ring);
        $this->sortedKeys = array_keys($this->ring);
    }

    public function getNode(string $key): string
    {
        $hash = crc32($key);
        // Walk clockwise: the first ring position at or past the key's hash
        // owns the key. A binary search would make this O(log n).
        foreach ($this->sortedKeys as $ringKey) {
            if ($hash <= $ringKey) {
                return $this->ring[$ringKey];
            }
        }
        // Past the largest position: wrap around to the first node on the ring.
        return $this->ring[$this->sortedKeys[0]];
    }
}
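A quick way to check the ~1/N claim is to count how many keys actually move when a node joins. This harness is a sketch reusing the ConsistentHashRing class above; the node names and key count are arbitrary:
<?php
// Measure key movement when growing from 3 to 4 nodes
$before = new ConsistentHashRing();
foreach (['cache-1', 'cache-2', 'cache-3'] as $node) {
    $before->addNode($node);
}

$after = new ConsistentHashRing();
foreach (['cache-1', 'cache-2', 'cache-3', 'cache-4'] as $node) {
    $after->addNode($node);
}

$moved = 0;
$total = 100_000;
for ($i = 0; $i < $total; $i++) {
    $key = "user:$i";
    if ($before->getNode($key) !== $after->getNode($key)) {
        $moved++;
    }
}
// Expect roughly 25% moved (the new node's share),
// versus ~75% under naive modulo sharding.
printf("%.1f%% of keys moved\n", 100 * $moved / $total);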