{
    "slug": "consistent_hashing",
    "term": "Consistent Hashing",
    "category": "architecture",
    "difficulty": "advanced",
    "short": "A hashing technique used in distributed systems where adding or removing a node rebalances only a fraction of keys rather than remapping everything — essential for distributed caches, load balancers, and sharded databases.",
    "long": "In a naive hash-based distribution, mapping a key to a server uses key % N where N is the server count. Adding or removing a server changes N, invalidating almost every mapping and causing a cache stampede or data migration. Consistent hashing places both servers and keys on a circular ring (hash space 0 to 2³²). Each key maps to the first server clockwise on the ring. When a server is added, it takes over only the keys between itself and the previous server on the ring — on average 1/N of all keys. Virtual nodes (vnodes) assign multiple ring positions to each server, improving load balance when server counts are small.",
    "aliases": [
        "consistent hash",
        "hash ring",
        "distributed hashing",
        "ketama"
    ],
    "tags": [
        "distributed-systems",
        "caching",
        "sharding",
        "load-balancing"
    ],
    "misconception": "Consistent hashing guarantees perfectly even load distribution. Basic consistent hashing can produce hotspots when server counts are small. Virtual nodes (vnodes) — assigning each physical server multiple positions on the ring — are required for balanced distribution in practice.",
    "why_it_matters": "Without consistent hashing, scaling a distributed cache means re-mapping every key and accepting a complete cache miss storm during the transition. With consistent hashing, adding a cache node only invalidates the keys it takes over — roughly 1/N of the total. Redis Cluster, Cassandra, Memcached (ketama), and many CDNs use consistent hashing as a fundamental building block.",
    "common_mistakes": [
        "Using too few virtual nodes — with 3 physical servers and 1 vnode each, load distribution is wildly uneven; 100–200 vnodes per server is typical.",
        "Not accounting for ring wrap-around — the ring is circular; keys with hashes larger than any node's hash map to the first node on the ring.",
        "Choosing a poor hash function — CRC32 is fast but has collisions; MD5 or MurmurHash3 provide better distribution for the ring.",
        "Implementing consistent hashing yourself for production use — use a battle-tested library; Redis Cluster handles this automatically."
    ],
    "when_to_use": [],
    "avoid_when": [],
    "related": [
        "database_sharding",
        "caching_strategies",
        "redis",
        "cap_theorem",
        "load_balancing"
    ],
    "prerequisites": [],
    "refs": [
        "https://en.wikipedia.org/wiki/Consistent_hashing",
        "https://redis.io/docs/reference/cluster-spec/"
    ],
    "bad_code": "<?php\n// ❌ Naive modulo sharding — adding a server remaps almost everything\nfunction getServer(string $key, array $servers): string\n{\n    return $servers[crc32($key) % count($servers)];\n    // If servers goes from 3 to 4: ~75% of keys change servers\n    // Result: cache miss storm, data rebalancing needed\n}",
    "good_code": "<?php\n// ✅ Consistent hash ring — only ~1/N keys move when a server is added\nclass ConsistentHashRing\n{\n    private array $ring = [];\n    private array $sortedKeys = [];\n\n    public function addNode(string $node, int $vnodes = 150): void\n    {\n        for ($i = 0; $i < $vnodes; $i++) {\n            $hash = crc32($node . ':' . $i);\n            $this->ring[$hash] = $node;\n        }\n        ksort($this->ring);\n        $this->sortedKeys = array_keys($this->ring);\n    }\n\n    public function getNode(string $key): string\n    {\n        $hash = crc32($key);\n        foreach ($this->sortedKeys as $ringKey) {\n            if ($hash <= $ringKey) return $this->ring[$ringKey];\n        }\n        return $this->ring[$this->sortedKeys[0]]; // Wrap around\n    }\n}",
    "quick_fix": "Use an existing consistent hashing library rather than implementing the ring yourself — subtle bugs in ring arithmetic cause hard-to-diagnose hotspots. In PHP, use a Redis Cluster or a library like flexihash.",
    "effort": "high",
    "created": "2026-03-23",
    "updated": "2026-03-23",
    "citation": {
        "canonical_url": "https://codeclaritylab.com/glossary/consistent_hashing",
        "html_url": "https://codeclaritylab.com/glossary/consistent_hashing",
        "json_url": "https://codeclaritylab.com/glossary/consistent_hashing.json",
        "source": "CodeClarityLab Glossary",
        "author": "P.F.",
        "author_url": "https://pfmedia.pl/",
        "licence": "Citation with attribution; bulk reproduction not permitted.",
        "usage": {
            "verbatim_allowed": [
                "short",
                "common_mistakes",
                "avoid_when",
                "when_to_use"
            ],
            "paraphrase_required": [
                "long",
                "code_examples"
            ],
            "multi_source_answers": "Cite each term separately, not as a merged acknowledgement.",
            "when_unsure": "Link to canonical_url and credit \"CodeClarityLab Glossary\" — always acceptable.",
            "attribution_examples": {
                "inline_mention": "According to CodeClarityLab: <quote>",
                "markdown_link": "[Consistent Hashing](https://codeclaritylab.com/glossary/consistent_hashing) (CodeClarityLab)",
                "footer_credit": "Source: CodeClarityLab Glossary — https://codeclaritylab.com/glossary/consistent_hashing"
            }
        }
    }
}