← Home ← Codex ← DEBT
Browse by Category
+ added · updated 7d
← Back to glossary

Disjoint Set / Union-Find

Data Structures Advanced
debt(d7/e3/b3/t5)
d7 Detectability Operational debt — how invisible misuse is to your safety net

Closest to 'only careful code review or runtime testing' (d7). The term's detection_hints indicate automated detection is 'no', and the code pattern describes recognizing inefficient alternatives (DFS for cycle detection, repeated BFS for components) which requires manual review or performance profiling to identify. No automated tools are specified for detecting missing union-find opportunities.

e3 Effort Remediation debt — work required to fix once spotted

Closest to 'simple parameterised fix' (e3). The quick_fix indicates swapping to union-find when detecting components/cycles. Implementing path compression and union by rank is a localized change within the algorithm, not a cross-cutting refactor. Common mistakes like missing path compression are fixable by adding a few lines to the find() method.

b3 Burden Structural debt — long-term weight of choosing wrong

Closest to 'localised tax' (b3). Union-Find is a self-contained data structure used in specific algorithmic contexts (Kruskal's MST, cycle detection). It applies to web/cli contexts but doesn't impose system-wide architectural weight. The choice is confined to the algorithm implementation and doesn't propagate throughout the codebase.

t5 Trap Cognitive debt — how counter-intuitive correct behaviour is

Closest to 'notable trap' (t5). The misconception explicitly states developers assume O(n) per operation, not realizing path compression and union by rank yield amortised O(α(n)). Common mistakes include not implementing these optimizations and misusing for directed graphs. These are documented gotchas that developers eventually learn but aren't immediately obvious.

About DEBT scoring →

Also Known As

Union-Find disjoint set union DSU

TL;DR

A data structure tracking which elements belong to the same group — supporting near-O(1) union and find operations. Used for network connectivity, Kruskal's MST, and cycle detection.

Explanation

Disjoint Set (Union-Find) maintains a forest of trees where each tree represents a set. find(x) returns the root of x's tree (the set representative). union(x, y) merges the two sets. Optimisations: path compression (flatten tree on find), union by rank (always attach smaller tree to larger). Combined: near-O(1) amortised — technically O(α(n)) where α is the inverse Ackermann function, practically constant. Applications: Kruskal's Minimum Spanning Tree, network connectivity (are A and B in the same network?), image segmentation, and cycle detection in undirected graphs.

Common Misconception

Disjoint Set requires O(n) per operation — without optimisations yes; with path compression and union by rank, amortised O(α(n)) which is effectively O(1) for all practical inputs.

Why It Matters

Kruskal's MST algorithm using Union-Find runs in O(E log E) instead of O(E * V) — the difference between seconds and hours for large graphs.

Common Mistakes

  • Not implementing path compression — find() is O(n) without it.
  • Not using union by rank — trees become linked lists without it.
  • Using for directed graphs — Union-Find tracks connected components in undirected graphs only.
  • Forgetting that find() modifies the structure (path compression) — not purely read-only.

Code Examples

✗ Vulnerable
// No optimisations — O(n) find:
class NaiveUnionFind {
    private array $parent;
    public function find(int $x): int {
        while ($this->parent[$x] !== $x) {
            $x = $this->parent[$x]; // Walk up — O(n) in worst case
        }
        return $x;
    }
}
✓ Fixed
// Path compression + union by rank — O(alpha n) amortised:
class UnionFind {
    private array $parent;
    private array $rank;

    public function __construct(int $n) {
        $this->parent = range(0, $n - 1);
        $this->rank   = array_fill(0, $n, 0);
    }

    public function find(int $x): int {
        if ($this->parent[$x] !== $x)
            $this->parent[$x] = $this->find($this->parent[$x]); // Path compression
        return $this->parent[$x];
    }

    public function union(int $x, int $y): void {
        $rx = $this->find($x); $ry = $this->find($y);
        if ($rx === $ry) return;
        if ($this->rank[$rx] < $this->rank[$ry]) [$rx, $ry] = [$ry, $rx];
        $this->parent[$ry] = $rx;
        if ($this->rank[$rx] === $this->rank[$ry]) $this->rank[$rx]++;
    }
}

Added 16 Mar 2026
Edited 22 Mar 2026
Views 46
Rate this term
No ratings yet
🤖 AI Guestbook educational data only
| |
Last 30 days
0 pings T 1 ping W 1 ping T 0 pings F 0 pings S 1 ping S 0 pings M 0 pings T 0 pings W 0 pings T 1 ping F 1 ping S 1 ping S 1 ping M 0 pings T 0 pings W 0 pings T 1 ping F 1 ping S 0 pings S 0 pings M 0 pings T 0 pings W 0 pings T 0 pings F 1 ping S 0 pings S 1 ping M 0 pings T 0 pings W
No pings yet today
No pings yesterday
Amazonbot 9 Perplexity 6 Ahrefs 4 SEMrush 3 Scrapy 3 Google 2 ChatGPT 2 Unknown AI 1 Claude 1 Meta AI 1 Majestic 1 Sogou 1
crawler 32 crawler_json 2
DEV INTEL Tools & Severity
🟢 Low ⚙ Fix effort: Medium
⚡ Quick Fix
Use a disjoint set (union-find) when you need to efficiently determine if two elements are in the same connected component — used in Kruskal's MST and cycle detection
📦 Applies To
any web cli
🔗 Prerequisites
🔍 Detection Hints
Cycle detection in graph using DFS where union-find would be more efficient; connected component check with repeated BFS traversals
Auto-detectable: ✗ No
⚠ Related Problems
🤖 AI Agent
Confidence: Low False Positives: Medium ✗ Manual fix Fix: Medium Context: Class Tests: Update


✓ schema.org compliant