Topological Sort
debt(d9/e5/b5/t7)
Closest to 'silent in production until users hit it' (d9). The detection_hints field explicitly states automated: no, and the code_pattern describes nested loops and repeated array_diff/in_array — patterns that produce no compiler or linter warnings. A cycle in the dependency graph silently returns a partial ordering, and this only manifests when users encounter broken dependency resolution at runtime. No tools from detection_hints.tools (list is empty) can catch this automatically.
Closest to 'touches multiple files / significant refactor in one component' (e5). The quick_fix says to use Kahn's algorithm with an in-degree map and queue, but this isn't a one-line swap — it requires replacing ad-hoc nested loop logic, adding cycle detection, and ensuring callers handle the error/exception when a cycle is found. This likely touches the dependency resolver component and its consumers, making it a meaningful refactor within one component or across a few related ones.
Closest to 'persistent productivity tax' (b5). The applies_to covers both web and cli contexts broadly. Any system doing dependency resolution, build ordering, or migration ordering will have this algorithm at its core. A naive or incorrect implementation (missing cycle detection, non-linear complexity) imposes an ongoing tax: tests can't rely on output order, edge cases around cycles require workarounds, and performance degrades on large graphs — slowing down multiple work streams without fully defining the system's architecture.
Closest to 'serious trap (contradicts how a similar concept works elsewhere)' (t7). The misconception field states that developers assume topological sort works on any directed graph, when it only works on DAGs. This contradicts the intuition from general sorting (which always produces a result) and from directed-graph traversal (which succeeds on cyclic graphs). Additionally, common_mistakes highlight that silent partial ordering on cycles, assuming unique output, and using it on undirected graphs are all serious pitfalls — multiple traps that competent developers regularly fall into.
Also Known As
TL;DR
Explanation
Topological sort is only possible on directed acyclic graphs (DAGs) — any cycle makes a valid ordering impossible. Two common algorithms: Kahn's algorithm uses in-degree counts and a queue (BFS-based), and DFS-based sort uses recursive post-order traversal. Both run in O(V + E). The result is not unique — many valid orderings may exist. Real-world applications are everywhere: package/dependency managers (Composer, npm) use it to determine installation order, build systems (Make, Gradle) use it for task scheduling, course prerequisite systems, and database migration runners rely on it to apply schema changes in the right order.
Diagram
flowchart LR
subgraph DAG Input
A --> B
A --> C
B --> D
C --> D
D --> E
end
subgraph Kahn Step-by-Step
K1[Start: A has in-degree 0] --> K2[Process A<br/>reduce B and C]
K2 --> K3[Process B and C<br/>reduce D]
K3 --> K4[Process D<br/>reduce E]
K4 --> K5[Process E<br/>done]
end
RESULT[Order: A - B - C - D - E]
style RESULT fill:#238636,color:#fff
style K1 fill:#1f6feb,color:#fff
Common Misconception
Why It Matters
Common Mistakes
- Not detecting cycles — silently returning a partial ordering when a cycle exists instead of throwing an error.
- Using topological sort on an undirected graph — the concept only applies to directed edges.
- Assuming the output order is unique — many valid orderings exist; tests should not depend on a specific one.
- Reinventing it manually with nested loops instead of using Kahn's or DFS — O(V²) instead of O(V+E).
Code Examples
// Naive: repeatedly scan for nodes with no unresolved deps — O(V²):
function topoNaive(array $deps): array {
$resolved = [];
while (count($resolved) < count($deps)) {
$progress = false;
foreach ($deps as $node => $nodeDeps) {
if (!in_array($node, $resolved) &&
empty(array_diff($nodeDeps, $resolved))) {
$resolved[] = $node;
$progress = true;
}
}
if (!$progress) throw new RuntimeException('Cycle detected');
}
return $resolved;
}
// Kahn's algorithm — O(V+E) with cycle detection:
function topologicalSort(array $graph): array {
$inDegree = array_fill_keys(array_keys($graph), 0);
foreach ($graph as $neighbours) {
foreach ($neighbours as $v) $inDegree[$v]++;
}
$queue = array_keys(array_filter($inDegree, fn($d) => $d === 0));
$order = [];
while ($queue) {
$u = array_shift($queue);
$order[] = $u;
foreach ($graph[$u] as $v) {
if (--$inDegree[$v] === 0) $queue[] = $v;
}
}
if (count($order) !== count($graph)) {
throw new RuntimeException('Graph has a cycle — topological sort impossible');
}
return $order;
}
// Usage: ['a' => ['b','c'], 'b' => ['d'], 'c' => ['d'], 'd' => []]
// Returns: ['a', 'b', 'c', 'd'] or ['a', 'c', 'b', 'd']