How typed relationships between terms in this glossary get curated. LLM-proposed, gate-filtered, human-reviewed. Every edge earns its place.
Glossaries usually express relationships as flat lists — "related terms" with
no labels on the links. This one doesn't. Every relationship between two terms here
is tagged with a verb from a small vocabulary of 18, so a
connection isn't just "these two go together" — it's X requires Y, or
X mitigates Y, or X is an alternative to Y.
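
A minimal sketch of what a typed, directed edge could look like in code. The `Verb` and `Edge` names, the field names, and the term ids are illustrative assumptions rather than the glossary's actual schema, and only the three verbs named above are listed out of the 18.

```python
from dataclasses import dataclass
from enum import Enum


class Verb(Enum):
    # Only the three verbs named in the text above; the full vocabulary has 18.
    REQUIRES = "requires"
    MITIGATES = "mitigates"
    ALTERNATIVE_TO = "is an alternative to"


@dataclass(frozen=True)
class Edge:
    source: str  # X
    verb: Verb
    target: str  # Y

    def sentence(self) -> str:
        # Render the edge as a readable "X verb Y" statement.
        return f"{self.source} {self.verb.value} {self.target}"


# Hypothetical term ids, for illustration only.
edge = Edge("term-x", Verb.REQUIRES, "term-y")
print(edge.sentence())
```

Because the edge is directed, swapping `source` and `target` produces a different, unequal edge, which is the property an untyped "related terms" list cannot express.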
That shifts what the glossary can do. The reader sees not only adjacent concepts but why they're adjacent. The structure becomes machine-readable, so downstream tools (Codex, future agents) can traverse it. And typed relationships fail loudly when wrong, where untyped lists fail silently — discipline forces clarity.
Five stages, from a Claude proposal to a public typed relationship on the term page.
The first stage takes a term's related[] entries
plus its first 10 alphabetical same-category siblings as the candidate set,
then asks Claude to propose typed relationships from the 18-verb vocabulary
with a direction-aware prompt.
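
The candidate-set step described above can be sketched as follows. This is a sketch under stated assumptions: the glossary shape (a dict of term id to `category` and `related`), the function name, and the choice to de-duplicate siblings already present in `related` are all hypothetical. The call to Claude itself is left out.

```python
def candidate_set(term, glossary, sibling_limit=10):
    """Return candidate term ids for relationship proposals.

    glossary: dict of term id -> {"category": str, "related": list[str]}
    (an assumed shape, not the real data model).
    """
    entry = glossary[term]
    # First N same-category siblings in alphabetical order, excluding the term.
    siblings = sorted(
        other
        for other, data in glossary.items()
        if other != term and data["category"] == entry["category"]
    )[:sibling_limit]

    # related[] entries first, then siblings; keep order, drop duplicates.
    seen = set()
    candidates = []
    for cand in list(entry["related"]) + siblings:
        if cand != term and cand not in seen:
            seen.add(cand)
            candidates.append(cand)
    return candidates


# Hypothetical four-term glossary for illustration.
demo_glossary = {
    "alpha": {"category": "c1", "related": ["zeta"]},
    "beta": {"category": "c1", "related": []},
    "gamma": {"category": "c1", "related": []},
    "zeta": {"category": "c2", "related": []},
}
print(candidate_set("alpha", demo_glossary))
```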
The pipeline refuses to auto-approve anything. Every edge that reaches the public side has been seen by a human. Every edge that doesn't reach the public side is logged so it won't be re-proposed. The rejected log is, in effect, a record of what's not a relationship — equally valuable, equally permanent.
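
One way the rejected log could keep an edge from being re-proposed is a simple membership check before anything reaches the reviewer. The sketch below assumes edges are stored as exact `(source, verb, target)` triples; the function name and that matching rule are assumptions, and the real log may match more loosely.

```python
def filter_new_proposals(proposals, approved, rejected):
    """Keep only proposals that no human has already decided on."""
    decided = set(approved) | set(rejected)
    return [p for p in proposals if p not in decided]


# Hypothetical triples for illustration.
proposals = [("a", "requires", "b"), ("a", "mitigates", "c")]
approved = set()
rejected = {("a", "mitigates", "c")}

print(filter_new_proposals(proposals, approved, rejected))
```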
The vocabulary is intentionally small. Seven verbs covered the original launch;
three more were added in May 2026 when rejection-pattern data showed clear gaps:
the LLM kept reaching for mitigates when the relationship was
really analytical (a metric flags a smell rather than preventing it), or for
contains when one term was the implementation of another. Those three
were implements, configures, and detects.
Two more followed later in May 2026, again driven by the rejected log rather than
convenience. Edges tagged [vocab-gap] clustered around two relationships
none of the existing verbs could state honestly: a metric quantifying a
property of its subject (an evaluation score, not a flag — distinct from
detects), and one term being an input feeding another's
computation (not a runtime dependency, not membership). Those gaps became
measures and feeds.
Through mid-2026 the same evidence-driven process added six more, each from a
recurring [vocab-gap] or [verb-too-strong] cluster the
existing verbs couldn't state honestly: specializes (a subtype /
is-a-kind-of, neither containment nor contract), enforces (an active
runtime or analyzer that checks or guarantees a rule), enables (a
permissive condition that makes an outcome possible without directly causing it),
realizes (a concrete mechanism providing a working version of an
abstract protocol or pattern), applies (a technique putting a non-code
principle or guideline into practice), and leverages (a feature built
on another construct as a working ingredient). Every other [verb-too-strong]
rejection turned out to be the LLM mis-picking an existing verb rather than
evidence of a missing one, so those cases were handled by tightening the prompt,
and each new verb had to earn its place.
The vocabulary now stands at 18. Any further addition is held to the same
standard: evidence of a recurring gap in the rejected log, not just
plausibility.
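
As a rough illustration of what "evidence of a recurring gap" might mean in practice, the sketch below tallies rejected edges by the verb the LLM proposed and keeps only clusters above a threshold. The entry fields (`proposed_verb`, `tags`), the function name, and the threshold of 3 are all assumptions; the source does not say how clusters are actually counted.

```python
from collections import Counter


def gap_clusters(rejected_log, tag="[vocab-gap]", min_count=3):
    """Count rejections carrying `tag`, grouped by the verb that was proposed."""
    counts = Counter(
        entry["proposed_verb"] for entry in rejected_log if tag in entry["tags"]
    )
    # Only recurring clusters count as evidence; one-offs are ignored.
    return {verb: n for verb, n in counts.items() if n >= min_count}


# Hypothetical log entries for illustration.
sample_log = [
    {"proposed_verb": "mitigates", "tags": ["[vocab-gap]"]},
    {"proposed_verb": "mitigates", "tags": ["[vocab-gap]"]},
    {"proposed_verb": "mitigates", "tags": ["[vocab-gap]"]},
    {"proposed_verb": "contains", "tags": ["[vocab-gap]"]},
    {"proposed_verb": "mitigates", "tags": ["[verb-too-strong]"]},
]
print(gap_clusters(sample_log))
```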
Most glossaries are read once and forgotten. This one is built to be read by humans and by other tools — the codebase, search agents, future Claude tooling. Typed edges are the substrate that lets a glossary become something more than a lookup table: a navigable knowledge graph where relationships are explicit, falsifiable, and inheritable.
The curation pipeline above is slow on purpose. Auto-approving LLM proposals would let the graph grow fast and accumulate quiet errors. Putting a human in the loop on every edge — and giving the human a second-opinion LLM to argue with — keeps the graph small but trustworthy. At small N, that matters more than scale.
As the graph grows, patterns emerge. Overused verbs flag underspecified concepts.
Underconnected categories flag gaps in coverage. Contradiction sweeps catch
mutually-incompatible edges. None of that is possible with a flat related[]
list — but all of it becomes routine once the relationships are typed.
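
Two of the checks mentioned above, verb usage counts and a contradiction sweep, can be sketched over a list of `(source, verb, target)` triples. The triple representation and the `INCOMPATIBLE` table are hypothetical: the source does not say which verb pairs count as mutually incompatible, so the single pair here is a placeholder. The sweep only compares edges that run in the same direction between the same two terms.

```python
from collections import Counter

# Hypothetical edges for illustration.
edges = [
    ("a", "requires", "b"),
    ("a", "is an alternative to", "b"),
    ("c", "requires", "b"),
]

# Placeholder rule: which verb pairs conflict is an assumption.
INCOMPATIBLE = {frozenset({"requires", "is an alternative to"})}


def verb_usage(edges):
    """How often each verb appears; a heavily used verb may hide vague concepts."""
    return Counter(verb for _, verb, _ in edges)


def contradictions(edges, incompatible=INCOMPATIBLE):
    """Find term pairs joined by two verbs listed as incompatible."""
    verbs_by_pair = {}
    for source, verb, target in edges:
        verbs_by_pair.setdefault((source, target), set()).add(verb)
    found = []
    for pair, verbs in verbs_by_pair.items():
        for combo in incompatible:
            if combo <= verbs:
                found.append((pair, tuple(sorted(combo))))
    return found


print(verb_usage(edges))
print(contradictions(edges))
```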