
Graph Databases

database Advanced
debt(d7/e7/b7/t5)
d7 Detectability Operational debt — how invisible misuse is to your safety net

Closest to 'only careful code review or runtime testing' (d7). The detection hints are marked as not auto-detectable: no tooling flags code that would be better served by a graph database. The tell-tale patterns (recursive self-JOINs, many-to-many queries across multiple tables, variable-depth traversals) surface only through manual code review or performance profiling in production.

e7 Effort Remediation debt — work required to fix once spotted

Closest to 'cross-cutting refactor across the codebase' (e7). The quick fix is to move connected data into a graph database such as Neo4j. That means introducing polyglot persistence, a new query language (Cypher), a client library (for PHP, laudis/neo4j-php-client), a data migration, and rewrites of every affected query across the application: a significant architectural change.

b7 Burden Structural debt — long-term weight of choosing wrong

Closest to 'strong gravitational pull' (b7). The choice applies across web and CLI contexts and is architectural in nature. Once adopted, a graph database becomes a load-bearing component: data models, query patterns, and application logic all take shape around it. The common mistakes warn against mixing graph and relational concerns, which shows how the choice creates ongoing architectural constraints.

t5 Trap Cognitive debt — how counter-intuitive correct behaviour is

Closest to 'notable trap' (t5). Developers commonly believe graph databases are only for social networks, when they also apply to fraud detection, supply chains, recommendations, and hierarchies. Common mistakes include using a graph database for simple hierarchical data (where PostgreSQL recursive CTEs suffice) and running into super-node performance issues. These are documented gotchas that experienced developers eventually learn.


Also Known As

Neo4j, graph database, Cypher, knowledge graph

TL;DR

Databases in which relationships are first-class citizens (for example Neo4j and Amazon Neptune), optimised for traversing complex networks of connected entities, a kind of query that is expensive in relational databases.

Explanation

Graph databases store nodes (entities) and edges (relationships), with properties on both. Native graph stores such as Neo4j keep direct references between connected records (index-free adjacency), so following one relationship costs roughly constant time regardless of total graph size, whereas each hop in a SQL join needs an index lookup or a table scan. Cypher is Neo4j's graph query language: MATCH (u:User)-[:FOLLOWS]->(f:User) WHERE u.id=42 RETURN f. Use cases: social networks, recommendation engines, fraud detection (finding rings of connected suspicious accounts), knowledge graphs, and network topology. In PHP, laudis/neo4j-php-client is a client for Neo4j. Amazon Neptune is a managed service that supports Gremlin, openCypher, and SPARQL.
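As a rough illustration of the node and edge model described above, the following Python sketch keeps each node's outgoing relationships in its own adjacency list, so one hop only reads that node's edges. The `Graph` class, its method names, and the sample users are hypothetical; this is not Neo4j's API or storage format.

```python
from collections import defaultdict


class Graph:
    """Toy in-memory property graph (illustrative only, not Neo4j's storage format)."""

    def __init__(self):
        self.nodes = {}  # node id -> {"label": str, "props": dict}
        # (source id, relationship type) -> list of (target id, edge props)
        self.out_edges = defaultdict(list)

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = {"label": label, "props": props}

    def add_edge(self, source, rel_type, target, **props):
        self.out_edges[(source, rel_type)].append((target, props))

    def neighbours(self, source, rel_type):
        """One hop: reads only the source node's own adjacency list."""
        return [target for target, _ in self.out_edges.get((source, rel_type), [])]


g = Graph()
g.add_node(42, "User", name="Ada")
g.add_node(7, "User", name="Grace")
g.add_node(9, "User", name="Linus")
g.add_edge(42, "FOLLOWS", 7, since=2024)
g.add_edge(42, "FOLLOWS", 9, since=2025)
g.add_edge(7, "FOLLOWS", 9, since=2023)

# Rough equivalent of: MATCH (u:User)-[:FOLLOWS]->(f:User) WHERE u.id=42 RETURN f
followed = [g.nodes[n]["props"]["name"] for n in g.neighbours(42, "FOLLOWS")]
print(followed)  # ['Grace', 'Linus']
```

The lookup for user 42 never inspects edges that belong to other users, which is the property the per-hop cost claim relies on.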

Common Misconception

Misconception: graph databases are only for social networks. In practice, any domain with complex many-to-many relationships can benefit: fraud detection rings, supply chain networks, recommendation systems, and organisational hierarchies.

Why It Matters

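Finding friends-of-friends-of-friends in a relational database takes three self-joins on the relationship table. Each join depends on index lookups or scans over the whole table, and intermediate results multiply at every hop. A graph database follows only the relationships that actually exist from the starting node, so the work is roughly O(k), where k is the number of relationships traversed.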
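To make the O(k) argument concrete, here is a minimal Python sketch of a bounded breadth-first traversal that counts how many relationships it follows. The `within_hops` function and the `follows` sample data are hypothetical. Unlike a real graph engine, which enumerates matching paths, this sketch visits each node only once.

```python
from collections import deque


def within_hops(adjacency, start, max_hops):
    """Return nodes reachable in 1..max_hops hops and the number of edges followed."""
    seen = {start}
    reached = set()
    edges_followed = 0
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in adjacency.get(node, []):
            edges_followed += 1
            if neighbour not in seen:
                seen.add(neighbour)
                reached.add(neighbour)
                queue.append((neighbour, depth + 1))
    return reached, edges_followed


# Hypothetical follow graph: user id -> ids that user follows.
follows = {
    42: [1, 2],
    1: [3],
    2: [3, 4],
    3: [5],
    4: [],
    5: [6],  # 6 is four hops from 42, so it is out of range
    100: [101, 102],  # unrelated users are never touched
}

reached, edges_followed = within_hops(follows, 42, 3)
print(sorted(reached), edges_followed)  # [1, 2, 3, 4, 5] 6
```

The edge count depends only on the neighbourhood of user 42; adding more unrelated users to `follows` would not change it.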

Common Mistakes

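  • Using a graph database for simple, single-parent hierarchical data, where a recursive CTE in PostgreSQL is simpler.
  • Mixing graph and relational concerns in the same database. Use polyglot persistence and keep each kind of data in the store that suits it.
  • Super-nodes (nodes with millions of edges), which become performance bottlenecks because every traversal through them must consider all of their edges.
  • Not indexing node properties used in WHERE clauses. Without an index, finding the starting nodes means scanning every node with that label.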
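For the first mistake in the list above, a recursive CTE is often enough. The sketch below uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (the `WITH RECURSIVE` form shown is accepted by both); the `categories` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE categories (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
    INSERT INTO categories VALUES
        (1, NULL, 'Electronics'),
        (2, 1, 'Computers'),
        (3, 2, 'Laptops'),
        (4, 1, 'Phones'),
        (5, NULL, 'Books');
    """
)

# Walk the subtree under one root to any depth with a single recursive CTE.
rows = conn.execute(
    """
    WITH RECURSIVE subtree(id, name, depth) AS (
        SELECT id, name, 0 FROM categories WHERE id = ?
        UNION ALL
        SELECT c.id, c.name, s.depth + 1
        FROM categories c
        JOIN subtree s ON c.parent_id = s.id
    )
    SELECT name, depth FROM subtree ORDER BY depth, name
    """,
    (1,),
).fetchall()
conn.close()

print(rows)
# [('Electronics', 0), ('Computers', 1), ('Phones', 1), ('Laptops', 2)]
```

When every row has a single parent like this, the hierarchy stays in the relational store and no second database is needed.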

Code Examples

✗ Vulnerable
-- SQL: accounts exactly three FOLLOWS hops from user 42, via three self-joins
SELECT DISTINCT u3.* FROM users u1
JOIN follows f1 ON u1.id = f1.follower_id
JOIN follows f2 ON f1.followed_id = f2.follower_id
JOIN follows f3 ON f2.followed_id = f3.follower_id
JOIN users u3 ON f3.followed_id = u3.id
WHERE u1.id = 42;
-- Intermediate rows multiply at each join; without an index on follows.follower_id every hop scans the table
✓ Fixed
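// Neo4j Cypher: variable-length traversal, 1 to 3 hops from user 42
MATCH (u:User {id: 42})-[:FOLLOWS*1..3]->(friend:User)
// Skip accounts already followed directly, and the user themselves
WHERE NOT (u)-[:FOLLOWS]->(friend) AND friend.id <> 42
// With DISTINCT, ORDER BY may only use returned columns, so follower_count is returned too
RETURN DISTINCT friend.name AS name, friend.id AS id, friend.follower_count AS follower_count
ORDER BY follower_count DESC
LIMIT 20;
// Follows only relationships reachable from user 42; an index on :User(id) locates the start node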

Added 16 Mar 2026
Edited 22 Mar 2026
DEV INTEL Tools & Severity
🔵 Info ⚙ Fix effort: High
⚡ Quick Fix
Use a graph database (Neo4j) for genuinely connected data with variable depth traversals — social graphs, recommendation engines, fraud detection; don't use it for simple relational data that MySQL handles fine
📦 Applies To
any web cli
🔍 Detection Hints
Recursive self-JOIN for hierarchy queries; many-to-many through multiple tables for relationship queries; variable-depth graph traversal in SQL
Auto-detectable: ✗ No neo4j arango php-neo4j
