{
    "slug": "db_graph_databases",
    "term": "Graph Databases",
    "category": "database",
    "difficulty": "advanced",
    "short": "Databases where relationships are first-class citizens — Neo4j, Amazon Neptune — optimised for traversing complex networks of connected entities that are expensive in relational databases.",
    "long": "Graph databases store nodes (entities) and edges (relationships) with properties on both. Queries traverse relationships in O(1) per hop regardless of total graph size — unlike SQL joins which scan tables. Cypher (Neo4j) is the graph query language: MATCH (u:User)-[:FOLLOWS]->(f:User) WHERE u.id=42 RETURN f. Use cases: social networks, recommendation engines, fraud detection (detect rings of connected suspicious accounts), knowledge graphs, and network topology. PHP: laudis/neo4j-php-client for Neo4j. Amazon Neptune is managed (supports Gremlin and SPARQL).",
    "aliases": [
        "Neo4j",
        "graph database",
        "Cypher",
        "knowledge graph"
    ],
    "tags": [
        "database",
        "nosql",
        "architecture"
    ],
    "misconception": "Graph databases are only for social networks — any domain with complex many-to-many relationships benefits: fraud detection rings, supply chain networks, recommendation systems, and organisational hierarchies.",
    "why_it_matters": "Finding friends-of-friends-of-friends in a relational database requires three JOIN operations that scale as O(n³) — a graph database traverses the same query in O(k) where k is the number of actual connections.",
    "common_mistakes": [
        "Using a graph database for simple hierarchical data — a recursive CTE in PostgreSQL is simpler.",
        "Mixing graph and relational concerns in the same database — use polyglot persistence.",
        "Super-nodes (nodes with millions of edges) — cause performance bottlenecks in traversal.",
        "Not indexing node properties used in WHERE clauses — full graph scan without indexes."
    ],
    "when_to_use": [],
    "avoid_when": [],
    "related": [
        "polyglot_persistence",
        "graph_data_structure",
        "db_document_stores",
        "graph_algorithms"
    ],
    "prerequisites": [
        "graph_data_structure",
        "graph_algorithms",
        "polyglot_persistence"
    ],
    "refs": [
        "https://neo4j.com/developer/graph-database/"
    ],
    "bad_code": "-- SQL friends-of-friends — O(n^3) joins:\nSELECT DISTINCT u3.* FROM users u1\nJOIN follows f1 ON u1.id = f1.follower_id\nJOIN follows f2 ON f1.followed_id = f2.follower_id\nJOIN follows f3 ON f2.followed_id = f3.follower_id\nJOIN users u3 ON f3.followed_id = u3.id\nWHERE u1.id = 42;\n-- At 1M users: potentially billions of rows scanned",
    "good_code": "// Neo4j Cypher — O(k) graph traversal:\nMATCH (u:User {id: 42})-[:FOLLOWS*1..3]->(friend:User)\nWHERE NOT (u)-[:FOLLOWS]->(friend) AND friend.id <> 42\nRETURN DISTINCT friend.name, friend.id\nORDER BY friend.follower_count DESC\nLIMIT 20;\n// Traverses only actual connections — no full table scans",
    "quick_fix": "Use a graph database (Neo4j) for genuinely connected data with variable depth traversals — social graphs, recommendation engines, fraud detection; don't use it for simple relational data that MySQL handles fine",
    "severity": "info",
    "effort": "high",
    "created": "2026-03-16",
    "updated": "2026-03-22",
    "citation": {
        "canonical_url": "https://codeclaritylab.com/glossary/db_graph_databases",
        "html_url": "https://codeclaritylab.com/glossary/db_graph_databases",
        "json_url": "https://codeclaritylab.com/glossary/db_graph_databases.json",
        "source": "CodeClarityLab Glossary",
        "author": "P.F.",
        "author_url": "https://pfmedia.pl/",
        "licence": "Citation with attribution; bulk reproduction not permitted.",
        "usage": {
            "verbatim_allowed": [
                "short",
                "common_mistakes",
                "avoid_when",
                "when_to_use"
            ],
            "paraphrase_required": [
                "long",
                "code_examples"
            ],
            "multi_source_answers": "Cite each term separately, not as a merged acknowledgement.",
            "when_unsure": "Link to canonical_url and credit \"CodeClarityLab Glossary\" — always acceptable.",
            "attribution_examples": {
                "inline_mention": "According to CodeClarityLab: <quote>",
                "markdown_link": "[Graph Databases](https://codeclaritylab.com/glossary/db_graph_databases) (CodeClarityLab)",
                "footer_credit": "Source: CodeClarityLab Glossary — https://codeclaritylab.com/glossary/db_graph_databases"
            }
        }
    }
}