Data Mesh Architecture
debt(d9/e9/b9/t7)
Closest to 'silent in production until users hit it' (d9). The term's detection_hints explicitly state automated=no. The code_pattern describes symptoms (central team bottleneck, shared DB access patterns) that manifest as organizational pain over months or years, not as tooling alerts. No static analysis can detect 'you're doing Data Mesh wrong' — it only surfaces when analytics requests pile up or compliance audits fail.
Closest to 'architectural rework' (e9). While quick_fix suggests starting with data contracts, the full remediation requires organizational restructuring: shifting ownership from central data teams to domain teams, building self-serve platforms, establishing governance. The common_mistakes show that technology changes alone provide no benefit — the fix is fundamentally about rewriting organizational structure and data architecture simultaneously.
Closest to 'defines the system's shape' (b9). Data Mesh applies across web and cli contexts and is tagged as architecture/organisational. Once adopted, it determines how every team interacts with data, who owns what, and how analytics are delivered. The common_mistakes around 'no global governance' and 'no self-serve platform' show the all-encompassing nature — every future data decision is shaped by this paradigm choice.
Closest to 'serious trap' (t7). The misconception field explicitly states developers believe 'Data Mesh is a technology' when it's actually an organisational paradigm. This contradicts how similar architectural patterns (microservices, event sourcing) work — those are primarily technical choices. A competent developer expecting to implement Data Mesh by deploying tools will miss the fundamental organizational change required, as evidenced by the first common_mistake about technology changes without domain ownership.
Also Known As
TL;DR
Explanation
Data Mesh (Zhamak Dehghani, 2019) has four principles: (1) Domain ownership — the team producing data owns its quality and accessibility. (2) Data as a product — domains publish with SLOs, documentation, and discovery. (3) Self-serve infrastructure — a platform team provides tooling so domain teams can publish and consume data without central data engineering. (4) Federated computational governance — global policies (privacy, security, format standards) applied consistently. Contrast with centralised data lake: one engineering team becomes the bottleneck for every analytics use case.
Common Misconception
Why It Matters
Common Mistakes
- Technology changes without domain ownership — no organisational change means no Data Mesh benefit
- No global governance — federated domains without common policies create compliance chaos
- No self-serve platform — domain teams cannot own data if they must build all infrastructure themselves
- Trying to centralise all data first then redistribute — defeats domain ownership
Code Examples
// Centralised data lake — bottleneck:
// Marketing needs order data for cohort analysis
// → Ticket to central data engineering team
// → 6-week backlog
// → Data engineer unfamiliar with order domain
// → Schema misunderstood → wrong analysis for 3 more weeks
// Total: 9 weeks for one analysis
// Data Mesh — domain ownership:
// Orders team publishes: orders-domain-data-product
// SLO: updated hourly, 99.9% availability, schema versioned
// Listed in company data catalogue
// Marketing team:
// 1. Discovers via data catalogue
// 2. Subscribes via self-serve platform
// 3. Queries directly: 1 day to first analysis
// Orders team: owns data quality — no central bottleneck