
Change Data Capture (CDC)

database Advanced
debt(d7/e7/b7/t5)
d7 Detectability Operational debt — how invisible misuse is to your safety net

Closest to 'only careful code review or runtime testing' (d7). The detection_hints.tools list is empty, so no automated tooling is specified. Misconfigurations (wrong WAL level, missing binlog format, consumer lag) are silent until runtime — the application continues operating normally while downstream systems silently receive stale or missing data. This is not caught by compilers, linters, or standard SAST tools, and typically surfaces only through monitoring or user-reported inconsistencies.

e7 Effort Remediation debt — work required to fix once spotted

Closest to 'cross-cutting refactor across the codebase' (e7). The quick_fix suggests falling back to the outbox pattern for simpler cases, but that itself requires schema changes and transaction-level coordination. Full CDC infrastructure (Debezium, Kafka, WAL configuration, schema evolution handling) spans multiple systems — database config, broker setup, consumer services, and schema migration coordination. Remediating common mistakes like consumer lag or schema breakage requires coordinated changes across multiple components and teams.

b7 Burden Structural debt — long-term weight of choosing wrong

Closest to 'strong gravitational pull' (b7). CDC is an architectural pattern that, once adopted, shapes how all downstream data flows are designed — every schema migration must coordinate with CDC consumers, every new service must decide whether to consume the CDC stream, and operational concerns (consumer lag, replication slots, log retention) become persistent productivity taxes across the whole system. The tags (kafka, streaming, event-driven, replication) confirm cross-cutting architectural reach.

t5 Trap Cognitive debt — how counter-intuitive correct behaviour is

Closest to 'notable trap (a documented gotcha most devs eventually learn)' (t5). The canonical misconception is that CDC requires application or schema changes — in fact it reads the transaction log transparently. While this is a genuine surprise, it is a positive one (less invasive than expected). The more dangerous traps are the common_mistakes: needing specific WAL/binlog configuration before deploying, schema changes breaking consumers, and consumer lag causing silent staleness — these are well-documented but non-obvious, placing this at t5.


Also Known As

CDC Debezium database streaming binlog outbox pattern alternative

TL;DR

A pattern for tracking and streaming every insert, update, and delete from a database by reading its internal transaction log rather than polling tables. It enables real-time, event-driven integrations with little added load on the database.

Explanation

CDC reads the database's transaction log (the write-ahead log, or WAL, in PostgreSQL; the binlog in MySQL) to capture every committed change as a structured event, without adding triggers or polling queries. Tools like Debezium consume these logs and publish change events to Kafka or other message queues. Downstream consumers (search indexes, caches, analytics, microservices) receive changes in near real-time. CDC solves the 'dual write' problem: instead of writing to both a database and a message queue, which is hard to make atomic, you write only to the database and let CDC propagate the change. Because every committed change appears in the log, downstream systems eventually converge on the database state, provided consumers tolerate duplicate events (delivery is typically at-least-once).
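To make the event flow concrete, here is a minimal sketch of a downstream consumer applying change events to a local replica. It assumes the payload of Debezium's default event envelope (an `op` code plus `before` and `after` row images), already JSON-decoded. The `applyChangeEvent` function, the `id` primary key, and the in-memory array standing in for a cache or search index are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Apply one change event to a local replica keyed by primary key.
 * Assumes the Debezium-style fields `op`, `before`, and `after`;
 * the `id` key column is a hypothetical example.
 */
function applyChangeEvent(array $replica, array $event): array
{
    switch ($event['op'] ?? '') {
        case 'c': // insert
        case 'u': // update
        case 'r': // row read during the initial snapshot
            $row = $event['after'];
            $replica[$row['id']] = $row; // upsert, so a redelivered event is harmless
            break;
        case 'd': // delete: `after` is null, the key comes from `before`
            unset($replica[$event['before']['id']]);
            break;
        // Other op codes (for example truncate) are ignored in this sketch.
    }
    return $replica;
}

$events = [
    ['op' => 'c', 'before' => null, 'after' => ['id' => 1, 'name' => 'Lamp', 'price' => 20]],
    ['op' => 'u', 'before' => ['id' => 1, 'name' => 'Lamp', 'price' => 20],
                  'after'  => ['id' => 1, 'name' => 'Lamp', 'price' => 25]],
    ['op' => 'c', 'before' => null, 'after' => ['id' => 2, 'name' => 'Desk', 'price' => 90]],
    ['op' => 'd', 'before' => ['id' => 2, 'name' => 'Desk', 'price' => 90], 'after' => null],
];

$replica = [];
foreach ($events as $event) {
    $replica = applyChangeEvent($replica, $event);
}
```

Inserts and updates are applied as upserts and deletes simply unset the key, so replaying an event leaves the replica unchanged. That property is useful when the pipeline redelivers events after a restart.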

Common Misconception

CDC requires changes to the application code or database schema. It does not: CDC reads the transaction log, which the database writes regardless. Existing applications see no change, although the database server itself may need configuration, such as the log settings listed under Common Mistakes.

Why It Matters

CDC enables architectural patterns that polling cannot: real-time cache invalidation (when a product changes in MySQL, invalidate Redis immediately), search index updates without database triggers, audit logs without application-level logging, and cross-service event propagation without dual writes. The outbox pattern is an alternative when full CDC infrastructure is too heavy.

Common Mistakes

  • Polling for changes with 'SELECT * FROM products WHERE updated_at > :last_check': misses hard deletes, requires an index on updated_at, and has a race window in which concurrent updates can be skipped.
  • Not configuring the transaction log for CDC: PostgreSQL requires wal_level=logical; MySQL requires binlog_format=ROW. Verify the setting before deploying Debezium.
  • Ignoring consumer lag: CDC events queue up in Kafka, so if consumers fall behind, downstream systems can serve stale data for hours without any error being raised. Monitor lag and alert on it.
  • Not handling schema changes: adding, renaming, or dropping a column can break CDC consumers that assume a fixed event shape; coordinate schema migrations with consumer updates.
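
The configuration and lag mistakes lend themselves to automated checks. The sketch below assumes the relevant server settings (for example from `SHOW wal_level` on PostgreSQL or `SHOW VARIABLES LIKE 'binlog_format'` on MySQL) and the per-partition Kafka offsets have already been fetched by some other means. The function names, engine keys, and array shapes are hypothetical.

```php
<?php
declare(strict_types=1);

/** Settings named in the list above; the engine keys are chosen for this sketch. */
const REQUIRED_CDC_SETTINGS = [
    'postgresql' => ['wal_level' => 'logical'],
    'mysql'      => ['binlog_format' => 'ROW'],
];

/**
 * Return a readable problem for every setting that does not match.
 * $settings maps setting name => current value as reported by the server.
 */
function cdcConfigProblems(string $engine, array $settings): array
{
    if (!isset(REQUIRED_CDC_SETTINGS[$engine])) {
        throw new InvalidArgumentException("Unknown engine: $engine");
    }
    $problems = [];
    foreach (REQUIRED_CDC_SETTINGS[$engine] as $name => $expected) {
        $actual = (string) ($settings[$name] ?? '(unset)');
        if (strcasecmp($actual, $expected) !== 0) {
            $problems[] = "$name is $actual, expected $expected";
        }
    }
    return $problems;
}

/**
 * Consumer lag: events written to each partition but not yet committed
 * by the consumer. Both arrays map partition number => offset; a partition
 * with no committed offset is treated as offset 0 in this sketch.
 */
function totalConsumerLag(array $logEndOffsets, array $committedOffsets): int
{
    $lag = 0;
    foreach ($logEndOffsets as $partition => $end) {
        $lag += max(0, $end - ($committedOffsets[$partition] ?? 0));
    }
    return $lag;
}

$problems = cdcConfigProblems('postgresql', ['wal_level' => 'replica']);
$lag = totalConsumerLag([0 => 1500, 1 => 900], [0 => 1200, 1 => 900]);
```

In practice the configuration check would run before the connector is deployed, and the lag figure would feed an alert threshold so staleness is noticed before users report it.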

Code Examples

✗ Vulnerable
<?php
// ❌ Dual write — database update + cache/search update without atomicity
public function updateProduct(int $id, array $data): void
{
    $this->db->update('products', $data, ['id' => $id]);
    $this->redis->del("product:$id");     // What if this fails?
    $this->elasticsearch->index($data);   // Database updated, search not
    $this->eventBus->publish('product.updated', $data); // Partial state
    // Any failure here leaves systems inconsistent
}
✓ Fixed
<?php
// ✅ Outbox pattern — write event to DB in same transaction
public function updateProduct(int $id, array $data): void
{
    $this->db->beginTransaction();
    try {
        $this->db->update('products', $data, ['id' => $id]);
        // Event stored atomically with the business data change
        $this->db->insert('outbox_events', [
            'type'       => 'product.updated',
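            // 'id' goes last so a stray $data['id'] cannot override it
            'payload'    => json_encode([...$data, 'id' => $id], JSON_THROW_ON_ERROR),
            'created_at' => gmdate('Y-m-d H:i:s'), // UTC, independent of server timezone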
        ]);
        $this->db->commit();
    } catch (Throwable $e) {
        $this->db->rollBack();
        throw $e;
    }
    // A relay then publishes the stored event: either a separate worker
    // polls outbox_events, or a CDC tool (Debezium) reads the insert
    // from the transaction log.
}
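
The separate worker mentioned in the comments can be sketched as a small relay function. This is an illustration under assumptions, not a required shape: the `id` column, the row array format, and the `$publish` callable are hypothetical, and the caller is expected to fetch unpublished rows oldest first and then mark the returned IDs as published.

```php
<?php
declare(strict_types=1);

/**
 * Publish pending outbox rows in order and return the IDs that were published.
 * Stops at the first failure so later events are never published ahead of
 * earlier ones; the failed row and everything after it stay pending.
 *
 * @param array<int, array{id: int, type: string, payload: string}> $pendingRows oldest first
 * @param callable(string, array): void $publish hypothetical wrapper around a broker client
 * @return int[] IDs the caller should mark as published (or delete)
 */
function relayOutbox(array $pendingRows, callable $publish): array
{
    $publishedIds = [];
    foreach ($pendingRows as $row) {
        try {
            $payload = json_decode($row['payload'], true, 512, JSON_THROW_ON_ERROR);
            $publish($row['type'], $payload);
        } catch (Throwable $e) {
            break; // leave this row and the rest for the next run
        }
        $publishedIds[] = $row['id'];
    }
    return $publishedIds;
}

// Usage with an in-memory stand-in for the broker that fails on the second event.
$sent = [];
$publish = function (string $type, array $payload) use (&$sent): void {
    if ($payload['id'] === 2) {
        throw new RuntimeException('broker unavailable');
    }
    $sent[] = [$type, $payload['id']];
};

$rows = [
    ['id' => 10, 'type' => 'product.updated', 'payload' => '{"id":1,"name":"Lamp"}'],
    ['id' => 11, 'type' => 'product.updated', 'payload' => '{"id":2,"name":"Desk"}'],
    ['id' => 12, 'type' => 'product.updated', 'payload' => '{"id":3,"name":"Chair"}'],
];
$publishedIds = relayOutbox($rows, $publish);
```

If the worker crashes after publishing but before marking a row as published, the event is sent again on the next run, so consumers should treat events as at-least-once and handle duplicates.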

Added 23 Mar 2026
DEV INTEL Tools & Severity
⚙ Fix effort: High
⚡ Quick Fix
For simple CDC in PHP, use the outbox pattern (write events to a database table in the same transaction) rather than full CDC infrastructure — it provides similar guarantees with less operational complexity.
