
Apache Kafka Concepts

Messaging Intermediate
debt(d9/e7/b7/t7)
d9 Detectability Operational debt — how invisible misuse is to your safety net

Closest to 'silent in production until users hit it' (d9). The detection_hints field explicitly states automated: no, and the code_pattern hint (kafka|rdkafka) only identifies Kafka usage, not misuse. Wrong partition keys causing hot partitions, incorrect offset commits causing duplicates or skips — none of these are caught by linters or static analysis; they surface only under production load or through data inconsistency reports.

e7 Effort Remediation debt — work required to fix once spotted

Closest to 'cross-cutting refactor across the codebase' (e7). The quick_fix describes changes to partition key strategy, offset commit ordering, and replication factor — but partition key changes affect data distribution and consumer topology across the entire system. Fixing a wrong partition key choice means repartitioning topics, migrating consumers, and potentially reprocessing historical data, which is a cross-cutting, multi-service effort. Not quite architectural rework (e9) but well beyond a single component.

b7 Burden Structural debt — long-term weight of choosing wrong

Closest to 'strong gravitational pull' (b7). Kafka applies to cli and queue-worker contexts and carries distributed, streaming, and messaging tags. Every downstream service consuming from or producing to Kafka must conform to its partitioning, offset management, and replication decisions. The choice of partition key and consumer group strategy shapes how all producers and consumers are written and evolved, so every future change must respect it.

t7 Trap Cognitive debt — how counter-intuitive correct behaviour is

Closest to 'serious trap — contradicts how a similar concept works elsewhere' (t7). The misconception explicitly states that developers treat Kafka like a faster RabbitMQ, when it is fundamentally a durable distributed log with replay semantics. This directly contradicts the mental model of traditional message brokers. Common mistakes (wrong partition key, incorrect offset commits) compound this: offset commit behavior in particular inverts expectations (committing before processing feels safe but causes loss; after processing risks duplicates), contradicting intuitions from other queuing systems.


TL;DR

Kafka is a distributed log: producers append to immutable topics that are partitioned for parallelism and ordered within each partition; consumers read at their own pace and can replay from any retained offset.

Explanation

Kafka core: Topics are divided into Partitions, and producers append records to them. Each consumer tracks its own offset, so it can replay from any retained point. Consumer Groups: within a group, each partition is read by one consumer, so the group handles each message once under normal operation; multiple groups each receive all messages independently. Offsets can be committed explicitly; committing after processing gives at-least-once delivery. Retention is configurable (by time or size), which makes Kafka a durable log, not just a queue. Key concepts: the partition key determines which partition a message lands in (same key = same partition = ordering within that partition). The replication factor sets how many copies of each partition exist, for fault tolerance. Use cases: event sourcing, audit logs, real-time pipelines, stream processing.
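
The key-to-partition rule can be sketched in plain PHP without a broker. This is a minimal illustration, not how any specific client does it: the partitionFor helper, the partition count, and the use of crc32 are assumptions (real clients ship their own partitioners, for example murmur2 in the Java client).

```php
<?php
// Illustrative stand-in for a client partitioner: hash the key, then
// take it modulo the partition count. crc32 is an assumption here.
function partitionFor(string $key, int $numPartitions): int
{
    // Mask to a non-negative value so the result is valid on 32-bit builds too.
    return (crc32($key) & 0x7fffffff) % $numPartitions;
}

$numPartitions = 6; // hypothetical partition count
$events = [
    ['key' => 'user-42', 'value' => 'login'],
    ['key' => 'user-7',  'value' => 'login'],
    ['key' => 'user-42', 'value' => 'add-to-cart'],
    ['key' => 'user-42', 'value' => 'checkout'],
];

// Group records by the partition their key maps to, preserving append order.
$partitions = [];
foreach ($events as $event) {
    $p = partitionFor($event['key'], $numPartitions);
    $partitions[$p][] = $event['key'] . ':' . $event['value'];
}

foreach ($partitions as $p => $records) {
    echo "partition $p: " . implode(', ', $records) . PHP_EOL;
}
```

All three user-42 records end up in the same partition in the order they were produced, which is the only ordering guarantee a keyed producer gets.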

Common Misconception

Misconception: Kafka is just a faster RabbitMQ. In fact, Kafka is a durable distributed log, while RabbitMQ is a message broker with routing. They serve different use cases: Kafka for replay and streaming, RabbitMQ for complex routing.
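
One way to see the difference from a queue is a toy single-partition log in plain PHP. The group names and the poll helper below are hypothetical; the point is only that reading advances a per-group offset and never removes records.

```php
<?php
// Toy single-partition log: an append-only array indexed by offset.
$log = ['order-created', 'order-paid', 'order-shipped'];

// Each consumer group tracks its own committed offset (hypothetical group names).
$committed = ['billing' => 0, 'analytics' => 0];

// Return up to $max records for one group and advance only that group's offset.
function poll(array $log, array &$committed, string $group, int $max): array
{
    $batch = array_slice($log, $committed[$group], $max);
    $committed[$group] += count($batch);
    return $batch;
}

$billing   = poll($log, $committed, 'billing', 2);   // reads offsets 0 and 1
$analytics = poll($log, $committed, 'analytics', 3); // reads offsets 0 to 2

echo 'billing: ' . implode(', ', $billing) . PHP_EOL;
echo 'analytics: ' . implode(', ', $analytics) . PHP_EOL;
echo 'records still in log: ' . count($log) . PHP_EOL; // 3
```

Both groups see the same records independently, and the log still holds all three afterwards; in a traditional queue, a consumed message would typically be gone.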

Why It Matters

Kafka's replay capability enables audit trails, event sourcing, and rebuilding derived data — a fundamentally different model from traditional message queues.
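
Rebuilding derived data by replay can be illustrated with a small fold over a retained log. The event shapes and the replayBalance helper are assumptions made up for this sketch; a real consumer would read the events from a topic starting at the chosen offset.

```php
<?php
// Hypothetical account events kept in a retained log (array index = offset).
$log = [
    ['type' => 'deposit',  'amount' => 100],
    ['type' => 'withdraw', 'amount' => 30],
    ['type' => 'deposit',  'amount' => 5],
];

// Derived state is a pure fold over the log, so it can be rebuilt at any time
// by replaying from a chosen offset with a known starting value.
function replayBalance(array $log, int $fromOffset = 0, int $start = 0): int
{
    $balance = $start;
    foreach (array_slice($log, $fromOffset) as $event) {
        $balance += $event['type'] === 'deposit' ? $event['amount'] : -$event['amount'];
    }
    return $balance;
}

echo replayBalance($log) . PHP_EOL;         // 75: full rebuild from offset 0
echo replayBalance($log, 1, 100) . PHP_EOL; // 75: resume from a snapshot taken after offset 0
```

Because the events are never mutated, a lost or corrupted balance can be recomputed, and the same log doubles as an audit trail.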

Common Mistakes

  • Using Kafka where a simple job queue suffices — operational complexity isn't always justified.
  • Choosing the wrong partition key — uneven distribution creates hot partitions.
  • Not handling offset commit correctly — duplicate processing or message skipping.
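
The offset commit mistake can be simulated in plain PHP. This is a sketch under stated assumptions: the run function is hypothetical, the consumer crashes exactly once while handling offset 1, and the committed offset is the only state that survives the restart.

```php
<?php
// Simulates one consumer that crashes once at offset 1 and then restarts.
// $commitFirst = true commits before processing; false commits after.
function run(array $log, bool $commitFirst): array
{
    $committed = 0;   // committed offset; survives the crash
    $processed = [];  // side effects that actually happened
    $crashed   = false;

    for ($attempt = 0; $attempt < 2; $attempt++) { // second pass = restart
        for ($offset = $committed; $offset < count($log); $offset++) {
            if ($commitFirst) {
                $committed = $offset + 1;          // commit, then (maybe) crash
            }
            if ($offset === 1 && !$crashed) {
                $crashed = true;
                if (!$commitFirst) {
                    $processed[] = $log[$offset];  // work done, commit never sent
                }
                continue 2;                        // crash: abandon this pass
            }
            $processed[] = $log[$offset];
            if (!$commitFirst) {
                $committed = $offset + 1;          // commit only after processing
            }
        }
    }
    return $processed;
}

$log = ['a', 'b', 'c'];
echo implode(',', run($log, true)) . PHP_EOL;  // a,c     ('b' is skipped)
echo implode(',', run($log, false)) . PHP_EOL; // a,b,b,c ('b' is processed twice)
```

Committing first silently drops a record, while committing afterwards repeats one, so at-least-once consumers usually pair commit-after-processing with idempotent handlers.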

Code Examples

✗ Vulnerable
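// Wrong partition key: every message carries the same key, so all land on one partition.
// 'events' is a placeholder topic name; in php-rdkafka, produce() is called on the topic object.
$topic = $producer->newTopic('events');
$topic->produce(RD_KAFKA_PARTITION_UA, 0, $message, 'constant-key'); // Hot partition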
✓ Fixed
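// Partition by user ID: spreads load when IDs are varied and keeps each user's events in order.
// $topic is the object returned by $producer->newTopic(...) in php-rdkafka.
$topic->produce(RD_KAFKA_PARTITION_UA, 0, $message, (string) $userId);

// Manual offset commit for at-least-once (assumes enable.auto.commit=false):
$record = $consumer->consume(30000);
if ($record->err === RD_KAFKA_RESP_ERR_NO_ERROR) {
    processMessage($record);
    $consumer->commitAsync($record); // Commit only after successful processing
}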

Added 23 Mar 2026
DEV INTEL Tools & Severity
🔵 Info ⚙ Fix effort: High
⚡ Quick Fix
Choose partition key based on desired ordering (userId, orderId). Commit offsets after processing (not before). Set replication-factor ≥ 3 for production.
📦 Applies To
cli queue-worker
🔍 Detection Hints
kafka|rdkafka
Auto-detectable: ✗ No
🤖 AI Agent
Confidence: Low False Positives: High ✗ Manual fix Fix: High Context: File

