Apache Kafka Concepts
TL;DR
Kafka is a distributed log: producers append records to topics that are split into partitions for parallelism, each partition is an immutable, ordered sequence, and consumers read at their own pace and can replay from any offset that is still retained.
Explanation
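Kafka core: records are organized into topics, and each topic is divided into partitions. Producers append records to a partition, and records within a partition keep their order. Each consumer tracks its own offset, so it can read at its own pace and replay from any offset that is still retained.
Consumer groups: within a group, each partition is assigned to one consumer, so each record is handled by one member of the group; separate groups each receive every record independently. Committing offsets explicitly, after processing, gives at-least-once delivery.
Retention: configurable by time or size, which makes Kafka a durable log rather than just a queue.
Key concepts: the partition key determines which partition a record lands in (same key = same partition = ordering for that key). The replication factor sets how many copies of each partition exist for fault tolerance.
Use cases: event sourcing, audit logs, real-time pipelines, stream processing.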
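The model above can be sketched without a broker. The following is a minimal in-memory illustration, not Kafka's implementation: `Topic`, `GroupReader`, and `partition_for` are hypothetical names, and CRC32 stands in for Kafka's default murmur2 key hash only because it ships with the standard library.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count for this sketch


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a key to a partition (CRC32 is a stand-in for Kafka's murmur2)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


class Topic:
    """A toy topic: each partition is an append-only list of (key, value)."""

    def __init__(self, num_partitions=NUM_PARTITIONS):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        p = partition_for(key, len(self.partitions))
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1  # (partition, offset)


class GroupReader:
    """Tracks one read position per partition for a single consumer group."""

    def __init__(self, topic):
        self.topic = topic
        self.offsets = defaultdict(int)

    def poll(self, partition):
        start = self.offsets[partition]
        records = self.topic.partitions[partition][start:]
        self.offsets[partition] = start + len(records)
        return records

    def seek(self, partition, offset):
        self.offsets[partition] = offset


topic = Topic()
for i in range(3):
    topic.append("user-42", f"event-{i}")  # same key, so same partition

p = partition_for("user-42")
billing, analytics = GroupReader(topic), GroupReader(topic)
billing_seen = billing.poll(p)
analytics_seen = analytics.poll(p)  # a second group sees the same records
billing.seek(p, 0)                  # rewind and replay from the beginning
replayed = billing.poll(p)
```

Both readers receive the three events in append order, and rewinding one reader does not affect the other, which is the property that separates a log from a queue that deletes on delivery.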
Common Misconception
✗ Kafka is just a faster RabbitMQ — Kafka is a durable distributed log; RabbitMQ is a message broker with routing. Different use cases: Kafka for replay/streaming, RabbitMQ for complex routing.
Why It Matters
Kafka's replay capability enables audit trails, event sourcing, and rebuilding derived data — a fundamentally different model from traditional message queues.
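As a rough illustration of rebuilding derived data, the sketch below folds an ordered event list into a balance view and then recomputes it after the view is lost. The event shape and the `rebuild_balances` helper are assumptions made for this example, and it only works in practice while the events are still within the topic's retention.

```python
def rebuild_balances(log, from_offset=0):
    """Fold an ordered event log into a derived view (account -> balance)."""
    balances = {}
    for account, delta in log[from_offset:]:
        balances[account] = balances.get(account, 0) + delta
    return balances


# Hypothetical events from a single partition, in offset order.
event_log = [("alice", 100), ("bob", 50), ("alice", -30), ("bob", 25)]

view = rebuild_balances(event_log)
view.clear()                           # simulate losing the derived store
rebuilt = rebuild_balances(event_log)  # replay from offset 0 to restore it
```

A queue that deletes messages once they are acknowledged cannot support this, because the input needed for the rebuild is gone.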
Common Mistakes
- Using Kafka where a simple job queue suffices — operational complexity isn't always justified.
- Choosing the wrong partition key — uneven distribution creates hot partitions.
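- Committing offsets at the wrong time: committing before processing can skip messages after a crash, while committing after processing can cause duplicates, so handlers should be idempotent.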
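The offset-commit mistake is easier to see in a small simulation. This sketch assumes a single partition held in a list and a hypothetical `run_consumer` helper that can "crash" while handling one record; it is not client-library code.

```python
def run_consumer(log, committed, commit_first, crash_at=None):
    """Process records from `committed` onward.

    Returns (processed_values, committed_offset). If `crash_at` is an offset,
    the consumer dies on that record after the first of its two steps
    (commit or process) and before the second.
    """
    processed = []
    offset = committed
    while offset < len(log):
        if commit_first:
            committed = offset + 1           # commit before processing
            if offset == crash_at:
                return processed, committed  # crashed: record never processed
            processed.append(log[offset])
        else:
            processed.append(log[offset])    # process first
            if offset == crash_at:
                return processed, committed  # crashed: offset never committed
            committed = offset + 1
        offset += 1
    return processed, committed


log = ["a", "b", "c"]

# Commit before processing: crash on "b", then restart from the committed offset.
first, committed = run_consumer(log, 0, commit_first=True, crash_at=1)
second, _ = run_consumer(log, committed, commit_first=True)
at_most_once = first + second   # "b" is skipped

# Commit after processing: same crash, then restart.
first, committed = run_consumer(log, 0, commit_first=False, crash_at=1)
second, _ = run_consumer(log, committed, commit_first=False)
at_least_once = first + second  # "b" is processed twice
```

Neither ordering avoids the crash window, so the practical choice is usually to commit after processing and make the handler tolerate a repeated record.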
Code Examples
✗ Vulnerable
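// Wrong partition key: a constant key hashes every message to the same partition.
// $topic is an RdKafka\ProducerTopic obtained from $producer->newTopic().
$topic->produce(RD_KAFKA_PARTITION_UA, 0, $payload, 'constant-key'); // Hot partition
✓ Fixed
// Partition by user ID: per-user ordering, and load spreads out when keys are varied.
$topic->produce(RD_KAFKA_PARTITION_UA, 0, $payload, (string) $userId);
// Manual offset commit for at-least-once (assumes enable.auto.commit=false):
$message = $consumer->consume(30000); // timeout in milliseconds
if ($message->err === RD_KAFKA_RESP_ERR_NO_ERROR) {
    processMessage($message);
    $consumer->commitAsync($message); // Commit only after successful processing
}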
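To check a candidate key before deploying, it can help to count how sample keys would spread across partitions. The sketch below is a broker-free approximation: `partition_counts` is a hypothetical helper, the partition count of 6 is arbitrary, and CRC32 stands in for the client's real hash, so exact counts will differ from a live cluster.

```python
import zlib
from collections import Counter


def partition_counts(keys, num_partitions):
    """Count how many keys land on each partition under a hash-mod scheme."""
    counts = Counter(zlib.crc32(k.encode("utf-8")) % num_partitions for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


constant = partition_counts(["constant-key"] * 1200, 6)
per_user = partition_counts([f"user-{i}" for i in range(1200)], 6)

print("constant key:", constant)  # one partition holds everything
print("per-user key:", per_user)  # messages spread over several partitions
```

A per-user key still produces a hot partition if a few users generate most of the traffic, so sample keys should reflect real message volume rather than just the set of distinct IDs.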
Added
23 Mar 2026
⚡
DEV INTEL
Tools & Severity
🔵 Info
⚙ Fix effort: High
⚡ Quick Fix
Choose the partition key based on the ordering you need (for example userId or orderId). Commit offsets after processing, not before. Set the replication factor to at least 3 in production.
📦 Applies To
cli
queue-worker
🔍 Detection Hints
kafka|rdkafka
Auto-detectable:
✗ No
🤖 AI Agent
Confidence: Low
False Positives: High
✗ Manual fix
Fix: High
Context: File