Queue-Based Load Levelling
debt(d7/e7/b7/t5)
Closest to 'only careful code review or runtime testing' (d7). The absence of a queue between mismatched-throughput systems isn't caught by static tools; detection_hints lists Horizon/Datadog/CloudWatch which surface symptoms (queue depth, latency) only at runtime under load.
Closest to 'cross-cutting refactor across the codebase' (e7). Quick_fix says 'put a queue between two systems' — sounds simple, but introducing async processing means moving the work into jobs, handling retries/DLQ, updating callers to not wait, and adapting UX flows; touches many call sites.
Closest to 'strong gravitational pull' (b7). Applies to web and queue-worker contexts; once introduced, every related feature must consider sync-vs-async, idempotency, and queue depth. Shapes how new work is dispatched throughout the system.
Closest to 'notable trap most devs eventually learn' (t5). Misconception flagged: developers see added latency and dismiss the pattern, missing that the benefit is survival under spike rather than speed — a common gotcha but not catastrophically counterintuitive.
Also Known As
TL;DR
Put a queue between a bursty producer and a slower consumer: spikes are absorbed by the queue and drained at a steady, sustainable rate instead of overwhelming the backend.
Explanation
Load levelling decouples the production rate from the consumption rate. A traffic spike lands on the queue, which can absorb millions of messages, rather than hitting the database or a downstream service directly. The queue acts as a buffer: workers consume at a steady rate the backend can handle. The pattern is fundamental to resilient architectures; image processing, email sending, payment processing, and report generation queues all implement load levelling. It is typically combined with backpressure and dead letter queues for robust failure handling.
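A minimal sketch of that failure handling, assuming a Laravel-style queue (the class name, retry count, and backoff values are illustrative): the job caps its retries, backs off between attempts, and defines what happens once the final attempt fails, at which point Laravel parks the job in the failed_jobs table, its equivalent of a dead letter queue.
// Sketch: retry and dead-letter handling on a queued job (values are assumptions).
class GenerateInvoiceJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable;

    public int $tries = 5;        // give up after 5 attempts

    public function backoff(): array
    {
        return [10, 60, 300];     // wait 10s, 60s, then 5 minutes between attempts
    }

    public function failed(\Throwable $exception): void
    {
        // Runs after the final failed attempt; the job record lands in the
        // failed_jobs table, Laravel's default dead letter destination.
        Log::error('Invoice generation failed', ['error' => $exception->getMessage()]);
    }
}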
Common Misconception
That the queue's added latency makes the pattern a bad trade. The benefit is survival under spike, not speed: work completes a little later, but it completes, instead of the backend falling over.
Why It Matters
Common Mistakes
- No queue depth monitoring: a growing queue is the signal that consumers cannot keep up (see the sketch after this list).
- Consumer pool too small for normal load: the queue perpetually grows instead of draining.
- No dead letter queue for failed jobs: failed messages are lost silently.
- Synchronous operations where the queue should be used: the user waits on work that should be asynchronous.
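Queue depth is the metric that catches the first two mistakes early. A minimal sketch of a scheduled depth check, assuming a Laravel console kernel and a queue named 'invoices' (both the queue name and the 1,000-job threshold are assumptions to tune):
// Sketch: scheduled queue-depth alert (queue name and threshold are illustrative).
protected function schedule(Schedule $schedule): void
{
    $schedule->call(function () {
        $depth = Queue::size('invoices');

        if ($depth > 1000) {
            // A queue that keeps growing means consumers cannot keep up.
            Log::warning('Invoice queue backing up', ['depth' => $depth]);
        }
    })->everyMinute();
}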
Avoid When
- Sustained load exceeds what the consumers can process: messages accumulate indefinitely and the queue becomes a permanent backlog rather than a buffer.
- The operation requires an immediate response: queues are asynchronous, and the caller needs the result within the same request.
- Message ordering is critical and the queue does not guarantee it: use a single-consumer queue or an ordered stream instead.
- The payload is too large for the queue: store large payloads externally and queue only a reference (see the sketch after this list).
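The last point is the claim-check variant of the pattern: store the heavy payload out of band and enqueue only the key needed to fetch it. A minimal sketch, assuming a Laravel filesystem disk (the ProcessInvoiceSource job, the s3 disk, and the path are illustrative assumptions):
// Sketch: queue a reference, not the payload (job name, disk, and path are assumptions).
public function upload(Request $request): JsonResponse
{
    // Persist the large payload to object storage first ...
    $path = $request->file('source')->store('invoice-sources', 's3');

    // ... then enqueue only the small reference to it.
    ProcessInvoiceSource::dispatch($path);

    return response()->json(['status' => 'queued']);
}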
When To Use
- Smoothing traffic spikes — absorb burst writes and process them at a sustainable rate.
- Decoupling producers from consumers so either can scale or fail independently.
- Deferring expensive work (email, PDF, video transcoding) out of the HTTP request cycle.
- Retry and dead-letter handling — queues provide built-in retry semantics without application code.
Code Examples
// Direct processing: 10,000 simultaneous requests overwhelm the generator.
class InvoiceController
{
    public function generate(Order $order): Response
    {
        // CPU-intensive and slow, and it runs inside the HTTP request
        $pdf = $this->pdfGenerator->generate($order);

        // 10,000 concurrent requests: PHP-FPM workers exhausted, timeouts
        return response($pdf, 200, ['Content-Type' => 'application/pdf']);
    }
}

// Queue-based load levelling: the request only enqueues the work and returns.
class InvoiceController
{
    public function request(Order $order): JsonResponse
    {
        GenerateInvoiceJob::dispatch($order->id); // queued, returns immediately

        return response()->json(['status' => 'Invoice is being generated, check your email']);
    }
}

// Queue worker: jobs are consumed at a rate the backend can sustain.
class GenerateInvoiceJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable;

    public function __construct(private int $orderId) {}

    public function handle(PdfGenerator $pdfGenerator, EmailService $emailService): void
    {
        $order = Order::findOrFail($this->orderId);

        $pdf = $pdfGenerator->generate($order);
        $emailService->sendInvoice($order, $pdf);
        // 10,000 orders: all processed, none dropped, at roughly 100/sec
    }
}
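The 100/sec figure in that comment does not happen by itself; the consumption rate comes from how the worker pool is sized and run. One way to make the rate explicit, assuming a Redis-backed queue (the throttle key and the 100-per-second limit are illustrative), is to throttle inside the job; the workers themselves would be started with something like php artisan queue:work --queue=invoices.
// Sketch: enforcing a steady consumption rate inside the job (Redis queue assumed).
public function handle(PdfGenerator $pdfGenerator, EmailService $emailService): void
{
    Redis::throttle('invoice-generation')
        ->allow(100)->every(1)   // at most 100 jobs per second across all workers
        ->then(function () use ($pdfGenerator, $emailService) {
            $order = Order::findOrFail($this->orderId);
            $emailService->sendInvoice($order, $pdfGenerator->generate($order));
        }, function () {
            $this->release(5);   // over the limit: back on the queue, retry in 5 seconds
        });
}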