Bulkhead Pattern
debt(d9/e7/b7/t7)
Closest to 'silent in production until users hit it' (d9). The detection_hints indicate automated detection is 'no', and tools listed (php-fpm-status, datadog) require proactive monitoring setup. The absence of bulkheads is invisible at code-review time — the system appears functional until a slow dependency exhausts a shared pool under real production load, at which point users experience degradation across unrelated features.
Closest to 'cross-cutting refactor across the codebase' (e7). The quick_fix describes assigning separate PHP-FPM pools or thread pools to different feature areas, but this touches infrastructure configuration, application routing, connection pool management, and monitoring setup across multiple components. It is not a single-file change — it requires redesigning how shared resources are allocated across the entire request-handling layer, often touching deployment config, queue workers, and application code simultaneously.
Closest to 'strong gravitational pull' (b7). The applies_to covers web, api, and queue-worker contexts, which is nearly every PHP workload. Once bulkheads are in place, every new feature, dependency, or worker must be evaluated for which pool it belongs to. Sizing decisions (noted in common_mistakes) require ongoing attention, and monitoring per-bulkhead utilisation is a persistent operational tax that shapes how new workstreams are designed.
Closest to 'serious trap' (t7). The canonical misconception is that bulkheads only apply to microservices, leading developers to dismiss the pattern entirely in monolithic or single-service contexts where it is equally valuable. This contradicts the intuition that architectural resilience patterns are a distributed-systems concern only, causing competent developers to skip bulkhead design even when sharing a single PHP-FPM pool across workloads with very different SLAs.
Also Known As
TL;DR
Explanation
Named after the watertight compartments in a ship's hull, the Bulkhead pattern stops a failure in one dependency from cascading by giving each downstream dependency its own resource pool (thread pool, connection pool, or semaphore). If calls to Service A consume every thread in A's pool, the pool serving Service B is unaffected. In PHP, where a request usually occupies a whole worker process rather than a thread, bulkheads are implemented via separate PHP-FPM pools per feature area, separate database connection pools per service, queue workers dedicated to specific job types, and rate limits per API client. Bulkheads complement circuit breakers: a circuit breaker detects failure in a downstream endpoint, while a bulkhead contains its blast radius.
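The semaphore variant can be sketched in plain PHP. This is a minimal illustration, not a production implementation: `FileBulkhead`, its constructor parameters, and the lock-file naming are all hypothetical. It assumes a writable directory and an operating system where `flock()` locks belong to the open file handle (the case on Linux; behaviour differs on some platforms).

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical cross-process bulkhead: at most $slots concurrent callers
 * per dependency. Each slot is a lock file, and flock() with LOCK_NB
 * returns immediately instead of waiting.
 */
final class FileBulkhead
{
    public function __construct(
        private string $name,
        private int $slots,
        private string $dir,
    ) {}

    /** Runs $task if a slot is free; otherwise returns $fallback() at once. */
    public function run(callable $task, callable $fallback): mixed
    {
        for ($i = 0; $i < $this->slots; $i++) {
            $handle = fopen("{$this->dir}/bulkhead-{$this->name}-{$i}.lock", 'c');
            if ($handle === false) {
                continue; // slot file not usable, try the next one
            }
            if (flock($handle, LOCK_EX | LOCK_NB)) {
                try {
                    return $task();
                } finally {
                    flock($handle, LOCK_UN); // release the slot even if $task throws
                    fclose($handle);
                }
            }
            fclose($handle);
        }

        return $fallback(); // every slot is busy: fail fast instead of queueing
    }
}

// Usage: allow at most 2 concurrent calls to a (hypothetical) payment API.
$payments = new FileBulkhead('payments-api', 2, sys_get_temp_dir());
$result = $payments->run(
    fn () => 'charged',             // stand-in for the real HTTP call
    fn () => 'payment-unavailable'  // degraded-mode response
);
```

Rejecting immediately when all slots are taken is a design choice: the caller gets a degraded response right away rather than holding a worker process while it waits for the slow dependency.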
Diagram
flowchart TD
subgraph Thread Pool A - Critical
R1[Request 1] & R2[Request 2] --> TP_A[10 threads<br/>Payment API]
end
subgraph Thread Pool B - Normal
R3[Request 3] & R4[Request 4] --> TP_B[20 threads<br/>Product API]
end
subgraph Thread Pool C - Background
R5[Request 5] --> TP_C[5 threads<br/>Reports API]
end
TP_B -->|Pool B exhausted| REJECT[Reject / Queue<br/>Pool A unaffected]
style TP_A fill:#238636,color:#fff
style TP_B fill:#1f6feb,color:#fff
style TP_C fill:#6e40c9,color:#fff
style REJECT fill:#f85149,color:#fff
Watch Out
Common Misconception
Why It Matters
Common Mistakes
- One shared thread pool or connection pool for all operations — slow calls to one service starve calls to all others.
- Not sizing bulkheads based on actual workload — too small causes unnecessary rejection; too large defeats the isolation.
- Bulkheads without fallbacks — isolation prevents cascading failure but callers still need a degraded-mode response.
- Not monitoring per-bulkhead utilisation — you cannot tune what you cannot observe.
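Per-bulkhead utilisation can be derived from the PHP-FPM status page when each bulkhead is its own pool. The sketch below assumes the status page is enabled for the pool and fetched with `?json`; `poolSaturation` is a hypothetical helper, and the sample payload is illustrative, not real measurements.

```php
<?php
declare(strict_types=1);

/**
 * Returns the fraction of a PHP-FPM pool's workers that are busy (0.0 to 1.0).
 *
 * $statusJson is the body of the pool's status page requested with ?json.
 * $maxChildren is the pool's pm.max_children value, supplied by the caller
 * from the pool configuration.
 */
function poolSaturation(string $statusJson, int $maxChildren): float
{
    if ($maxChildren <= 0) {
        throw new InvalidArgumentException('maxChildren must be positive');
    }
    $status = json_decode($statusJson, true, 512, JSON_THROW_ON_ERROR);

    return min(1.0, $status['active processes'] / $maxChildren);
}

// Illustrative payload for a pool named "payments".
$sample = '{"pool":"payments","active processes":8,"idle processes":2,'
        . '"listen queue":0,"max children reached":0}';
$saturation = poolSaturation($sample, 10);
```

A value that stays close to 1.0, a non-zero `listen queue`, or a growing `max children reached` counter all suggest the pool is undersized for its workload or that its dependency has slowed down.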
Avoid When
- Avoid over-partitioning small systems — splitting a 20-connection pool into five pools of four creates artificial scarcity.
- Do not apply bulkheads as a substitute for fixing a genuinely slow dependency — isolation contains the damage but does not cure the cause.
When To Use
- Isolate thread/connection pools per downstream dependency — so a slow third-party API cannot starve requests to your database.
- Apply bulkheads when a single shared resource pool services multiple unrelated workloads with different SLAs.
- Use process-level bulkheads (separate workers per queue) to stop a backlogged job type from blocking time-sensitive ones.
Code Examples
// Anti-pattern: one shared connection pool, so one slow workload starves everything.
// ConnectionPool is an illustrative class, not a built-in PHP API.
$pool = new ConnectionPool(maxConnections: 20); // Shared by all services
$reportResult = $pool->query('SELECT ...'); // Slow report: concurrent runs hold 18 connections for 30s each
$userResult = $pool->query('SELECT ...'); // Competes for the 2 connections that are left
$orderResult = $pool->query('SELECT ...'); // Starved: waits until a report releases a connection
// Bulkhead: isolate workloads so one failure cannot cascade,
// like watertight compartments in a ship.
// Process pool isolation in PHP via queue workers:
// payment-queue: max 5 workers, so a payments backlog cannot starve emails
// email-queue: max 3 workers
// report-queue: max 2 workers
// config/horizon.php (supervisor definitions for one environment)
'payment-queue' => [
    'connection' => 'redis',
    'queue' => ['payments'],
    'maxProcesses' => 5,
],
'email-queue' => [
    'connection' => 'redis',
    'queue' => ['emails'],
    'maxProcesses' => 3,
],
'report-queue' => [
    'connection' => 'redis',
    'queue' => ['reports'],
    'maxProcesses' => 2,
],
// Database connection pool bulkhead:
// Separate connection pools for read vs write vs reporting
// If reporting queries spike, they can't starve transactional connections
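The database bulkhead described in the comments above could look like the following Laravel configuration fragment. It is a sketch under assumptions: the `reporting` connection name and the `DB_REPORTING_*` environment variables are invented for illustration. PHP itself does not pool connections, so the actual cap has to come from the database side, for example a per-user connection limit such as MySQL's `MAX_USER_CONNECTIONS` on the reporting user.

```php
<?php
// config/database.php (fragment): one connection definition per workload.
// The "reporting" name and DB_REPORTING_* variables are assumptions for this sketch.
return [
    'default' => 'mysql',

    'connections' => [
        // Transactional traffic: orders, users, payments.
        'mysql' => [
            'driver'   => 'mysql',
            'host'     => env('DB_HOST', '127.0.0.1'),
            'database' => env('DB_DATABASE'),
            'username' => env('DB_USERNAME'),
            'password' => env('DB_PASSWORD'),
        ],

        // Reporting traffic: separate host (e.g. a replica) and separate
        // database user, so a connection limit can be set on that user alone.
        'reporting' => [
            'driver'   => 'mysql',
            'host'     => env('DB_REPORTING_HOST', '127.0.0.1'),
            'database' => env('DB_DATABASE'),
            'username' => env('DB_REPORTING_USERNAME'),
            'password' => env('DB_REPORTING_PASSWORD'),
        ],
    ],
];
```

Report code would then opt in explicitly with `DB::connection('reporting')`, leaving the default connection for transactional queries.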