Cloud Multi-Tenancy
debt(d9/e7/b9/t7)
Closest to 'silent in production until users hit it' (d9). A missing tenant_id filter or an un-scoped cache key produces correct-looking results until one tenant sees another's data. PHPStan or Semgrep rules can flag the pattern (WHERE id = ? without tenant_id), but detection_hints.automated is 'no', so the safety net stays largely silent until a real cross-tenant leak occurs.
Closest to 'cross-cutting refactor across the codebase' (e7). The quick_fix sounds simple — inject tenant context and add a global scope or row-level security — but retrofitting tenant scoping onto existing queries, caches, jobs, and migrations touches every data-access path across the codebase, not one component.
Closest to 'defines the system's shape' (b9). The tenancy model (shared-DB vs schema vs DB-per-tenant) is load-bearing across web, queue-worker, and library contexts; why_it_matters states it dictates cost and the blast radius of every change. Switching models later is rewrite-or-live-with-it.
Closest to 'serious trap' (t7). The misconception is that multi-tenancy just means giving each customer a subdomain. A competent developer can easily guess wrong here, because tenant scoping must pervade queries, caches, jobs, and rate limits, which the naive subdomain mental model does not suggest.
Also Known As
TL;DR
Explanation
Multi-tenancy is an architecture where a single deployment of an application serves many customers, called tenants. Instead of spinning up one full stack per customer, tenants share compute, networking, and often the same database, while the application enforces strict isolation so that no tenant can see or affect another's data. This is the dominant SaaS model because it amortises infrastructure cost and operational effort across the whole customer base.
Isolation can sit at several layers. Database-level options range from a shared database with a tenant_id column on every row (cheapest, hardest to isolate), to a schema per tenant, to a full database per tenant (strongest isolation, highest overhead). Application-level isolation requires that every query is scoped by tenant, usually by injecting a tenant context early in the request lifecycle (from subdomain, header, or JWT claim) and threading it through repositories and caches.
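Resolving the tenant early in the request can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the X-Tenant header name, the TENANT_IDS_BY_SLUG lookup table, and the resolveTenantId() function are all hypothetical, and a real system would look tenants up in a database and only trust a header set by its own gateway.

```php
<?php
// Hypothetical slug-to-id table; a real system would query a tenants table.
const TENANT_IDS_BY_SLUG = ['acme' => 1, 'globex' => 2];

/**
 * Resolve the tenant id from a header first, then from the subdomain.
 * Returns null when no known tenant can be identified, so the caller can
 * reject the request instead of continuing without a tenant scope.
 *
 * Assumption: the X-Tenant header is only trusted because an upstream
 * gateway sets it; never trust a client-supplied value directly.
 */
function resolveTenantId(array $headers, string $host, string $baseDomain): ?int
{
    $slug = $headers['X-Tenant'] ?? null;

    if ($slug === null) {
        // Drop any port, then strip ".<baseDomain>" to get the subdomain.
        $hostName = strtolower(explode(':', $host)[0]);
        $suffix = '.' . strtolower($baseDomain);
        if (str_ends_with($hostName, $suffix)) {
            $slug = substr($hostName, 0, -strlen($suffix));
        }
    }

    if (!is_string($slug) || $slug === '') {
        return null;
    }

    return TENANT_IDS_BY_SLUG[strtolower($slug)] ?? null;
}
```

The resolved id would then be wrapped in a context object and passed to repositories and caches, so that nothing downstream has to parse the request again.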
The central risk is cross-tenant data leakage: a missing WHERE tenant_id = ? clause, a cache key that omits the tenant, or a background job that runs without tenant scope can expose one customer's data to another. These bugs are catastrophic for trust and often violate contractual and regulatory boundaries. Robust designs make tenant scoping the default rather than something each query opts into: global query scopes, row-level security in PostgreSQL, or a connection-per-tenant strategy that physically separates data.
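The cache-key leak described above can be avoided by making the tenant prefix impossible to omit. The sketch below assumes a simple in-memory store; SharedArrayStore and TenantScopedCache are hypothetical names used only for illustration, and a real application would wrap whatever cache client it already uses.

```php
<?php
// Hypothetical shared backing store, standing in for Redis or Memcached.
final class SharedArrayStore
{
    /** @var array<string, mixed> */
    public array $items = [];
}

// Wrapper that prefixes every key with the tenant id, so callers cannot
// read or write an un-scoped key by accident.
final class TenantScopedCache
{
    public function __construct(
        private SharedArrayStore $store,
        private int $tenantId,
    ) {}

    private function key(string $key): string
    {
        return "tenant:{$this->tenantId}:{$key}";
    }

    public function set(string $key, mixed $value): void
    {
        $this->store->items[$this->key($key)] = $value;
    }

    public function get(string $key): mixed
    {
        // Returns null on a miss, including when another tenant owns the key.
        return $this->store->items[$this->key($key)] ?? null;
    }
}
```

Two wrappers built over the same store with different tenant ids can use the same logical key, such as invoice:42, without ever seeing each other's values.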
Noisy-neighbour effects are the other recurring problem: one heavy tenant consuming shared CPU, connections, or rate limits degrades everyone. Mitigations include per-tenant quotas, connection pool partitioning, and tiering large tenants onto dedicated infrastructure. Choosing a tenancy model is a trade-off between cost efficiency, isolation strength, and per-tenant customisation, and it is expensive to change later, so decide deliberately up front.
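A per-tenant quota can be sketched as a fixed-window counter keyed by tenant id. This is an illustrative, in-memory version under stated assumptions: the TenantQuota class is hypothetical, the limits are arbitrary, and a production limiter would keep its counters in a shared store so that all workers see the same counts.

```php
<?php
// Hypothetical fixed-window limiter: each tenant gets its own counter,
// so one heavy tenant exhausts only its own allowance.
final class TenantQuota
{
    /** @var array<int, array{window: int, count: int}> */
    private array $usage = [];

    public function __construct(
        private int $maxRequests,
        private int $windowSeconds,
    ) {}

    // $now is a Unix timestamp, passed in so the logic is easy to test.
    public function allow(int $tenantId, int $now): bool
    {
        $window = intdiv($now, $this->windowSeconds);
        $entry = $this->usage[$tenantId] ?? ['window' => $window, 'count' => 0];

        // A new window resets this tenant's counter.
        if ($entry['window'] !== $window) {
            $entry = ['window' => $window, 'count' => 0];
        }

        if ($entry['count'] >= $this->maxRequests) {
            $this->usage[$tenantId] = $entry;
            return false;
        }

        $entry['count']++;
        $this->usage[$tenantId] = $entry;
        return true;
    }
}
```

With a limit of two requests per 60 seconds, a third call for the same tenant inside the window is refused, while a different tenant is unaffected.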
Common Misconception
Why It Matters
Common Mistakes
- Forgetting the tenant_id filter on a query, exposing another tenant's rows.
- Cache keys that omit the tenant, so one customer sees another's cached data.
- Background jobs and migrations that run without tenant context.
- No per-tenant quotas, letting one heavy tenant degrade everyone (noisy neighbour).
- Picking a shared-database model for customers with strict isolation or compliance needs.
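For the background-job mistake in the list above, one possible guard is to carry the tenant id inside the job payload and refuse to run any job that lacks it. The TenantJob class and its JSON field names are hypothetical and not tied to any particular queue library.

```php
<?php
// Hypothetical job envelope: the tenant id travels with the job, so a worker
// never has to guess which tenant it is acting for.
final class TenantJob
{
    public function __construct(
        public readonly int $tenantId,
        public readonly string $name,
        public readonly array $payload = [],
    ) {}

    public function toJson(): string
    {
        return json_encode([
            'tenant_id' => $this->tenantId,
            'name' => $this->name,
            'payload' => $this->payload,
        ], JSON_THROW_ON_ERROR);
    }

    public static function fromJson(string $json): self
    {
        $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

        // Fail loudly rather than running the job without a tenant scope.
        if (!is_array($data) || !isset($data['tenant_id']) || !is_int($data['tenant_id'])) {
            throw new InvalidArgumentException('Job has no tenant_id; refusing to run unscoped.');
        }

        return new self(
            $data['tenant_id'],
            (string) ($data['name'] ?? ''),
            (array) ($data['payload'] ?? []),
        );
    }
}
```

A worker would decode the job, rebuild the tenant context from tenantId, and only then touch repositories or caches.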
Avoid When
- A single customer needs full physical isolation for compliance - a dedicated single-tenant deployment is simpler and safer.
- Heavy per-tenant customisation of schema or behaviour would require constant branching of shared code.
- The team cannot guarantee tenant scoping is enforced by default across queries, caches, and jobs.
When To Use
- Building SaaS that must serve many customers cost-effectively from one codebase.
- Tenants share the same feature set with data isolation as the main requirement.
- You can enforce tenant context centrally via middleware, global scopes, or row-level security.
Code Examples
<?php
// Tenant filter is opt-in - easy to forget, leaks across tenants
class InvoiceRepository
{
    public function __construct(private PDO $db) {}

    public function find(int $id): ?array
    {
        // No tenant scoping at all - any tenant can read any invoice
        $stmt = $this->db->prepare('SELECT * FROM invoices WHERE id = ?');
        $stmt->execute([$id]);

        return $stmt->fetch(PDO::FETCH_ASSOC) ?: null;
    }
}
<?php
// Tenant context resolved once, enforced on every query
final class TenantContext
{
    public function __construct(public readonly int $tenantId) {}
}

class InvoiceRepository
{
    public function __construct(
        private PDO $db,
        private TenantContext $tenant,
    ) {}

    public function find(int $id): ?array
    {
        $stmt = $this->db->prepare(
            'SELECT * FROM invoices WHERE id = ? AND tenant_id = ?'
        );
        $stmt->execute([$id, $this->tenant->tenantId]);

        return $stmt->fetch(PDO::FETCH_ASSOC) ?: null;
    }
}
// Even stronger: PostgreSQL RLS, so the database itself hides other tenants'
// rows from reads and rejects writes that violate the policy.