Auto Scaling
debt(d7/e7/b7/t7)
Closest to 'only careful code review or runtime testing' (d7). The detection_hints show automated=no and the tools listed (kubernetes, aws-asg, cloudwatch) are infrastructure monitoring tools rather than code analyzers. Misconfigurations like scaling on CPU alone, missing scale-in protection, or instance-local sessions only surface during production traffic spikes or when scale-in events terminate jobs mid-processing. No static analysis can catch these architectural decisions.
Closest to 'cross-cutting refactor across the codebase' (e7). The quick_fix mentions configuring HPA with custom PHP-FPM metrics and stabilisation windows, but the common_mistakes reveal deeper fixes needed: moving sessions to Redis requires application-level changes, adding warm-up periods requires deployment pipeline changes, implementing scale-in protection requires job handling refactors, and adding custom metrics requires instrumentation across workers. These span infrastructure, application code, and deployment configurations.
Closest to 'strong gravitational pull' (b7). Auto scaling applies to web and queue-worker contexts per applies_to, meaning it shapes how you design session handling, job processing, deployment pipelines, and metrics collection. Every future architectural decision must account for instances being ephemeral and traffic-dependent. The common_mistakes show how this choice forces specific patterns: external session storage, graceful shutdown handling, and opcache warming strategies.
Closest to 'serious trap' (t7). The misconception explicitly states 'More instances always solves performance problems' when the bottleneck may be the database. This contradicts the intuitive mental model that horizontal scaling = more capacity. Developers familiar with scaling concepts from other contexts may assume adding instances uniformly improves throughput, when in reality it can increase DB connection pressure and worsen the actual bottleneck. The common_mistakes reinforce multiple non-obvious gotchas.
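The connection-pressure point can be made concrete with rough arithmetic. The sketch below is illustrative only: the function names, the one-connection-per-worker assumption, and every number are hypothetical, not taken from a real deployment.

```python
def db_connections_needed(instances, workers_per_instance, conns_per_worker=1):
    """Worst-case open DB connections if every PHP-FPM worker holds one (assumed)."""
    return instances * workers_per_instance * conns_per_worker


def max_safe_instances(db_max_connections, workers_per_instance,
                       conns_per_worker=1, reserved=10):
    """Largest instance count that stays under the DB connection limit.

    `reserved` is a hypothetical headroom for admin and monitoring connections.
    """
    usable = db_max_connections - reserved
    per_instance = workers_per_instance * conns_per_worker
    return max(usable // per_instance, 0)


# Example: 8 instances x 50 workers can demand 400 connections,
# while a database capped at 300 connections only supports 5 such instances.
needed = db_connections_needed(8, 50)
ceiling = max_safe_instances(300, 50)
```

Under these assumptions, setting the group's maximum size above `ceiling` would let a scale-out event exhaust database connections rather than add throughput.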
Also Known As
TL;DR
Explanation
Horizontal scaling adds or removes instances; vertical scaling changes instance size. Auto scaling groups (AWS Auto Scaling groups, GCE managed instance groups) scale horizontally based on monitoring metrics such as CPU, request rate, or queue depth; on AWS these come from CloudWatch. Scale-out policies add capacity when a metric exceeds its threshold; scale-in policies remove capacity when load drops. Cooldown periods prevent oscillation between the two. For PHP-FPM, scaling out means adding more servers: as long as session state lives outside the instance, each request is stateless and any server can handle it.
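A minimal sketch of the threshold-plus-cooldown behaviour described above, assuming a single averaged metric and a step size of one instance. The `ScalingPolicy` class, the `decide` function, and all default values are hypothetical and do not mirror any cloud provider's API.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical threshold policy; every number here is illustrative."""
    scale_out_above: float = 70.0   # e.g. average CPU %
    scale_in_below: float = 20.0
    cooldown_seconds: int = 300
    min_instances: int = 2
    max_instances: int = 10


def decide(policy, metric, instances, now, last_change):
    """Return the instance count to run after one evaluation tick."""
    # Inside the cooldown window: hold steady to avoid oscillation.
    if now - last_change < policy.cooldown_seconds:
        return instances
    if metric > policy.scale_out_above:
        return min(instances + 1, policy.max_instances)
    if metric < policy.scale_in_below:
        return max(instances - 1, policy.min_instances)
    return instances
```

Calling `decide(ScalingPolicy(), 85.0, 3, now=1000, last_change=0)` returns 4, while the same call with `now=100` returns 3 because the cooldown has not yet elapsed.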
Diagram
flowchart TD
MON[CloudWatch<br/>Metrics] -->|CPU > 70%| SCALE_OUT[Scale Out<br/>Add instances]
MON -->|CPU < 20%| SCALE_IN[Scale In<br/>Remove instances]
subgraph Auto Scaling Group
S1[Instance 1]
S2[Instance 2]
S3[Instance 3 - new]
end
LB[Load Balancer] --> S1 & S2
SCALE_OUT --> S3
LB --> S3
SESS[(Redis Sessions<br/>Shared)] -.->|stateless app| S1 & S2 & S3
style S3 fill:#238636,color:#fff
style SESS fill:#1f6feb,color:#fff
Common Misconception
Why It Matters
Common Mistakes
- Scaling on CPU alone — request rate or queue depth is often a better signal for PHP-FPM workloads.
- No scale-in protection for instances processing long-running jobs — scale-in terminates them mid-job.
- Sessions stored on individual instances — scale-in removes the instance and all its sessions; use Redis for session storage.
- No warm-up period — new instances need time to compile opcache before handling production traffic.
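To illustrate scaling on a workload signal instead of CPU, the sketch below applies the proportional rule documented for the Kubernetes Horizontal Pod Autoscaler: desired = ceil(current × currentMetric / targetMetric). The function name, the bounds, and the choice of metric (for example, active PHP-FPM processes or queued jobs per instance) are assumptions made for illustration.

```python
import math


def desired_instances(current, metric_per_instance, target_per_instance,
                      min_instances=1, max_instances=20):
    """Proportional scaling: size the group so the per-instance metric nears the target."""
    if current < 1 or target_per_instance <= 0:
        raise ValueError("current must be >= 1 and target must be positive")
    raw = math.ceil(current * metric_per_instance / target_per_instance)
    # Clamp to the group's configured bounds (hypothetical defaults).
    return max(min_instances, min(raw, max_instances))
```

With 4 instances each averaging 150 queued jobs against a target of 100, `desired_instances(4, 150, 100)` returns 6. A real autoscaler would also apply a stabilisation window before acting on the result.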
Code Examples
# Sticky sessions on application servers — blocks scale-in:
# nginx upstream:
upstream app {
    ip_hash;          # Sticky by client IP: the same client always hits the same server
    server app1:80;
    server app2:80;
}
# Scale-in removes app1: every session stored on app1 is lost

# Stateless app servers + Redis sessions = scale freely:
; php.ini / app config (the redis save handler requires the phpredis extension):
session.save_handler = redis
session.save_path = "tcp://redis.internal:6379"
# nginx — no sticky sessions needed:
upstream app {
    least_conn;       # Route to the server with the fewest active connections
    server app1:80;
    server app2:80;
}