Canary Deployment & Observability
debt(d7/e5/b5/t5)
Closest to 'only careful code review or runtime testing' (d7). Tools listed are flagger and argo-rollouts, marked automated: no, meaning misconfiguration (missing version labels, too-short canary period, manual-only rollback) won't be caught by automated scanning — it surfaces only when a bad deploy isn't caught in production or through deliberate review of canary configuration and metrics.
Closest to 'touches multiple files / significant refactor in one component' (e5). The quick_fix requires tagging all metrics with version labels (touching instrumentation across services), configuring canary vs stable comparison logic, setting auto-rollback thresholds, and tuning canary duration — spanning deployment config, metrics pipelines, and alerting rules across multiple concerns.
Closest to 'persistent productivity tax' (b5). Canary deployment with observability applies broadly to web deployments and requires ongoing discipline: every new service or metric must carry version labels, canary windows must be respected, and rollback logic must be maintained. This is a persistent operational tax on all deployments, though it doesn't define the entire system shape.
Closest to 'notable trap' (t5). The misconception is explicit: developers commonly treat canary as mere traffic splitting (like blue-green) rather than active metric comparison with automated rollback. This is a well-documented gotcha — the 'obvious' interpretation misses the core value and leads to common mistakes like no version labels and manual-only rollback.
TL;DR
Explanation
Canary: deploy new version to 1-5% of traffic. Monitor: error rate, latency p99, business metrics (conversion, payment success). Automated rollback if canary metrics degrade vs baseline. Implementation: load balancer weights (nginx upstream, AWS ALB weighted target groups), Kubernetes (Argo Rollouts, Flagger), feature flags. Key metrics to compare: error rate, latency (all percentiles), business KPIs. Duration: 10-30 minutes for fast feedback, hours for low-traffic metrics. Observability requirement: metrics must be tagged with version to split canary vs stable. Autocanary: Flagger automates comparison and rollback.
Common Misconception
Why It Matters
Common Mistakes
- No version labels on metrics — can't compare canary vs stable.
- Canary period too short — not enough traffic for statistical significance.
- Manual rollback only — automate based on error rate threshold.
Code Examples
# Deploy to 5% but no metric comparison:
upstream backend {
server stable weight=95;
server canary weight=5;
}
# No monitoring — how do you know if canary is healthy?
# Flagger automated canary:
analysis:
interval: 1m
threshold: 5
metrics:
- name: request-success-rate
thresholdRange:
min: 99
- name: request-duration
thresholdRange:
max: 500
# Auto-rollback if success rate < 99% or p99 > 500ms