Four Golden Signals

observability Beginner

TL;DR

Google SRE's Four Golden Signals — Latency, Traffic, Errors, Saturation — are the four metrics that, if monitored and alerted on, cover most production reliability concerns.

Explanation

(1) Latency: the time taken to serve a request. Track successful and failed requests separately: a fast error can make overall latency look healthier than it is, and a slow error is worse than a fast one.
(2) Traffic: the demand placed on the system, e.g. requests/sec, concurrent users, messages/sec.
(3) Errors: the rate of failed requests, e.g. 5xx responses, uncaught exceptions, failed jobs.
(4) Saturation: how full the system is, e.g. CPU%, memory%, queue depth, disk I/O.

Two related frameworks: USE (Utilisation, Saturation, Errors) for resources and RED (Rate, Errors, Duration) for services. Start with these four signals before adding more metrics; if any one of them is trending badly, something is wrong.
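
As a rough illustration, the four signals can be derived from one window of request records. This is a minimal sketch, not a reference implementation: the `Request` record, the `golden_signals` function, and the queue depth/capacity inputs are hypothetical names chosen for the example, and saturation is approximated here by how full a work queue is.

```python
from dataclasses import dataclass


@dataclass
class Request:
    duration_s: float  # time taken to serve the request, in seconds
    status: int        # HTTP status code returned


def percentile(values, pct):
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    ordered = sorted(values)
    rank = (pct * len(ordered) + 99) // 100  # integer form of ceil(pct * n / 100)
    return ordered[rank - 1]


def golden_signals(requests, window_s, queue_depth, queue_capacity):
    """Summarise one observation window as the four golden signals."""
    ok = [r.duration_s for r in requests if r.status < 500]
    failed = [r.duration_s for r in requests if r.status >= 500]
    return {
        # Latency: reported separately for successful and failed requests.
        "p99_ok_s": percentile(ok, 99) if ok else None,
        "p99_failed_s": percentile(failed, 99) if failed else None,
        # Traffic: demand, as requests per second.
        "traffic_rps": len(requests) / window_s,
        # Errors: share of requests that failed with a 5xx status.
        "error_rate": len(failed) / len(requests) if requests else 0.0,
        # Saturation: how full the (hypothetical) work queue is.
        "saturation": queue_depth / queue_capacity,
    }


# Hypothetical 10-second window: 99 successful requests and one fast 5xx.
window = (
    [Request(0.1, 200)] * 95
    + [Request(2.0, 200)] * 4
    + [Request(0.01, 500)]
)
signals = golden_signals(window, window_s=10, queue_depth=40, queue_capacity=50)
print(signals)
```

Keeping the failed request out of the success latency matters here: the single 5xx returns in 10 ms, so mixing it in would only flatter the latency figures.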

Common Misconception

"More metrics are always better." In practice, start with the four golden signals: adding metrics that nothing alerts on just creates dashboard noise.

Why It Matters

The four golden signals give a broad picture of system health from the user's perspective: if all four are green, the service is likely working correctly.

Common Mistakes

  • Only monitoring uptime — not latency, errors, or saturation.
  • Monitoring p50 latency but not p99 — p99 reveals the slow tail that users experience.
  • No saturation monitoring — running out of CPU/memory/connections causes gradual degradation.
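
The p50 versus p99 point can be seen with a small made-up sample. The sketch below assumes a hypothetical set of 100 request latencies in which 5% of requests are slow, and uses a simple nearest-rank percentile; real monitoring systems usually estimate percentiles from histograms instead.

```python
from statistics import mean, median

# Hypothetical sample: 95 fast requests and 5 slow ones, in seconds.
latencies = [0.1] * 95 + [3.0] * 5


def p99(values):
    """Nearest-rank 99th percentile."""
    ordered = sorted(values)
    rank = (99 * len(ordered) + 99) // 100  # integer form of ceil(0.99 * n)
    return ordered[rank - 1]


summary = {
    "mean_s": round(mean(latencies), 3),
    "p50_s": median(latencies),
    "p99_s": p99(latencies),
}
print(summary)
```

With this sample the median stays at 0.1 s and the mean is roughly 0.245 s, while the p99 is 3.0 s. An alert on the median or the average would stay quiet even though one request in twenty takes three seconds.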

Code Examples

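✗ Vulnerable
# Only uptime monitoring:
- alert: ServiceDown
  expr: up == 0
# Misses: slow responses, error rates, resource exhaustion
✓ Fixed
# Metric names and thresholds are illustrative; adjust them to your own instrumentation.
# Latency: p99 above 500 ms
- alert: HighLatency
  expr: histogram_quantile(0.99, sum by (le) (rate(http_duration_seconds_bucket[5m]))) > 0.5
# Traffic: request rate below half of the same time yesterday (assumed threshold)
- alert: TrafficDrop
  expr: sum(rate(http_requests_total[5m])) < 0.5 * sum(rate(http_requests_total[5m] offset 1d))
# Errors: more than 1% of requests return 5xx
# (sum both sides so the ratio is not computed per status label)
- alert: HighErrorRate
  expr: sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.01
# Saturation: more than 90% of host memory in use (node_exporter metrics)
- alert: HighMemory
  expr: 1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes > 0.9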

Added 23 Mar 2026
DEV INTEL Tools & Severity
🟠 High ⚙ Fix effort: Medium
⚡ Quick Fix
Add alerts for all four signals: latency p99 > threshold, error rate > 1%, traffic anomaly, CPU/memory/queue saturation. Use p99, not average.
📦 Applies To
web cli queue-worker
🔍 Detection Hints
Auto-detectable: ✗ No prometheus grafana datadog
🤖 AI Agent
Confidence: Low False Positives: High ✗ Manual fix Fix: Medium Context: File
