Observability terms
You cannot fix what you cannot see
Running software in production without observability is flying blind. Metrics, logs, and traces (the three pillars of observability), together with dashboards and alerting, give you the insight needed to understand what your system is actually doing under real conditions. This category covers the tools, patterns, and vocabulary of production-grade visibility.
Correlation ID Pattern
A correlation ID is a unique identifier attached to every request and propagated through all logs, services, and queues, so that a plain string search for that one ID retrieves the request's full path end to end.
2mo ago
observability beginner
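The correlation ID pattern described above can be sketched with the Python standard library alone. This is a minimal illustration, not a reference implementation: the header name `X-Correlation-ID` is a common convention assumed here, and `correlation_id`, `CorrelationFilter`, `build_logger`, and `handle_request` are hypothetical names.

```python
import logging
import uuid
from contextvars import ContextVar

# Holds the ID for the request currently being handled (hypothetical name).
correlation_id = ContextVar("correlation_id", default="-")


class CorrelationFilter(logging.Filter):
    """Copy the current correlation ID onto every log record."""

    def filter(self, record):
        record.correlation_id = correlation_id.get()
        return True


def build_logger(name="app"):
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter("%(levelname)s cid=%(correlation_id)s %(message)s")
    )
    handler.addFilter(CorrelationFilter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


def handle_request(headers, logger):
    # Reuse the caller's ID if one was sent; otherwise mint a new one.
    # "X-Correlation-ID" is an assumed header name.
    cid = headers.get("X-Correlation-ID") or str(uuid.uuid4())
    token = correlation_id.set(cid)
    try:
        logger.info("request received")
        # Headers to attach to downstream calls so the same ID propagates.
        return {"X-Correlation-ID": cid}
    finally:
        correlation_id.reset(token)
```

Calling `handle_request({"X-Correlation-ID": "abc-123"}, build_logger())` logs a line containing `cid=abc-123`, so searching the logs for `abc-123` finds every line written while that request was being handled.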
Four Golden Signals
Google SRE's Four Golden Signals — Latency, Traffic, Errors, Saturation — are the four metrics that, if monitored and alerted on, cover most production reliability concerns.
2mo ago
observability beginner
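As a rough illustration of the four signals above, the sketch below derives them from a window of request records. Several choices are assumptions made for the example: errors are counted as HTTP status 500 and above, latency is summarised by the median, and saturation is approximated as in-flight requests divided by a hypothetical capacity figure.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Request:
    duration_ms: float
    status: int


def golden_signals(requests, window_s, in_flight, capacity):
    """Summarise one time window of traffic as the four signals."""
    n = len(requests)
    errors = sum(1 for r in requests if r.status >= 500)  # assumed error rule
    return {
        "latency_ms": median(r.duration_ms for r in requests) if n else 0.0,
        "traffic_rps": n / window_s,
        "error_rate": errors / n if n else 0.0,
        "saturation": in_flight / capacity,  # assumed proxy for "how full"
    }


window = [Request(10, 200), Request(20, 200), Request(30, 500), Request(40, 200)]
signals = golden_signals(window, window_s=2, in_flight=5, capacity=10)
# signals == {"latency_ms": 25.0, "traffic_rps": 2.0,
#             "error_rate": 0.25, "saturation": 0.5}
```

In practice these values usually come from a metrics system rather than raw request lists; the point here is only what each signal measures.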
Grafana & Dashboards
Grafana is the de facto open-source dashboard platform — connecting to Prometheus, Loki, Elasticsearch, and 50+ data sources to visualise metrics, logs, and traces in a unified UI.
2mo ago
observability beginner
Health Check Patterns
Health checks report service status to load balancers and orchestrators — /health/live (is the process running?), /health/ready (can it serve traffic?), and deep health checks for dependencies.
2mo ago
observability beginner
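The liveness, readiness, and deep checks described above might be wired up as follows. This is a sketch under stated assumptions: `/health/deep` is a hypothetical path for the dependency check, `check_database` is a placeholder probe, and the JSON response shape is invented for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler


def check_database():
    """Hypothetical dependency probe; replace with a real ping."""
    return True


def health_response(path, ready=True, deps=None):
    """Map a health path to an (HTTP status, JSON body) pair."""
    deps = {"database": check_database} if deps is None else deps
    if path == "/health/live":
        # Answering at all proves the process is running.
        return 200, {"status": "alive"}
    if path == "/health/ready":
        if ready:
            return 200, {"status": "ready"}
        return 503, {"status": "not ready"}
    if path == "/health/deep":  # assumed path name
        results = {name: bool(probe()) for name, probe in deps.items()}
        healthy = all(results.values())
        status = "ok" if healthy else "degraded"
        return (200 if healthy else 503), {"status": status, "dependencies": results}
    return 404, {"status": "unknown path"}


class HealthHandler(BaseHTTPRequestHandler):
    """Serves the responses above over HTTP."""

    def do_GET(self):
        code, body = health_response(self.path)
        payload = json.dumps(body).encode()
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)
```

To try it locally, pass `HealthHandler` to `http.server.HTTPServer(("127.0.0.1", 8080), HealthHandler)` and call `serve_forever()`. Keeping the liveness check free of dependency probes means a database outage fails the readiness or deep check without prompting the orchestrator to restart an otherwise healthy process.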
Log Levels & When to Use Each
Log levels (DEBUG, INFO, WARNING, ERROR, CRITICAL) communicate severity — use the right level so alerts fire on real issues and noise doesn't mask real problems.
2mo ago
observability beginner
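One way to read the levels above, sketched with Python's standard `logging` module. The service name `checkout` and the functions `process_order` and `report_fatal` are hypothetical, and where to draw the line between levels is a team convention rather than a fixed rule.

```python
import logging

log = logging.getLogger("checkout")  # hypothetical service name
log.setLevel(logging.INFO)           # typical production setting: DEBUG is dropped
log.addHandler(logging.StreamHandler())


def process_order(order_id, stock, payment_ok):
    log.debug("order %s payload validated", order_id)       # developer detail
    log.info("order %s received", order_id)                 # normal business event
    if stock < 5:
        log.warning("stock low (%d left)", stock)           # unusual, nothing failed yet
    if not payment_ok:
        log.error("payment failed for order %s", order_id)  # this operation failed
        return False
    return True


def report_fatal(reason):
    # CRITICAL: the service as a whole cannot continue; this should page someone.
    log.critical("shutting down: %s", reason)
```

With the logger set to `INFO`, the `debug` call produces no output, so alert rules keyed on `ERROR` and `CRITICAL` see only genuine failures.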
On-Call & Runbooks
A runbook documents how to diagnose and resolve specific alerts — on-call engineers shouldn't have to think from scratch at 3am; the runbook provides the playbook.
2mo ago
observability beginner
P50/P95/P99 Latency Percentiles
Latency percentiles (P50, P95, P99) describe the spread of response times rather than a single average: P50 is the typical request, while P99 means '99% of requests are at least this fast', exposing the slow tail that averages hide.
2mo ago
observability beginner
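A small worked example of the percentiles above, using the nearest-rank definition (one of several definitions in use; monitoring tools often interpolate or estimate from histograms instead). The function name `percentile` and the sample data are made up for illustration.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of
    all samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# 100 made-up request latencies in milliseconds:
# 90 fast requests, 8 slow ones, 2 very slow ones.
latencies_ms = [20] * 90 + [200] * 8 + [2000] * 2

mean = sum(latencies_ms) / len(latencies_ms)  # 74.0
p50 = percentile(latencies_ms, 50)            # 20
p95 = percentile(latencies_ms, 95)            # 200
p99 = percentile(latencies_ms, 99)            # 2000
```

The mean of 74 ms matches no request that actually occurred, while P50 shows the typical request took 20 ms and P99 shows the slowest requests waited two full seconds.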
Spans & Traces
A trace is one request's full journey; spans are the individual operations within it — each span has a name, start time, duration, status, and optional attributes.
2mo ago
observability beginner
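The relationship between a trace and its spans can be sketched in a few lines of standard-library Python. This is a toy model for illustration only, not the API of OpenTelemetry or any other tracing library; `Span`, `Trace`, and the span names are hypothetical, and real tracers also record span and parent IDs, which are omitted here.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class Span:
    trace_id: str
    name: str
    start_time: float
    duration: float = 0.0
    status: str = "OK"
    attributes: dict = field(default_factory=dict)


class Trace:
    """One request's journey: a shared trace ID plus the spans recorded under it."""

    def __init__(self):
        self.trace_id = uuid.uuid4().hex
        self.spans = []

    @contextmanager
    def span(self, name, **attributes):
        s = Span(self.trace_id, name, time.time(), attributes=attributes)
        started = time.perf_counter()
        try:
            yield s
        except Exception:
            s.status = "ERROR"
            raise
        finally:
            # Spans are stored in the order they finish.
            s.duration = time.perf_counter() - started
            self.spans.append(s)


trace = Trace()
with trace.span("GET /orders", user_id="42"):
    with trace.span("db.query", table="orders"):
        pass  # the actual work would happen here
```

After this runs, `trace.spans` holds two spans sharing one `trace_id`: the inner `db.query` span finishes first, followed by the enclosing `GET /orders` span.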
Automatically capturing, grouping, and alerting on application errors in production — with full stack traces, breadcrumbs, and user context.
2mo ago
observability beginner