Observability (Logs, Metrics, Traces)
Also Known As
system observability
logs metrics traces
three pillars observability
TL;DR
The ability to understand a system's internal state from its external outputs — built on three pillars: logs, metrics, and distributed traces.
Explanation
Observability, as opposed to monitoring, is the degree to which a system's internal state can be inferred from its outputs. The three pillars are logs (timestamped event records; structured JSON logs are queryable by field), metrics (numeric time-series data such as request rate, error rate, latency, and resource usage), and traces (end-to-end request journeys across services, correlated by trace ID). Common tools: Prometheus + Grafana for metrics, ELK or Loki for logs, and Jaeger or Zipkin for trace storage and visualisation, with OpenTelemetry as the vendor-neutral instrumentation layer that feeds them. PHP applications can emit structured logs via Monolog, expose metrics on a /metrics endpoint for Prometheus to scrape, and propagate trace context via the OpenTelemetry SDK.
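To make "correlated by trace ID" concrete, here is a minimal sketch of trace-context propagation using the W3C `traceparent` header format (version 00). The function names `parseTraceparent` and `childTraceparent` are hypothetical helpers written for this illustration; in a real application the OpenTelemetry SDK would normally handle this.

```php
<?php
declare(strict_types=1);

/**
 * Parse a W3C `traceparent` header (version 00) into its parts.
 * Returns null when the header is malformed, so the caller can start a new trace.
 * Hypothetical helper for illustration only.
 */
function parseTraceparent(string $header): ?array
{
    $pattern = '/^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/';
    if (preg_match($pattern, trim($header), $m) !== 1) {
        return null;
    }
    // All-zero trace or parent IDs are invalid in the Trace Context format.
    if ($m[1] === str_repeat('0', 32) || $m[2] === str_repeat('0', 16)) {
        return null;
    }
    return ['trace_id' => $m[1], 'parent_id' => $m[2], 'flags' => $m[3]];
}

/**
 * Build the header to send to a downstream service:
 * keep the incoming trace ID (or create one) and generate a fresh span ID.
 */
function childTraceparent(?array $incoming): string
{
    $traceId = $incoming['trace_id'] ?? bin2hex(random_bytes(16));
    $spanId  = bin2hex(random_bytes(8));
    $flags   = $incoming['flags'] ?? '01'; // assumption: sample new traces by default
    return sprintf('00-%s-%s-%s', $traceId, $spanId, $flags);
}

// Example header value as it might arrive from an upstream service.
$incoming = parseTraceparent('00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01');
$outgoing = childTraceparent($incoming);
```

Because every service forwards the same trace ID while minting its own span ID, a tracing backend can stitch the individual spans into one end-to-end request journey.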
Common Misconception
✗ Observability is just a modern word for monitoring. Monitoring tracks known failure modes with predefined dashboards and alerts. Observability is the ability to understand arbitrary system states from outputs — logs, metrics, and traces — enabling diagnosis of novel failures that were never anticipated.
Why It Matters
Observability — metrics, logs, and traces — lets you understand system behaviour from the outside. A system that cannot be observed cannot be debugged or improved reliably.
Common Mistakes
- Logging everything at DEBUG level in production — log volume makes finding real issues impossible.
- Metrics without context — a spike in CPU is meaningless without correlated request rate and error rate.
- Structured logging not implemented — log parsing tools cannot extract fields from unstructured log lines.
- No correlation between metrics, logs, and traces — cannot connect a metric spike to its cause in logs.
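Several of the mistakes above can be avoided with a small amount of discipline in the logging layer. The sketch below assumes a hypothetical `JsonLineLogger` class (not a real library; production code would more likely use Monolog) that drops records below a minimum level, emits one JSON object per line, and stamps every record with a trace ID so logs can be joined to traces.

```php
<?php
declare(strict_types=1);

// Hypothetical minimal logger for illustration; not part of any library.
final class JsonLineLogger
{
    private const LEVELS = ['debug' => 100, 'info' => 200, 'warning' => 300, 'error' => 400];

    /** @var string[] Kept in memory for the example; normally written to stderr or a file. */
    public array $lines = [];

    private string $minLevel;
    private string $traceId;

    public function __construct(string $minLevel, string $traceId)
    {
        $this->minLevel = $minLevel;
        $this->traceId  = $traceId;
    }

    public function log(string $level, string $message, array $context = []): void
    {
        // Unknown level names are treated as "info" in this sketch.
        $rank      = self::LEVELS[$level] ?? self::LEVELS['info'];
        $threshold = self::LEVELS[$this->minLevel] ?? self::LEVELS['info'];
        if ($rank < $threshold) {
            return; // below the configured threshold: drop instead of flooding production logs
        }
        // Array union keeps the left-hand keys, so context cannot overwrite trace_id.
        $record = [
            'timestamp' => date('c'),
            'level'     => $level,
            'message'   => $message,
            'trace_id'  => $this->traceId,
        ] + $context;
        $this->lines[] = json_encode($record, JSON_UNESCAPED_SLASHES);
    }
}

$logger = new JsonLineLogger('info', '4bf92f3577b34da6a3ce929d0e0e4736');
$logger->log('debug', 'Cache lookup', ['key' => 'user:42']); // dropped: below "info"
$logger->log('error', 'Login failed', ['user_id' => 42]);    // kept, with trace_id attached
```

With the trace ID present on every line, a spike seen on a dashboard can be narrowed to the affected traces and then to the exact log records for those requests.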
Code Examples
✗ Vulnerable
// Unstructured log: fields are buried in free text and cannot be parsed or searched reliably:
error_log('User 42 failed to login from 192.168.1.1 at ' . date('Y-m-d H:i:s'));
// Same event as a structured JSON log: each field is searchable and filterable:
error_log(json_encode([
    'event' => 'login_failed', 'user_id' => 42,
    'ip' => '192.168.1.1', 'timestamp' => date('c')
]));
✓ Fixed
// Structured logging with context
$this->logger->info('Order placed', [
    'order_id'    => $order->id,
    'user_id'     => $order->userId,
    'total_cents' => $order->total,
    'duration_ms' => $elapsed,
]);
// Metric increment (StatsD-style client; Prometheus can ingest these through a StatsD exporter)
$this->metrics->increment('orders.placed', ['status' => 'success']);
$this->metrics->histogram('orders.checkout_duration_ms', $elapsed);
// Trace span (OpenTelemetry); the finally block ends the span even if the work throws
$span = $tracer->spanBuilder('checkout')->startSpan();
try {
    /* work */
} finally {
    $span->end();
}
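For the metrics pillar, it can help to see what a /metrics endpoint actually returns. The sketch below renders counters in the Prometheus text exposition format. `renderCounters` and the metric name `orders_placed_total` are assumptions made for this illustration (Prometheus names conventionally use underscores rather than the dotted StatsD-style names above); a real application would more likely rely on a Prometheus client library.

```php
<?php
declare(strict_types=1);

/**
 * Render counters in the Prometheus text exposition format.
 * Hypothetical helper for illustration only.
 *
 * @param array $counters metric name => ['help' => string, 'samples' => [[labels, value], ...]]
 */
function renderCounters(array $counters): string
{
    $out = '';
    foreach ($counters as $name => $metric) {
        $out .= "# HELP {$name} {$metric['help']}\n";
        $out .= "# TYPE {$name} counter\n";
        foreach ($metric['samples'] as [$labels, $value]) {
            $pairs = [];
            foreach ($labels as $key => $val) {
                // Label values must escape backslash, double quote, and newline.
                $escaped = strtr((string) $val, ['\\' => '\\\\', '"' => '\\"', "\n" => '\\n']);
                $pairs[] = sprintf('%s="%s"', $key, $escaped);
            }
            $labelPart = $pairs ? '{' . implode(',', $pairs) . '}' : '';
            $out .= "{$name}{$labelPart} {$value}\n";
        }
    }
    return $out;
}

$body = renderCounters([
    'orders_placed_total' => [
        'help'    => 'Orders placed, by outcome.',
        'samples' => [
            [['status' => 'success'], 3],
            [['status' => 'failed'], 1],
        ],
    ],
]);

// In a web context this body would be served from /metrics as text/plain.
echo $body;
```

Prometheus scrapes this text on an interval and stores each labelled sample as a time series, which is what makes rate and error-ratio queries possible later.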
Added
15 Mar 2026
Edited
22 Mar 2026
⚡
DEV INTEL
Tools & Severity
🟡 Medium
⚙ Fix effort: High
⚡ Quick Fix
Instrument all three pillars: structured logs (Monolog JSON), metrics (Prometheus /metrics endpoint), and traces (OpenTelemetry auto-instrumentation). Each answers a different question, so none of them substitutes for the others.
📦 Applies To
PHP 5.0+
web
cli
queue-worker
🔗 Prerequisites
🔍 Detection Hints
No metrics endpoint; no distributed tracing; logs are unstructured plain text with no correlation IDs
Auto-detectable:
✗ No
opentelemetry
prometheus
datadog
grafana
⚠ Related Problems
🤖 AI Agent
Confidence: Low
False Positives: High
✗ Manual fix
Fix: High
Context: File