
Grafana & Dashboards

Observability Beginner

debt(d7/e3/b5/t5)

d7 Detectability Operational debt — how invisible misuse is to your safety net

Closest to 'only careful code review or runtime testing' (d7). The detection hints list only 'grafana' itself, with automated: no. Dashboard anti-patterns (sprawl, too many panels, missing variables, misuse as the alerting source) are not caught by any automated tool; they surface only through deliberate human review of the dashboard configuration or through operators noticing degraded usability.

e3 Effort Remediation debt — work required to fix once spotted

Closest to 'simple parameterised fix' (e3). The quick_fix describes creating a golden signals dashboard, adding variables, importing community dashboards, and trimming panels — this is a small but multi-step rework within one component (the dashboard layer), not a single-line patch, but also not cross-cutting. Reorganising dashboards and adding filter variables is contained to Grafana configuration.

b5 Burden Structural debt — long-term weight of choosing wrong

Closest to 'persistent productivity tax' (b5). Dashboard sprawl and poor structure slow many work streams — developers, ops, and on-call engineers all rely on dashboards for incident response and routine monitoring. Bad dashboard design taxes every future maintainer who tries to diagnose issues, but the burden is localised to the observability layer and doesn't reshape the whole codebase architecture.

t5 Trap Cognitive debt — how counter-intuitive correct behaviour is

Closest to 'notable trap' (t5). The canonical misconception is explicit: developers assume Grafana stores data, when it only visualises from external data sources (Prometheus, Loki). This is a documented and commonly encountered gotcha — users learn it when they try to rely on Grafana for retention, alerting truth, or data portability — but it doesn't actively cause silent production damage, keeping it at t5 rather than higher.

About DEBT scoring → scored by claude-sonnet-4-6 · 2026-05-08 · reviewed by human

TL;DR

Grafana is the de facto open-source dashboard platform — connecting to Prometheus, Loki, Elasticsearch, and 50+ data sources to visualise metrics, logs, and traces in a unified UI.

Explanation

Grafana: visualisation layer for any observability data.
Panel types: time series, stat, gauge, histogram, table, heatmap, logs.
Variables: dynamic dashboard filtering (environment, service, instance).
Alerting: Grafana-managed alerts with multiple notification channels.
Data sources: Prometheus, Loki, Elasticsearch, Tempo (traces), InfluxDB, CloudWatch, Datadog.
Grafana Cloud: managed Grafana + Prometheus + Loki + Tempo.
Best practices: one dashboard per service (golden signals), one overview dashboard per team, use variables for filtering.
Export/import dashboards as JSON (share via grafana.com).
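
The JSON export/import step can be sketched in Python. This is a minimal sketch, not Grafana's own tooling: it assumes the payload shape of Grafana's dashboard HTTP API (`POST /api/dashboards/db`, where a null `id` means "create"), and the helper name `prepare_for_import` and the sample `uid` values are hypothetical. Check the payload fields against the Grafana version in use.

```python
import copy
import json
from typing import Optional


def prepare_for_import(exported: dict, folder_uid: Optional[str] = None) -> dict:
    """Wrap an exported dashboard model in a create/update payload.

    Hypothetical helper: the payload keys ("dashboard", "folderUid",
    "overwrite") are assumed from Grafana's dashboard HTTP API.
    """
    dashboard = copy.deepcopy(exported)  # never mutate the caller's export
    dashboard["id"] = None               # numeric id is instance-specific
    payload = {"dashboard": dashboard, "overwrite": False}
    if folder_uid is not None:
        payload["folderUid"] = folder_uid
    return payload


if __name__ == "__main__":
    exported = {"id": 42, "uid": "svc-overview", "title": "Service overview", "panels": []}
    print(json.dumps(prepare_for_import(exported, folder_uid="team-a"), indent=2))
```

The `uid` is kept so that links to the dashboard stay stable across instances, while the numeric `id` is cleared because it only has meaning on the instance that exported it.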

Common Misconception

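✗ "Grafana stores data." It does not; it only visualises. Grafana queries data sources (Prometheus, Loki) at view time and stores no metrics or logs itself, so retention and data portability are properties of the data source, not of Grafana.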
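
Because the samples live in the data source, the same series can be requested from Prometheus directly, with no Grafana involved. The sketch below only builds the request URL for Prometheus's range-query endpoint (`/api/v1/query_range`); the host name is a placeholder and `query_range_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def query_range_url(base_url: str, promql: str, start: int, end: int, step: str = "30s") -> str:
    """Build a Prometheus range-query URL for a PromQL expression.

    start and end are Unix timestamps in seconds; step is the resolution.
    """
    params = urlencode({"query": promql, "start": start, "end": end, "step": step})
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


if __name__ == "__main__":
    # Placeholder host: replace with the address of a real Prometheus server.
    print(query_range_url("http://prometheus.example:9090", "up", 1700000000, 1700003600))
```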

Why It Matters

Grafana dashboards provide the operational UI that makes metrics and logs actionable — without visualisation, raw numbers in Prometheus are hard to interpret.

Common Mistakes

Too many panels per dashboard — aim for 10-15, not 50.
No variables for filtering — can't scope to one service or environment.
Dashboard sprawl — hundreds of dashboards, none maintained.
Using Grafana as the alerting source of truth — prefer Prometheus alerting rules.
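
Two of these mistakes (panel count and missing variables) can be flagged by a small script over an exported dashboard JSON. This is a rough sketch, not an existing tool: it assumes the dashboard JSON model keeps panels under `panels` (with collapsed rows nesting their own `panels`) and variables under `templating.list`; the threshold and function names are hypothetical. Sprawl and alerting ownership still need human review.

```python
MAX_PANELS = 15  # upper end of the 10-15 guideline above


def count_panels(panels: list) -> int:
    """Count visual panels, descending into rows that nest their panels."""
    total = 0
    for panel in panels:
        if panel.get("type") == "row":
            total += count_panels(panel.get("panels", []))
        else:
            total += 1
    return total


def lint_dashboard(dashboard: dict) -> list:
    """Return human-readable findings for one dashboard model."""
    findings = []
    n = count_panels(dashboard.get("panels", []))
    if n > MAX_PANELS:
        findings.append(f"{n} panels (guideline: at most {MAX_PANELS})")
    if not dashboard.get("templating", {}).get("list", []):
        findings.append("no template variables; cannot scope by service or environment")
    return findings


if __name__ == "__main__":
    sprawling = {"title": "Everything", "panels": [{"type": "timeseries"}] * 50}
    for finding in lint_dashboard(sprawling):
        print(f"{sprawling['title']}: {finding}")
```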

Code Examples

✗ Vulnerable

# 50-panel dashboard with no variables:
# One fixed dashboard for all environments, all services

✓ Fixed

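# Service dashboard with variables (schematic; Grafana keeps these
# under templating.list in the dashboard JSON):
variables:
  - name: service
    type: query
    query: label_values(up, job)
  - name: env
    type: query
    # chained to the first variable: only envs of the selected service
    query: label_values(up{job="$service"}, env)

# Four golden signals panels, each filtered by $service and $env:
# 1. Request rate (traffic)
# 2. Error rate
# 3. P99 latency
# 4. Saturation (CPU/memory)

# Import from grafana.com/dashboards, e.g. Laravel dashboard #12345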
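
The outline above can be turned into an importable dashboard model. The following Python sketch generates one under stated assumptions: field names follow the dashboard JSON model as I understand it (`templating.list`, `panels[].targets[].expr`), the metric names (`http_requests_total`, `http_request_duration_seconds_bucket`, `process_cpu_seconds_total`) and the `env` label are conventional placeholders that may not match a given service, and chaining the `env` variable to `$service` is a choice of this sketch.

```python
import json


def selector(extra: str = "") -> str:
    """Label matcher shared by every panel, driven by the two variables."""
    return '{job="$service", env="$env"' + extra + "}"


def query_variable(name: str, query: str) -> dict:
    return {"name": name, "type": "query", "query": query}


def panel(panel_id: int, title: str, expr: str) -> dict:
    return {"id": panel_id, "title": title, "type": "timeseries",
            "targets": [{"refId": "A", "expr": expr}]}


def golden_signals_dashboard() -> dict:
    # Metric names below are assumptions; substitute what the service exports.
    signals = [
        ("Request rate", "sum(rate(http_requests_total" + selector() + "[5m]))"),
        ("Error rate", "sum(rate(http_requests_total" + selector(', status=~"5.."') + "[5m]))"),
        ("P99 latency", "histogram_quantile(0.99, sum by (le) (rate("
                        "http_request_duration_seconds_bucket" + selector() + "[5m])))"),
        ("Saturation (CPU)", "sum(rate(process_cpu_seconds_total" + selector() + "[5m]))"),
    ]
    return {
        "id": None,
        "title": "Service golden signals",
        "templating": {"list": [
            query_variable("service", "label_values(up, job)"),
            query_variable("env", 'label_values(up{job="$service"}, env)'),
        ]},
        "panels": [panel(i + 1, title, expr) for i, (title, expr) in enumerate(signals)],
    }


if __name__ == "__main__":
    print(json.dumps(golden_signals_dashboard(), indent=2))
```

Generating the model from code keeps every panel on the same label selector, so adding a third variable later is a one-line change rather than an edit to each panel.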

References

https://grafana.com/docs/grafana/latest/