Grafana & Dashboards
debt(d7/e3/b5/t5)
Closest to 'only careful code review or runtime testing' (d7). Detection_hints list only 'grafana' itself with automated: no. Dashboard anti-patterns (sprawl, too many panels, missing variables, misuse as alerting source) are not caught by any automated tool — they require deliberate human review of the dashboard configuration or operational experience noticing degraded usability.
Closest to 'simple parameterised fix' (e3). The quick_fix describes creating a golden signals dashboard, adding variables, importing community dashboards, and trimming panels — this is a small but multi-step rework within one component (the dashboard layer), not a single-line patch, but also not cross-cutting. Reorganising dashboards and adding filter variables is contained to Grafana configuration.
Closest to 'persistent productivity tax' (b5). Dashboard sprawl and poor structure slow many work streams — developers, ops, and on-call engineers all rely on dashboards for incident response and routine monitoring. Bad dashboard design taxes every future maintainer who tries to diagnose issues, but the burden is localised to the observability layer and doesn't reshape the whole codebase architecture.
Closest to 'notable trap' (t5). The canonical misconception is explicit: developers assume Grafana stores data, when it only visualises from external data sources (Prometheus, Loki). This is a documented and commonly encountered gotcha — users learn it when they try to rely on Grafana for retention, alerting truth, or data portability — but it doesn't actively cause silent production damage, keeping it at t5 rather than higher.
TL;DR
Explanation
Grafana is a visualisation layer for any observability data.
- Panel types: time series, stat, gauge, histogram, table, heatmap, logs.
- Variables: dynamic dashboard filtering (environment, service, instance).
- Alerting: Grafana-managed alerts with multiple notification channels.
- Data sources: Prometheus, Loki, Elasticsearch, Tempo (traces), InfluxDB, CloudWatch, Datadog.
- Grafana Cloud: managed Grafana + Prometheus + Loki + Tempo.
- Best practices: one dashboard per service (golden signals), one overview dashboard per team, variables for filtering.
- Portability: export and import dashboards as JSON (share via grafana.com).
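To make the variables and JSON export points concrete, here is a minimal sketch of a dashboard definition built in Python and serialised to JSON. It assumes the commonly documented dashboard JSON layout (`title`, `panels`, `templating.list`); the helper names `query_variable` and `service_dashboard` and the data source name `Prometheus` are hypothetical, and the exact fields accepted can differ between Grafana versions.

```python
import json


def query_variable(name: str, query: str) -> dict:
    """Build one query-type template variable (an entry in templating.list)."""
    return {
        "name": name,
        "type": "query",
        "datasource": "Prometheus",  # assumed data source name; newer versions may expect an object with a uid
        "query": query,
        "refresh": 1,  # assumed to mean "re-run the query on dashboard load"
    }


def service_dashboard(title: str) -> dict:
    """Return a dashboard skeleton with service and env filter variables."""
    return {
        "title": title,
        "tags": ["golden-signals"],
        "templating": {
            "list": [
                query_variable("service", "label_values(up, job)"),
                query_variable("env", "label_values(up, env)"),
            ]
        },
        "panels": [],  # panels are added separately
    }


dashboard = service_dashboard("Service overview")
exported = json.dumps(dashboard, indent=2)  # the text you would import or share
```

Keeping the definition in code or in version-controlled JSON means the same dashboard can be re-imported into another Grafana instance instead of being rebuilt by hand.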
Common Misconception
Why It Matters
Common Mistakes
- Too many panels per dashboard — aim for 10-15, not 50.
- No variables for filtering — can't scope to one service or environment.
- Dashboard sprawl — hundreds of dashboards, none maintained.
- Using Grafana as the alerting source of truth — prefer Prometheus alerting rules.
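Since no automated tool flags these mistakes, a small review script over exported dashboard JSON can help. The sketch below is a hypothetical linter, not a Grafana feature: `lint_dashboard` and `MAX_PANELS` are illustrative names, it assumes panels sit in a top-level `panels` list (panels nested in collapsed rows are not counted), and the alert check only recognises the legacy per-panel `alert` key, so it is partial at best.

```python
MAX_PANELS = 15  # upper end of the 10-15 guideline above


def lint_dashboard(dashboard: dict) -> list[str]:
    """Return human-readable warnings for one exported dashboard JSON."""
    warnings = []

    panels = dashboard.get("panels", [])
    if len(panels) > MAX_PANELS:
        warnings.append(f"{len(panels)} panels (aim for {MAX_PANELS} or fewer)")

    variables = dashboard.get("templating", {}).get("list", [])
    if not variables:
        warnings.append("no template variables; cannot scope by service or environment")

    # Assumption: legacy panel alerts are stored under an "alert" key on the panel.
    if any("alert" in panel for panel in panels):
        warnings.append("panel-level alert found; consider Prometheus alerting rules")

    return warnings


sprawling = {"panels": [{"id": i} for i in range(50)]}
for warning in lint_dashboard(sprawling):
    print(warning)
```

Dashboard sprawl itself cannot be detected from a single file; it would need a count across all exported dashboards, which this sketch leaves out.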
Code Examples
# Anti-pattern: 50-panel dashboard with no variables.
# One fixed dashboard for all environments, all services.

# Better: service dashboard with variables.
variables:
  - name: service
    query: label_values(up, job)
  - name: env
    query: label_values(up, env)

# Four golden signals panels:
#   1. Request rate
#   2. Error rate
#   3. P99 latency
#   4. Saturation (CPU/Memory)
# Import from grafana.com/dashboards — e.g. Laravel dashboard #12345
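The four golden signals panels listed above could be generated along these lines. This is a sketch under stated assumptions: the metric names (`http_requests_total`, `http_request_duration_seconds_bucket`, `process_cpu_seconds_total`) and the `status` label are hypothetical and depend on what your services export, and `golden_signal_panels` is an illustrative helper. The queries reference the `$service` and `$env` variables so one dashboard serves every service and environment.

```python
# Hypothetical metric and label names; substitute whatever your services expose.
SELECTOR = 'job="$service", env="$env"'

GOLDEN_SIGNALS = [
    ("Request rate",
     f"sum(rate(http_requests_total{{{SELECTOR}}}[5m]))"),
    ("Error rate",
     f'sum(rate(http_requests_total{{{SELECTOR}, status=~"5.."}}[5m]))'),
    ("P99 latency",
     "histogram_quantile(0.99, sum by (le) "
     f"(rate(http_request_duration_seconds_bucket{{{SELECTOR}}}[5m])))"),
    ("Saturation (CPU)",
     f"sum(rate(process_cpu_seconds_total{{{SELECTOR}}}[5m]))"),
]


def golden_signal_panels() -> list[dict]:
    """Lay out four time series panels in a 2x2 grid (Grafana's grid is 24 units wide)."""
    panels = []
    for index, (title, expr) in enumerate(GOLDEN_SIGNALS):
        panels.append({
            "id": index + 1,
            "title": title,
            "type": "timeseries",
            "gridPos": {"h": 8, "w": 12, "x": (index % 2) * 12, "y": (index // 2) * 8},
            "targets": [{"expr": expr, "refId": "A"}],
        })
    return panels


for panel in golden_signal_panels():
    print(panel["title"], "->", panel["targets"][0]["expr"])
```

The resulting list is meant to be placed under the `panels` key of a dashboard JSON document before import.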