← Home ← Codex ← DEBT
Browse by Category
+ added · updated 7d
← Back to glossary

MySQL GROUP BY and Aggregates

Database Intermediate
debt(d5/e3/b3/t7)
d5 Detectability Operational debt — how invisible misuse is to your safety net

Closest to 'specialist tool catches it' (d5). The detection_hints list semgrep as the tool, which is a SAST/specialist tool that can detect SELECT with GROUP BY where some columns are not aggregated. The bug is silent at runtime in permissive mode, so it won't be caught by a compiler or default linter, placing it firmly in the specialist-tool tier.

e3 Effort Remediation debt — work required to fix once spotted

Closest to 'simple parameterised fix' (e3). The quick_fix states every column in SELECT must appear in GROUP BY or be wrapped in an aggregate function, and HAVING must be used to filter aggregate results. This is a targeted query rewrite — typically fixing one or a few queries rather than a one-line swap (e1) or a multi-file refactor (e5).

b3 Burden Structural debt — long-term weight of choosing wrong

Closest to 'localised tax' (b3). The applies_to covers web and cli contexts broadly, but the issue is constrained to SQL queries that use GROUP BY. It doesn't permeate architectural decisions across the codebase; it's a recurring query-level concern that each developer must remember when writing aggregation queries, imposing a modest but localised tax.

t7 Trap Cognitive debt — how counter-intuitive correct behaviour is

Closest to 'serious trap' (t7). The misconception field states developers believe they can SELECT any column when using GROUP BY, but MySQL silently returns a random value for un-aggregated, un-grouped columns in permissive mode. This contradicts standard SQL behavior and the behavior of other databases (PostgreSQL, SQLite strict mode), and produces non-deterministic results that look correct but aren't — a serious cognitive trap that contradicts how similar concepts work elsewhere.

About DEBT scoring →

Also Known As

GROUP BY MySQL aggregate functions MySQL HAVING vs WHERE COUNT SUM AVG MySQL

TL;DR

GROUP BY collapses rows with matching column values into one — aggregate functions (COUNT, SUM, AVG, MAX, MIN) compute values per group.

Explanation

GROUP BY is used with aggregate functions to summarise data. MySQL's ONLY_FULL_GROUP_BY mode (default since 5.7) requires all SELECT columns to be in GROUP BY or wrapped in an aggregate — preventing non-deterministic results. HAVING filters groups after aggregation; WHERE filters rows before aggregation. ROLLUP adds subtotals. Window functions (MySQL 8) provide aggregation without collapsing rows.

Common Misconception

You can SELECT any column when using GROUP BY. Without ONLY_FULL_GROUP_BY enabled, MySQL silently returns a random value for un-aggregated, un-grouped columns.

Why It Matters

Selecting non-aggregated columns without ONLY_FULL_GROUP_BY returns a random value from the group — a silent correctness bug that produces different results on different MySQL versions.

Common Mistakes

  • Using WHERE to filter aggregate results instead of HAVING — WHERE runs before aggregation.
  • Selecting non-aggregated columns not in GROUP BY — non-deterministic on MySQL permissive mode.
  • GROUP BY on an unindexed column — triggers filesort in the EXPLAIN Extra column.

Avoid When

  • Do not use GROUP BY on an unindexed column for large tables — add an index to avoid filesort.
  • Do not use GROUP BY as a workaround for DISTINCT — they have different semantics.

When To Use

  • Use GROUP BY with COUNT/SUM/AVG for reporting and analytics queries.
  • Use HAVING to filter groups — e.g. users with more than 5 orders.

Code Examples

✗ Vulnerable
-- ONLY_FULL_GROUP_BY error in MySQL 5.7+
SELECT user_id, email, COUNT(*) AS orders
FROM orders
GROUP BY user_id; -- email not in GROUP BY or aggregate
✓ Fixed
SELECT user_id,
       COUNT(*)        AS order_count,
       SUM(total)      AS revenue,
       MAX(created_at) AS last_order
FROM orders
GROUP BY user_id
HAVING COUNT(*) > 5
ORDER BY revenue DESC;

Added 31 Mar 2026
Views 48
Rate this term
No ratings yet
🤖 AI Guestbook educational data only
| |
Last 30 days
0 pings T 1 ping W 1 ping T 1 ping F 0 pings S 0 pings S 0 pings M 0 pings T 0 pings W 1 ping T 1 ping F 1 ping S 2 pings S 1 ping M 0 pings T 1 ping W 1 ping T 2 pings F 0 pings S 1 ping S 0 pings M 0 pings T 0 pings W 1 ping T 0 pings F 0 pings S 1 ping S 0 pings M 0 pings T 0 pings W
No pings yet today
No pings yesterday
Google 8 Scrapy 5 ChatGPT 4 Ahrefs 3 SEMrush 3 Perplexity 2 Unknown AI 2 Meta AI 2 Claude 2 Sogou 1 Majestic 1 PetalBot 1
crawler 30 crawler_json 4
DEV INTEL Tools & Severity
⚙ Fix effort: Low
⚡ Quick Fix
Every column in SELECT must appear in GROUP BY or be wrapped in an aggregate function — use HAVING to filter on aggregate results
📦 Applies To
web cli
🔗 Prerequisites
🔍 Detection Hints
SELECT with GROUP BY where some columns are not aggregated
Auto-detectable: ✓ Yes semgrep
⚠ Related Problems
🤖 AI Agent
Confidence: High False Positives: Low ✓ Auto-fixable Fix: Low Context: Line


✓ schema.org compliant