← Home ← Codex ← DEBT
Browse by Category
+ added · updated 7d
← Back to glossary

MySQL JSON Column Type

Database Intermediate
debt(d7/e5/b7/t7)
d7 Detectability Operational debt — how invisible misuse is to your safety net

Closest to 'only careful code review or runtime testing' (d7). The detection_hints explicitly state 'automated: no' and the pattern is 'JSON_EXTRACT or -> operator used in WHERE without generated column index'. No standard linter or SAST tool flags this automatically — a developer must review query plans (EXPLAIN) or observe slow queries in production to catch the full-table-scan issue.

e5 Effort Remediation debt — work required to fix once spotted

Closest to 'touches multiple files / significant refactor in one component' (e5). The quick_fix says to add indexed generated columns for filtered JSON paths, but if normalised data has been stored as JSON (a common mistake), this requires migrating schema, updating queries across potentially many files, restoring referential integrity, and possibly rewriting data access logic — well beyond a single-line patch but short of a full architectural rework.

b7 Burden Structural debt — long-term weight of choosing wrong

Closest to 'strong gravitational pull' (e7). Once JSON columns are used for data that should be normalised (user IDs, fixed attributes), every future query touching that data must use JSON_EXTRACT or the -> operator, indexing strategies become constrained by generated columns, foreign key relationships are lost, and all downstream consumers inherit the complexity. The choice shapes query design and schema evolution across the codebase.

t7 Trap Cognitive debt — how counter-intuitive correct behaviour is

Closest to 'serious trap' (t7). The misconception field states the canonical wrong belief directly: 'Storing everything in a JSON column avoids the need for proper schema design.' This contradicts the relational mental model that competent developers bring — they expect columns to be indexable and referenceable, but JSON columns silently break both assumptions. The 'obvious' move (just use JSON for flexibility) leads to full-table scans and lost integrity without any immediate warning.

About DEBT scoring →

Also Known As

MySQL JSON column JSON_EXTRACT MySQL MySQL JSON type -> operator MySQL

TL;DR

A native JSON storage type introduced in MySQL 5.7 that validates, indexes, and queries JSON documents with dedicated functions.

Explanation

JSON columns store validated JSON and support path-based extraction with JSON_EXTRACT() or the -> shorthand operator. Generated columns can index specific JSON paths: ALTER TABLE ADD COLUMN email VARCHAR(255) GENERATED ALWAYS AS (data->>'$.email'). JSON is not a replacement for normalised columns — it's for genuinely variable-structure data. JSON queries cannot use standard B-tree indexes without generated column tricks. MySQL 8 added JSON_TABLE() to pivot JSON arrays into relational rows.

Common Misconception

Storing everything in a JSON column avoids the need for proper schema design. JSON columns cannot be indexed directly, foreign keys cannot reference JSON values, and complex queries become difficult to optimise.

Why It Matters

JSON columns store structured metadata without requiring schema changes — useful for product attributes, user preferences, and event payloads. But overusing JSON turns a relational database into a document store with none of the advantages of either.

Common Mistakes

  • Storing normalised data (user IDs, fixed attributes) as JSON — losing referential integrity and index efficiency.
  • Filtering by JSON path without a generated column index — causes full table scans.
  • Not validating JSON structure in PHP before inserting — MySQL validates only that it's valid JSON, not that required fields exist.

Avoid When

  • Do not store normalised relational data (user IDs, fixed attributes) in JSON — you lose indexes, foreign keys, and type safety.
  • Do not use JSON as a substitute for proper schema design — it should be the exception, not the default.

When To Use

  • Use JSON columns for genuinely variable-structure data like product attributes, user preferences, or event payloads.
  • Add a generated column index on any JSON path you regularly filter or sort by.

Code Examples

✗ Vulnerable
-- Storing normalised data as JSON — loses FK, indexes, and type safety
INSERT INTO orders SET data = '{"user_id": 42, "total": 99.99, "items": [...]}';
✓ Fixed
-- JSON column for variable attributes
CREATE TABLE products (
    id         INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    name       VARCHAR(255) NOT NULL,
    attributes JSON
) ENGINE=InnoDB;

-- Generated column index for common JSON path
ALTER TABLE products
    ADD COLUMN brand VARCHAR(100) GENERATED ALWAYS AS (attributes->>'$.brand') STORED,
    ADD INDEX idx_brand (brand);

-- Query JSON path
SELECT name, attributes->>'$.colour' FROM products WHERE brand = 'Acme';

Added 31 Mar 2026
Views 40
Rate this term
No ratings yet
🤖 AI Guestbook educational data only
| |
Last 30 days
0 pings T 0 pings W 0 pings T 0 pings F 0 pings S 2 pings S 0 pings M 0 pings T 0 pings W 1 ping T 2 pings F 1 ping S 1 ping S 1 ping M 0 pings T 0 pings W 0 pings T 1 ping F 0 pings S 0 pings S 0 pings M 0 pings T 0 pings W 0 pings T 2 pings F 1 ping S 0 pings S 0 pings M 1 ping T 0 pings W
No pings yet today
SEMrush 1
Google 7 Perplexity 4 ChatGPT 3 Ahrefs 3 Scrapy 3 Unknown AI 2 Claude 2 Meta AI 2 Bing 1 Sogou 1 Majestic 1 PetalBot 1 SEMrush 1
crawler 26 crawler_json 5
DEV INTEL Tools & Severity
⚙ Fix effort: Medium
⚡ Quick Fix
Use JSON columns for variable-schema attributes — use indexed generated columns for any JSON path you filter or sort by
📦 Applies To
web cli
🔗 Prerequisites
🔍 Detection Hints
JSON_EXTRACT or -> operator used in WHERE without generated column index
Auto-detectable: ✗ No
⚠ Related Problems
🤖 AI Agent
Confidence: Low False Positives: Medium ✗ Manual fix Fix: High Context: File Tests: Regenerate


✓ schema.org compliant