MySQL JSON Column Type
debt(d7/e5/b7/t7)
Closest to 'only careful code review or runtime testing' (d7). The detection_hints explicitly state 'automated: no' and the pattern is 'JSON_EXTRACT or -> operator used in WHERE without generated column index'. No standard linter or SAST tool flags this automatically — a developer must review query plans (EXPLAIN) or observe slow queries in production to catch the full-table-scan issue.
Closest to 'touches multiple files / significant refactor in one component' (e5). The quick_fix says to add indexed generated columns for filtered JSON paths, but if normalised data has been stored as JSON (a common mistake), this requires migrating schema, updating queries across potentially many files, restoring referential integrity, and possibly rewriting data access logic — well beyond a single-line patch but short of a full architectural rework.
Closest to 'strong gravitational pull' (e7). Once JSON columns are used for data that should be normalised (user IDs, fixed attributes), every future query touching that data must use JSON_EXTRACT or the -> operator, indexing strategies become constrained by generated columns, foreign key relationships are lost, and all downstream consumers inherit the complexity. The choice shapes query design and schema evolution across the codebase.
Closest to 'serious trap' (t7). The misconception field states the canonical wrong belief directly: 'Storing everything in a JSON column avoids the need for proper schema design.' This contradicts the relational mental model that competent developers bring — they expect columns to be indexable and referenceable, but JSON columns silently break both assumptions. The 'obvious' move (just use JSON for flexibility) leads to full-table scans and lost integrity without any immediate warning.
Also Known As
TL;DR
Explanation
JSON columns store validated JSON and support path-based extraction with JSON_EXTRACT() or the -> shorthand operator. Generated columns can index specific JSON paths: ALTER TABLE ADD COLUMN email VARCHAR(255) GENERATED ALWAYS AS (data->>'$.email'). JSON is not a replacement for normalised columns — it's for genuinely variable-structure data. JSON queries cannot use standard B-tree indexes without generated column tricks. MySQL 8 added JSON_TABLE() to pivot JSON arrays into relational rows.
Common Misconception
Why It Matters
Common Mistakes
- Storing normalised data (user IDs, fixed attributes) as JSON — losing referential integrity and index efficiency.
- Filtering by JSON path without a generated column index — causes full table scans.
- Not validating JSON structure in PHP before inserting — MySQL validates only that it's valid JSON, not that required fields exist.
Avoid When
- Do not store normalised relational data (user IDs, fixed attributes) in JSON — you lose indexes, foreign keys, and type safety.
- Do not use JSON as a substitute for proper schema design — it should be the exception, not the default.
When To Use
- Use JSON columns for genuinely variable-structure data like product attributes, user preferences, or event payloads.
- Add a generated column index on any JSON path you regularly filter or sort by.
Code Examples
-- Storing normalised data as JSON — loses FK, indexes, and type safety
INSERT INTO orders SET data = '{"user_id": 42, "total": 99.99, "items": [...]}';
-- JSON column for variable attributes
CREATE TABLE products (
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(255) NOT NULL,
attributes JSON
) ENGINE=InnoDB;
-- Generated column index for common JSON path
ALTER TABLE products
ADD COLUMN brand VARCHAR(100) GENERATED ALWAYS AS (attributes->>'$.brand') STORED,
ADD INDEX idx_brand (brand);
-- Query JSON path
SELECT name, attributes->>'$.colour' FROM products WHERE brand = 'Acme';