JSON Columns in MySQL & PostgreSQL
debt(d7/e5/b5/t5)
Closest to 'only careful code review or runtime testing' (d7). Detection hints list mysql-slow-query-log and laravel-debugbar, both runtime tools that only reveal problems after queries execute slowly. No static analysis catches 'you should have normalized this' or 'you're missing a GIN index on this JSON path' — the problem only surfaces through query profiling or performance testing.
Closest to 'touches multiple files / significant refactor in one component' (e5). The quick_fix says to add generated columns and indexes, which is a schema migration plus potentially query rewrites. If data was wrongly stored as JSON when it should be normalized, that's extracting JSON fields into proper columns, migrating data, and updating all queries — a significant refactor within the data layer.
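The refactor described above can be sketched in SQL — a hypothetical users table whose metadata JSON holds a role key that should have been a real column (table, column, and key names are illustrative assumptions, MySQL syntax):

```sql
-- Hypothetical migration: promote a JSON path to a real, indexed column.
ALTER TABLE users ADD COLUMN role VARCHAR(50) NULL;

UPDATE users
SET role = metadata->>'$.role';   -- ->> extracts and unquotes the JSON value

ALTER TABLE users ADD INDEX idx_users_role (role);

-- Follow-up steps: rewrite queries to read the new column, then remove the
-- key from the JSON (JSON_REMOVE) once nothing depends on it.
```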
Closest to 'persistent productivity tax' (b5). Applies to web/cli/queue contexts broadly. Once JSON columns hold data that should be relational, every future query against that data is slower, every join is awkward, and every developer must learn the JSON path syntax. It's not quite system-defining (b7) but does create ongoing friction across multiple workstreams when misused.
Closest to 'notable trap (a documented gotcha most devs eventually learn)' (t5). The misconception is that JSON columns can replace proper schema design, tempting devs to avoid migrations. It's a known gotcha — experienced devs eventually learn it, but it runs counter to the initial appeal of 'just store everything in JSON'. The jsonb vs json confusion in PostgreSQL adds a second documented trap.
Also Known As
TL;DR
JSON columns are for genuinely semi-structured data. Fixed, frequently queried fields belong in real columns; JSON paths you do query need a GIN index (PostgreSQL) or an indexed generated column (MySQL) — never LIKE.
Explanation
MySQL 5.7+ and PostgreSQL 9.4+ (jsonb) support native JSON storage. PostgreSQL's jsonb stores JSON in a parsed binary form, enabling GIN indexes for containment and existence queries. MySQL's JSON type provides path extraction (the -> and ->> operators, the JSON_EXTRACT function) and generated columns for indexing. JSON columns are a good fit for variable attributes, configuration, and metadata — but abusing them to avoid schema design produces an unmaintainable document store inside a relational database.
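The legitimate use case — variable attributes per row alongside fixed relational fields — can be sketched as follows (PostgreSQL; the products table and its columns are illustrative assumptions):

```sql
-- Fixed fields get real columns; per-category attributes vary per row,
-- so they live in a jsonb column.
CREATE TABLE products (
    id          BIGSERIAL PRIMARY KEY,
    name        TEXT    NOT NULL,           -- fixed field: real column
    price_cents INTEGER NOT NULL,           -- fixed field: real column
    attributes  JSONB   NOT NULL DEFAULT '{}'  -- variable: screen size, wattage, ...
);

INSERT INTO products (name, price_cents, attributes)
VALUES ('Monitor', 29900, '{"screen_inches": 27, "panel": "IPS"}'),
       ('Kettle',   4900, '{"watts": 2200}');

-- Navigate with -> / ->>; test containment with @>
SELECT name FROM products WHERE attributes @> '{"panel": "IPS"}';
```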
Diagram
flowchart LR
  subgraph MYSQL["MySQL JSON Column"]
    MJ[JSON column] -->|JSON_EXTRACT| MQ[Query inside JSON]
    MJ -->|Generated column + index| MI[Index on JSON path]
  end
  subgraph PG["PostgreSQL JSONB"]
    PJ[JSONB column<br/>binary stored] -->|"-> ->>"| PQ[Navigate JSON]
    PJ -->|GIN index| PI[Fast search inside JSON]
    PJ -->|"@>"| CONTAIN[Contains operator]
  end
  subgraph WHEN["When to Use"]
    USE[Semi-structured data<br/>variable attributes per row<br/>schema not yet known]
    AVOID[Highly structured data<br/>frequently queried fields<br/>foreign key relations needed]
  end
  style USE fill:#238636,color:#fff
  style AVOID fill:#f85149,color:#fff
Common Misconception
'A JSON column means I never have to design a schema or write a migration again.' JSON columns complement relational design for genuinely variable data; they don't replace proper columns, indexes, and foreign keys for structured, frequently queried fields.
Why It Matters
Once relational data lands in a JSON column, every query against it is slower, every join is awkward, and every developer must learn JSON path syntax — a persistent productivity tax across web, CLI, and queue code. Unwinding it means extracting fields into proper columns, migrating data, and rewriting queries.
Common Mistakes
- Using JSON for data that has a fixed, known structure — a proper column is faster, indexable, and type-safe.
- Not creating a GIN index (PostgreSQL) or generated column index (MySQL) on frequently queried JSON paths.
- Querying JSON with LIKE '%value%' — always use the database's native JSON path operators.
- Mixing jsonb (binary, indexable) and json (text, preserves whitespace, key order, and duplicate keys) in PostgreSQL without understanding the difference.
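The json/jsonb difference from the last bullet can be seen directly (PostgreSQL):

```sql
-- json stores the literal text; jsonb parses to a binary form, discarding
-- whitespace, key order, and duplicate keys (last key wins).
SELECT '{"a": 1,  "a": 2}'::json;   -- {"a": 1,  "a": 2}
SELECT '{"a": 1,  "a": 2}'::jsonb;  -- {"a": 2}
```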
Code Examples
-- Querying JSON with LIKE — full table scan, no index:
SELECT * FROM users WHERE metadata LIKE '%"role":"admin"%';
-- MySQL: JSON_EXTRACT without generated column — not indexed:
SELECT * FROM users WHERE JSON_EXTRACT(metadata, '$.role') = 'admin';
-- PostgreSQL: GIN index on jsonb + path operator:
CREATE INDEX idx_users_metadata ON users USING GIN (metadata);
SELECT * FROM users WHERE metadata @> '{"role": "admin"}';
-- MySQL: generated column + index:
ALTER TABLE users
ADD COLUMN role VARCHAR(50) GENERATED ALWAYS AS (metadata->>'$.role') STORED,
ADD INDEX idx_users_role (role);
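PostgreSQL's counterpart to the MySQL generated-column trick is a B-tree expression index on a single extracted path — narrower than a GIN index, but it supports plain equality and sorting (reusing the users/metadata/role names from the examples above):

```sql
-- PostgreSQL: B-tree expression index on one JSON path.
-- Note the double parentheses required around the expression.
CREATE INDEX idx_users_metadata_role ON users ((metadata->>'role'));

SELECT * FROM users WHERE metadata->>'role' = 'admin';
```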