← Back to glossary

Database Normalisation

General PHP 5.0+ Intermediate

debt(d7/e7/b7/t5)

d7 Detectability Operational debt — how invisible misuse is to your safety net

Closest to 'only careful code review or runtime testing' (d7). The detection_hints indicate automated detection is 'no' — tools like mysql-workbench and pgadmin can visualize schema but don't automatically flag normalisation violations. Identifying duplicate data across tables, missing FK relationships, or JSON columns storing related values requires manual schema review or data quality audits. No compiler or linter catches this.

e7 Effort Remediation debt — work required to fix once spotted

Closest to 'cross-cutting refactor across the codebase' (e7). The quick_fix says 'normalise to 3NF' but achieving this after denormalised data exists requires schema migrations, splitting tables, establishing FK relationships, and updating all application code that reads/writes the affected tables. Common_mistakes like 'repeating customer name in every order row' means fixing requires data migration across potentially millions of rows plus application-wide query changes.

b7 Burden Structural debt — long-term weight of choosing wrong

Closest to 'strong gravitational pull' (b7). Database schema normalisation is a foundational architectural decision that shapes every query, every ORM model, and every data access pattern. The applies_to shows it affects all contexts (web, cli). Once you have denormalised data spread across the system, every new feature must work around or perpetuate the inconsistency. It's not quite b9 (rewrite-required) but close — you can incrementally normalise with effort.

t5 Trap Cognitive debt — how counter-intuitive correct behaviour is

Closest to 'notable trap' (t5). The misconception field explicitly states developers think normalisation is 'always about database normal forms' when it also means canonical representation (E.164 phones, lowercase emails, UTC dates). Common_mistakes include over-normalising causing JOIN-heavy queries — the trap is both under-normalising (data anomalies) and over-normalising (performance problems). Most experienced devs eventually learn the balance, but it's a documented gotcha.

About DEBT scoring → scored by claude-opus-4-5-20251101 · 2026-05-07 · reviewed by human

Also Known As

data normalisation general data consistency canonical data

TL;DR

Organising relational database tables to reduce redundancy and improve integrity — from 1NF through 3NF (and beyond) as design guidelines.

Explanation

Normalisation progressively eliminates redundancy: 1NF (atomic column values — no repeating groups or arrays in cells), 2NF (eliminate partial dependencies — every non-key column depends on the whole primary key, not just part of it), 3NF (eliminate transitive dependencies — non-key columns depend only on the primary key, not on other non-key columns). BCNF and 4NF handle edge cases. Over-normalisation leads to excessive JOINs degrading read performance — denormalise deliberately when queries justify it, documented as an explicit trade-off. For PHP applications using ORMs, normalisation affects relationship mapping (hasMany, belongsToMany) and eager-loading strategies.

Common Misconception

✗ Data normalisation is always about database normal forms. In a broader context, normalisation means establishing a single canonical representation — normalising phone numbers to E.164 format, emails to lowercase, or dates to UTC before storage prevents comparison and deduplication failures.

Why It Matters

Database normalisation eliminates redundant data — duplicated values cause update anomalies where changing one instance misses others, leading to inconsistent data across the system.

Common Mistakes

Storing calculated values that can be derived from existing columns — they go out of sync.
Repeating customer name and address in every order row — a customer name change requires updating thousands of rows.
Over-normalising to the point where simple queries require many JOINs — pragmatic denormalisation is sometimes correct.
Not understanding 1NF, 2NF, 3NF — normalising to 3NF solves most practical anomalies.

Code Examples

✗ Vulnerable

-- 1NF violation — repeating groups in one column
CREATE TABLE orders (
    id INT,
    tags VARCHAR(255)  -- 'urgent,fragile,priority' -- comma-list is not atomic
);

✓ Fixed

-- 3NF design: separate tables, no transitive dependencies
CREATE TABLE orders (id INT PRIMARY KEY, customer_id INT, total DECIMAL);
CREATE TABLE customers (id INT PRIMARY KEY, name VARCHAR(100), city_id INT);
CREATE TABLE cities (id INT PRIMARY KEY, name VARCHAR(100), country_id INT);

-- Tags as many-to-many (junction table)
CREATE TABLE tags (id INT PRIMARY KEY, name VARCHAR(50));
CREATE TABLE order_tags (order_id INT, tag_id INT, PRIMARY KEY (order_id, tag_id));

-- Denormalise only with evidence: materialised views or computed columns
-- for read-heavy aggregates, not as the default

Tags

general database data-quality

Added 15 Mar 2026

Edited 22 Mar 2026

Curated in Warsaw under one editorial standard. 1,506 terms, single voice. About this reference →

Rate this term

No ratings yet

🤖 AI Guestbook educational data only

| |

Last 30 days

Agents 0

No pings yet today

No pings yesterday

Scrapy 13 Amazonbot 7 Perplexity 6 Google 4 Ahrefs 4 Unknown AI 3 SEMrush 3 Claude 2 ChatGPT 2 Bing 1 Meta AI 1 Qwen 1 PetalBot 1

Also referenced

Database Indexing 92 Query Optimisation 68 Eager Loading 45 Materialized Views 25

How they use it

crawler 43 crawler_json 4 pre-tracking 1

Related categories

performance 3.3k general 3k

⚡ DEV INTEL Tools & Severity

🟡 Medium ⚙ Fix effort: High

⚡ Quick Fix

Normalise to 3NF for transactional data — each non-key column depends only on the whole primary key; denormalise intentionally for read-heavy query performance with documented trade-offs

📦 Applies To

PHP 5.0+ web cli

🔗 Prerequisites

Database Indexing Database Schema Design ACID Properties

🔍 Detection Hints

Duplicate data in multiple tables with no FK relationship; JSON column storing multiple related values; repeated group of columns (address1 address2 address3)

Auto-detectable: ✗ No mysql-workbench pgadmin

⚠ Related Problems

Database Indexing Database Schema Design

🤖 AI Agent

Confidence: Medium False Positives: Medium ✗ Manual fix Fix: High Context: File Tests: Update

References

https://en.wikipedia.org/wiki/Database_normalization