Types of Code Duplication (Clone vs Semantic)

Code Quality Intermediate
debt(d5/e5/b5/t7)
d5 Detectability Operational debt — how invisible misuse is to your safety net

Closest to 'specialist tool catches it' (d5). The detection_hints list phpcpd, jscpd, and sonarqube — these are specialist static analysis tools that must be explicitly configured and run in CI. They catch syntactic/clone duplication well, but semantic duplication (same logic, different surface form) is largely invisible to automated tools and requires code review, making d5 appropriate rather than d3 (no default linter covers this) or d7 (tools do exist for the clone case).

e5 Effort Remediation debt — work required to fix once spotted

Closest to 'touches multiple files / significant refactor in one component' (e5). The quick_fix says to extract on third occurrence using shared base classes, traits, or services. This is not a one-line patch; it involves identifying the pattern, creating an abstraction (class, trait, or service), updating all call sites, and verifying correctness. If duplication is spread across multiple files or modules — which is the common case — this is a multi-file refactor, grounding it at e5.

b5 Burden Structural debt — long-term weight of choosing wrong

Closest to 'persistent productivity tax' (b5). Code duplication applies broadly across web, cli, and queue-worker contexts. Unaddressed duplication means every bug fix or feature change may need to be applied in multiple places, slowing many work streams. However, it doesn't necessarily define the system's shape (b7+), so b5 reflects a persistent but not architectural burden.

t7 Trap Cognitive debt — how counter-intuitive correct behaviour is

Closest to 'serious trap — contradicts how a similar concept works elsewhere' (t7). The misconception is explicitly stated: developers believe all duplication must be eliminated immediately, but premature abstraction creates coupling that is harder to undo than the original duplication. The common mistakes confirm this — extracting code that looks similar but has different reasons to change produces a wrong abstraction. This contradicts the widely taught DRY principle, making the 'obvious' action (extract immediately) frequently the wrong one, warranting t7.

TL;DR

Not all duplication is equal: clone duplication (copy-paste) should be extracted once it recurs, while semantic duplication (similar logic, different context) may be acceptable.

Explanation

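Duplication falls into two broad kinds, and they call for different responses.

  • Clone duplication: identical or near-identical code, usually the result of copy-paste. Extract it once the pattern recurs.
  • Semantic duplication: two code sections do similar things with different data or context. Whether to unify them requires judgement.

Clone types: Type 1 (exact clone, differing only in whitespace or comments), Type 2 (renamed variables or literals), Type 3 (statements added, removed, or modified), Type 4 (same algorithm, different structure). Types 1 to 3 are syntactic clones, which tools such as phpcpd (PHP Copy/Paste Detector), jscpd, and SonarQube catch to varying degrees. Type 4 is semantic duplication and is largely invisible to them.

Rule of Three: tolerate one duplicate, refactor on the third occurrence. The wrong abstraction is worse than duplication, so don't prematurely abstract code that merely looks similar but has diverging requirements.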
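The four clone types are easier to tell apart side by side. The sketch below is illustrative only: the function names and the price/weight domain are hypothetical, chosen to show each type relative to one original fragment.

```php
<?php
declare(strict_types=1);

// Original fragment.
function sumPrices(array $prices): float {
    $total = 0.0;
    foreach ($prices as $price) {
        $total += $price;
    }
    return $total;
}

// Type 1: exact clone (only whitespace or comments differ).
function sumPricesCopy(array $prices): float {
    $total = 0.0;
    foreach ($prices as $price) {
        $total += $price;   // running total
    }
    return $total;
}

// Type 2: same structure, identifiers renamed.
function sumWeights(array $weights): float {
    $sum = 0.0;
    foreach ($weights as $w) {
        $sum += $w;
    }
    return $sum;
}

// Type 3: a statement inserted into the copied structure.
function sumPositivePrices(array $prices): float {
    $total = 0.0;
    foreach ($prices as $price) {
        if ($price < 0) {
            continue; // inserted statement
        }
        $total += $price;
    }
    return $total;
}

// Type 4: same result as the original, different structure.
function sumPricesBuiltin(array $prices): float {
    return (float) array_sum($prices);
}
```

Copy/paste detectors work on text or tokens, so the first three variants are within their reach, while the last one shares almost no surface form with the original and would normally be spotted only in review.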

Common Misconception

Misconception: all duplication must be eliminated immediately. In practice, premature abstraction creates worse problems than the duplication it removes. As a default, wait for the third occurrence before extracting.

Why It Matters

Clone duplication creates maintenance traps — fix a bug in one copy and forget the other. But wrong abstractions create coupling that's harder to undo than the original duplication.

Common Mistakes

  • Extracting two pieces of code that look similar but have different reasons to change — they'll diverge and the shared abstraction becomes a problem.
  • Not using phpcpd/jscpd in CI to detect growing duplication.
  • Applying Rule of Three too rigidly — sometimes the second occurrence warrants extraction if the pattern is clearly stable.
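The first mistake in the list can be sketched as follows. Everything here is a hypothetical illustration: the function names, the 120-character limit, and the alphanumeric rule are assumptions invented to show how a shared helper degrades once two look-alike rules start to diverge.

```php
<?php
declare(strict_types=1);

// Two rules that look alike today but have different reasons to change:
// account policy decides usernames, catalogue policy decides product titles.
function isValidUsername(string $username): bool {
    $length = strlen($username);
    return $length >= 2 && $length <= 50;
}

function isValidProductTitle(string $title): bool {
    $length = strlen($title);
    return $length >= 2 && $length <= 50;
}

// Premature merge after the rules diverged: the shared helper grows a flag
// per caller, and every change risks breaking the other use.
function isValidText(string $value, bool $isProductTitle = false): bool {
    $length = strlen($value);
    $max = $isProductTitle ? 120 : 50; // assumed: catalogue raised its limit
    if (!$isProductTitle && preg_match('/^[a-z0-9]+$/i', $value) !== 1) {
        return false; // assumed: accounts added a character rule
    }
    return $length >= 2 && $length <= $max;
}
```

The two separate functions are duplicated, but each can change without touching the other. The merged helper removes a few lines at the cost of coupling two unrelated policies, which is the trade the Rule of Three is meant to delay.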

Code Examples

✗ Vulnerable
// Same validation logic in 3 controllers — copy-paste:
if (strlen($name) < 2 || strlen($name) > 50) {
    throw new ValidationException('Invalid name');
}
✓ Fixed
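// Extracted to validator on third occurrence.
// ValidationException is assumed to be defined elsewhere in the application.
class NameValidator {
    public function validate(string $name): void {
        // strlen() counts bytes; measure once and compare against both bounds.
        $length = strlen($name);
        if ($length < 2 || $length > 50) {
            throw new ValidationException('Invalid name');
        }
    }
}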
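Extraction only pays off once every call site delegates to the shared code. A minimal sketch of that step, assuming PHP 8.0+ for constructor promotion: UserController and TeamController are hypothetical names, and ValidationException is defined locally only so the sketch runs on its own.

```php
<?php
declare(strict_types=1);

// Assumed to exist in the real application; defined here so the sketch runs.
class ValidationException extends InvalidArgumentException {}

class NameValidator {
    public function validate(string $name): void {
        $length = strlen($name);
        if ($length < 2 || $length > 50) {
            throw new ValidationException('Invalid name');
        }
    }
}

// Hypothetical call sites: each delegates instead of repeating the check.
class UserController {
    public function __construct(private NameValidator $names) {}

    public function create(string $name): string {
        $this->names->validate($name);
        return "created {$name}";
    }
}

class TeamController {
    public function __construct(private NameValidator $names) {}

    public function rename(string $name): string {
        $this->names->validate($name);
        return "renamed to {$name}";
    }
}
```

Injecting the validator rather than calling it statically keeps the rule in one place and lets tests substitute it, though a trait or shared base class would also fit the quick fix described for this term.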

Added 23 Mar 2026
DEV INTEL Tools & Severity
🟡 Medium ⚙ Fix effort: Medium
⚡ Quick Fix
Run phpcpd in CI. Extract on third occurrence. Use shared base classes, traits, or services for validated duplicate patterns. Be cautious — not all similar code should be unified.
📦 Applies To
web cli queue-worker
🔍 Detection Hints
phpcpd|jscpd
Auto-detectable: ✓ Yes (phpcpd, jscpd, sonarqube)
🤖 AI Agent
Confidence: Low · False Positives: High · ✗ Manual fix · Fix: High · Context: File · Tests: Update