PCRE in PHP

Regex PHP 5.0+ Intermediate
debt(d7/e3/b3/t7)
d7 Detectability Operational debt — how invisible misuse is to your safety net

Closest to 'only careful code review or runtime testing' (d7), because while phpstan/psalm can flag some issues, the core trap (treating preg_match 0 and false the same) typically slips past default static analysis and surfaces only in review or when malformed patterns hit production.

e3 Effort Remediation debt — work required to fix once spotted

Closest to 'simple parameterised fix' (e3), since the fix is replacing `if (!preg_match(...))` with `=== false` checks and adding flags like /u or preg_quote() calls — small pattern-level changes per call site.

b3 Burden Structural debt — long-term weight of choosing wrong

Closest to 'localised tax' (b3), as regex usage tends to be scattered but each usage is self-contained; the choice doesn't shape system architecture but does impose a recurring small tax wherever regex is used across web/cli/queue contexts.

t7 Trap Cognitive debt — how counter-intuitive correct behaviour is

Closest to 'serious trap' (t7), because preg_match's tri-state return (1/0/false) contradicts what the name suggests: comparable match functions in other languages return a plain boolean, which is the assumption described under Common Misconception.

Also Known As

preg_match preg_replace preg_split PHP regex functions

TL;DR

PHP's PCRE functions (preg_match, preg_match_all, preg_replace, preg_split) signal failure with false, or null in the case of preg_replace, so check the return value strictly to distinguish errors from no-match.

Explanation

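PHP's PCRE functions and their return values:

  • preg_match() returns 1 (match), 0 (no match), or false (failure).
  • preg_match_all() returns the number of full matches, or false on failure.
  • preg_replace() and preg_replace_callback() return the resulting string (or array), or null on failure.
  • preg_split() returns an array of pieces, or false on failure.
  • preg_grep() returns the array entries that match the pattern, or false on failure.
  • preg_quote() escapes regex metacharacters so a string can be matched literally.

After a failure, call preg_last_error() for the error code or, on PHP 8.0+, preg_last_error_msg() for a readable message. In replacement strings, prefer $1 (or ${1}) over \1. PHP caches compiled patterns, so reusing the same pattern string does not recompile it on every call.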
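As a rough illustration of the return values described above, the sketch below calls several of the functions on made-up sample strings. The variable names and inputs are assumptions chosen for the example; the last call provokes a failure by passing an invalid UTF-8 byte to a pattern that uses the u flag.

```php
<?php
// Illustrative only: sample strings and variable names are invented.

// preg_match_all(): number of full matches, or false on failure.
$count = preg_match_all('/\d+/', 'a1 b22 c333', $m);
// $count is 3; $m[0] is ['1', '22', '333']

// preg_replace_callback(): the callback receives the match array.
$doubled = preg_replace_callback(
    '/\d+/',
    function (array $match) {
        return (string) ((int) $match[0] * 2);
    },
    'a1 b22'
);
// $doubled is 'a2 b44'

// preg_split(): array of pieces, or false on failure.
$parts = preg_split('/\s*,\s*/', 'x , y,z');
// $parts is ['x', 'y', 'z']

// preg_grep(): keeps matching entries and preserves their keys.
$hits = preg_grep('/^\d+$/', ['10', 'abc', '42']);
// $hits is [0 => '10', 2 => '42']

// preg_replace() signals failure with null, not false.
$bad = preg_replace('/./u', '', "\xff"); // invalid UTF-8 subject under /u
// $bad is null; preg_last_error() is PREG_BAD_UTF8_ERROR
```

Note that preg_grep() keeps the original array keys, so wrap the result in array_values() if a re-indexed list is needed.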

Common Misconception

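Misconception: preg_match() returns true/false. In fact it returns 1 (match found), 0 (no match), or false (failure, for example an invalid pattern, invalid UTF-8 under the u flag, or an exceeded backtrack limit). Always use === false to distinguish errors from no-match.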

Why It Matters

Using if (!preg_match()) conflates no-match (0) with failure (false). Without a strict === false check, pattern and runtime errors silently look like no-match.
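A failure does not require a broken pattern. The minimal sketch below assumes a subject containing an invalid UTF-8 byte reaches a valid pattern that uses the u flag; PHP returns false without emitting a warning, and a loose check cannot tell that apart from 0. The variable names are hypothetical.

```php
<?php
// Assumed scenario: a subject with an invalid UTF-8 byte reaches a /u pattern.
$subject = "caf\xff";

$result = preg_match('/^\w+$/u', $subject);

$looksLikeNoMatch = !$result;            // true: same as a genuine 0
$isError          = ($result === false); // true: only the strict check sees it
$errorCode        = preg_last_error();   // PREG_BAD_UTF8_ERROR

if ($isError) {
    // preg_last_error_msg() exists only on PHP 8.0+, so fall back to the code.
    $detail = function_exists('preg_last_error_msg')
        ? preg_last_error_msg()
        : 'PCRE error code ' . $errorCode;
    echo "Regex failed: {$detail}\n";
}
```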

Common Mistakes

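  • Writing if (!preg_match()), which treats an error the same as no-match. Compare with === false instead.
  • Not passing user-supplied literal strings through preg_quote(), including the delimiter as its second argument.
  • Writing \1 instead of $1 in a preg_replace replacement. Both are accepted, but "\1" in a double-quoted string is an octal escape, and ${1} is needed when a digit follows the reference.
  • Building a regex from unescaped user input, which allows regex injection.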
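The escaping and replacement pitfalls in the list above can be sketched as follows. The input strings are hypothetical stand-ins for user input, and the variable names are chosen only for this example.

```php
<?php
// Hypothetical user input containing regex metacharacters.
$needle = 'a+b (v1.0)';
$text   = 'formula: a+b (v1.0)';

// Unescaped, + means "one or more" and the parentheses form a group,
// so the literal text is not found.
$naive = preg_match('/' . $needle . '/', $text);   // 0

// preg_quote() escapes the metacharacters; the second argument also
// escapes the delimiter if it appears in the input.
$quoted = preg_quote($needle, '/');
$safe   = preg_match('/' . $quoted . '/', $text);  // 1

// $n references in a single-quoted replacement string.
$swapped = preg_replace('/(\w+) (\w+)/', '$2 $1', 'hello world'); // 'world hello'

// ${1} keeps the reference separate from a following digit;
// '$10' would be read as a reference to group 10.
$padded = preg_replace('/(\d)/', '${1}0', 'x5');   // 'x50'

// In a double-quoted string, "\1" is an octal escape (byte 0x01),
// not a backreference.
$isControlByte = ("\1" === chr(1));                // true
```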

Code Examples

✗ Vulnerable
// Pattern error looks like no-match. PHP emits an E_WARNING for the broken
// pattern, but the branch below cannot tell false from 0:
$result = preg_match('/(?P<n>[a-z]+/i', $subject); // Missing closing )
if (!$result) {
    echo 'No match'; // Actually: broken pattern, $result is false
}

// User input without escaping (regex injection):
$search = $_GET['q']; // User enters: a+b
preg_match("/{$search}/", $text); // + is a metacharacter, so "a+b" is not matched literally

✓ Fixed
// Detect errors explicitly:
$result = preg_match('/^[a-z]+$/', $subject);
if ($result === false) {
    // preg_last_error_msg() needs PHP 8.0+; use preg_last_error() on older versions
    throw new RuntimeException('Regex error: ' . preg_last_error_msg());
}
if ($result === 0) { /* no match */ }

// Safe user input as literal (second argument escapes the / delimiter too):
$escaped = preg_quote($_GET['q'], '/');
$found   = preg_match("/{$escaped}/i", $text); // Metacharacters escaped

Added 16 Mar 2026
Edited 22 Mar 2026
DEV INTEL Tools & Severity
🟡 Medium ⚙ Fix effort: Medium
⚡ Quick Fix
Use named captures such as (?P<year>\d{4}) for readable matches; add the u flag when matching UTF-8 strings; check preg_last_error() to detect PCRE failures that silently return false.
📦 Applies To
PHP 5.0+ web cli queue-worker
🔗 Prerequisites
🔍 Detection Hints
preg_match without checking the return value for false; regex without the /u flag on Unicode input; patterns that can exceed pcre.backtrack_limit and then fail silently
Auto-detectable: ✓ Yes phpstan psalm semgrep
⚠ Related Problems
🤖 AI Agent
Confidence: Medium False Positives: Medium ✗ Manual fix Fix: Medium Context: Function Tests: Update
CWE-400
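The Quick Fix advice on named captures and the u flag can be illustrated with a short sketch. The sample strings and variable names are assumptions made for this example, and the "\u{...}" escape requires PHP 7.0 or later.

```php
<?php
// Named captures make the match array self-describing.
$ok = preg_match('/(?P<year>\d{4})-(?P<month>\d{2})/', 'released 2026-03', $m);
// $m['year'] is '2026', $m['month'] is '03'

// "\u{00E9}" is é, which occupies two bytes in UTF-8.
$e = "\u{00E9}";

// Without the u flag, . matches a single byte, so ^.$ fails on é.
$bytes = preg_match('/^.$/', $e);   // 0

// With the u flag, . matches a whole code point.
$chars = preg_match('/^.$/u', $e);  // 1
```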