Regex Flags & Modifiers
debt(d6/e1/b2/t6)
Closest to 'only careful code review or runtime testing' (d7), slightly better at d6 because phpstan/phpcs (per detection_hints) can flag some missing-flag patterns, but most missing /u or /s issues only surface when non-ASCII or multiline input arrives in production.
Closest to 'one-line patch or single-call swap' (e1), since quick_fix is literally adding a flag character (u, s, or x) to the regex delimiter — a single character edit per occurrence.
Closest to 'localised tax' (b3), slightly lighter at b2 because regex flags are per-call decisions; they don't impose ongoing structural weight, though consistency across many regex sites is a mild recurring concern.
Closest to 'serious trap' (t7), slightly softer at t6 because the misconception (thinking /i is the only common flag) and common_mistakes (m flag only affecting anchors, spaces being significant in [] under /x, \w missing accented chars without /u) contradict naive expectations in ways that silently corrupt multibyte text.
Also Known As
TL;DR
Explanation
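PHP/PCRE modifiers:
- i: case-insensitive matching.
- m: multiline. ^ and $ match at line boundaries, not just at the start and end of the string.
- s: dotall. The . also matches newlines (by default . does not match \n).
- x: extended. Whitespace and # comments in the pattern are ignored, which keeps long patterns readable.
- u: UTF-8. Pattern and subject are treated as UTF-8; enables \p{L} Unicode property escapes and makes \w Unicode-aware.
- A: anchored. The match must begin at the start of the subject.
- D: dollar matches end only. $ no longer matches before a trailing newline.
- U: ungreedy. Quantifiers are lazy by default.
Inline mode: (?ims) applies the listed flags from that point in the pattern; (?-i) turns a flag off again.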
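The modifier descriptions above can be checked with a short sketch. It assumes a stock PHP build with the PCRE extension; the sample strings and variable names are illustrative only.

```php
<?php
// Hypothetical sample input: three log lines, two starting with "error".
$log = "error: disk\nwarn: cpu\nerror: net";

// m: ^ matches at the start of every line, not only the start of the string.
$withoutM = preg_match_all('/^error/', $log);   // 1
$withM    = preg_match_all('/^error/m', $log);  // 2

// D: without it, $ also matches just before a single trailing newline.
$laxDollar    = preg_match('/^\d+$/', "123\n");   // 1
$strictDollar = preg_match('/^\d+$/D', "123\n");  // 0

// A: the match must start at the beginning of the subject.
$unanchored = preg_match('/\d+/', 'abc123');    // 1
$anchored   = preg_match('/\d+/A', 'abc123');   // 0

// U: quantifiers become lazy by default.
preg_match('/<.+>/', '<a><b>', $greedy);   // $greedy[0] is '<a><b>'
preg_match('/<.+>/U', '<a><b>', $lazy);    // $lazy[0] is '<a>'

// Inline modifiers: (?i) turns case-insensitivity on, (?-i) turns it off again.
$inlineOn  = preg_match('/(?i)php/', 'PHP');       // 1
$inlineOff = preg_match('/(?i)p(?-i)hp/', 'PHP');  // 0: "hp" must be lowercase
```

Comparing each pair side by side is a quick way to confirm which flag a pattern actually depends on.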
Common Misconception
Why It Matters
Common Mistakes
- Forgetting the s flag for multi-line strings: without s, . does not match newlines
- Not using the u flag for UTF-8 input: without u, \w misses accented characters
- Assuming the x flag ignores spaces everywhere: whitespace inside [] is still significant, and a literal space outside a class must be escaped
- Adding the m flag to a pattern with no ^ or $ anchors: m only affects those anchors
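The x and m mistakes above are easy to reproduce. The following is a minimal sketch, assuming default PCRE behaviour in PHP; the patterns and variable names are made up for illustration.

```php
<?php
// x: whitespace outside a character class is ignored, so "a b" means "ab".
$ignored = preg_match('/^a b$/x', 'ab');    // 1
$dropped = preg_match('/^a b$/x', 'a b');   // 0: the pattern's space was discarded

// x: whitespace inside [] is still a literal member of the class.
$classHasSpace = preg_match('/^[a b]+$/x', 'a b');  // 1

// To match a literal space outside a class under x, escape it.
$escaped = preg_match('/^a\ b$/x', 'a b');  // 1

// m changes only ^ and $; it does not make . match newlines (that is s).
$mOnly = preg_match('/a.b/m', "a\nb");  // 0
$sFlag = preg_match('/a.b/s', "a\nb");  // 1
```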
Code Examples
// Missing flags causing bugs:
$html = "<div>\nContent\n</div>";
preg_match('/<div>(.*)<\/div>/', $html, $m);
// Without s: . does not match \n — no match!
$name = 'François';
preg_match('/^\w+$/', $name, $m);
// Without u: ç not in \w — fails for valid French name!
// Correct flags for each use case:
preg_match('/<div>(.*?)<\/div>/s', $html, $m); // s: . matches newlines
preg_match('/^\w+$/u', $name, $m); // u: \w matches Unicode letters such as ç
// x flag for readable complex pattern:
$phone = '/
^ # Start
(\+?\d{1,3})? # Optional country code
[\s.-]? # Optional separator
\(?\d{3}\)? # Area code
[\s.-]? # Separator
\d{3}[\s.-]?\d{4} # Number
$ # End
/x';
$matched = preg_match($phone, '+1 (555) 123-4567'); // 1: the whole number matches