Regex in Loop
debt(d5/e3/b3/t7)
Closest to 'specialist tool catches it' (d5) — profilers like Blackfire or Xdebug (from detection_hints.tools) reveal regex hotspots; not visible to linters and silent unless profiled.
Closest to 'simple parameterised fix' (e3) — quick_fix says extract pattern to class constant or use preg_match_all; small refactor within the affected function/class.
Closest to 'localised tax' (b3) — applies to hot loops in specific components; doesn't shape system architecture but recurs across web/cli/queue contexts where batch processing happens.
Closest to 'serious trap' (t7) — misconception explicitly states devs wrongly believe extraction prevents recompilation, when PHP's PCRE cache already handles that; the real traps (dynamic pattern building defeating cache, ReDoS) contradict the common mental model.
Also Known As
TL;DR
Explanation
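PHP keeps compiled patterns in an internal PCRE cache keyed by the pattern string, so calling preg_match() or preg_replace() with the same literal pattern in a loop does not recompile it on every iteration. The real costs lie elsewhere: a pattern built dynamically inside the loop is concatenated on every iteration and may miss the cache, the cache has a limited size so entries can be evicted, and a complex pattern can backtrack heavily on every subject string. The fix is to define the pattern once as a constant or variable before the loop and reference it inside, which keeps the pattern static and makes the intent clear. For simple membership tests on many strings, preg_grep() on the whole array is often faster than a per-element loop.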
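As a minimal sketch of the preg_grep() approach mentioned above, the following compares a per-element loop with a single whole-array call. The sample strings and the /^error:/ pattern are made up for illustration.

```php
<?php
// Hypothetical sample data; the pattern is illustrative only.
$lines = ['error: disk full', 'info: started', 'error: timeout', 'debug: tick'];

// Per-element loop: one preg_match() call per string.
$loopMatches = [];
foreach ($lines as $line) {
    if (preg_match('/^error:/', $line) === 1) {
        $loopMatches[] = $line;
    }
}

// Whole-array call: a single preg_grep() call returns the matching
// elements with their original keys preserved.
$grepMatches = preg_grep('/^error:/', $lines);

// array_values() reindexes the result so it compares equal to the loop output.
$grepMatches = array_values($grepMatches);
```

Both variants select the same elements. Whether preg_grep() is measurably faster depends on array size and pattern, so it is worth profiling before switching.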
Common Misconception
Why It Matters
Common Mistakes
- Building regex patterns dynamically inside a loop — string concatenation plus compilation on every iteration.
- Using regex for simple string operations that strpos() or str_contains() handle faster.
- Not knowing that PHP caches compiled regexes in a PCRE cache, and that the cache has a limited size, so entries can be evicted.
- Applying complex regexes to unbounded user input without length limits — potential ReDoS.
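The mistakes above can be avoided along these lines. This is a sketch under assumptions: the keyword list, the inputs, and the 1024-byte length limit are hypothetical, and str_contains() requires PHP 8.0 or later.

```php
<?php
// Hypothetical inputs for illustration.
$keywords = ['foo.bar', 'baz'];
$inputs   = ['xx foo.bar yy', 'fooXbar', 'has baz', str_repeat('a', 5000)];

// Build the alternation once, before the loop. preg_quote() escapes
// metacharacters such as '.', with '/' passed as the delimiter.
$pattern = '/(?:' . implode('|', array_map(
    fn(string $k): string => preg_quote($k, '/'),
    $keywords
)) . ')/';

// Assumed upper bound on input length; tune it per application.
$maxLength = 1024;

$hits = [];
foreach ($inputs as $input) {
    // Reject oversized input before any regex runs on it.
    if (strlen($input) > $maxLength) {
        continue;
    }
    if (preg_match($pattern, $input) === 1) {
        $hits[] = $input;
    }
}

// A plain substring test needs no regex at all.
$plainHits = array_values(array_filter(
    $inputs,
    fn(string $s): bool => str_contains($s, 'baz')
));
```

Because the pattern string is identical on every iteration, it stays eligible for the PCRE cache, and the length check bounds the worst-case work per subject string.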
Code Examples
// Pattern literal repeated inline in the loop body.
// PHP's PCRE cache compiles it once, but the pattern is buried in the loop
// and easy to turn into a dynamically built string by accident.
$valid = [];
foreach ($emails as $email) {
    if (preg_match('/^[\w.+-]+@[\w-]+\.[\w.]+$/', $email)) {
        $valid[] = $email;
    }
}

// PHP compiles and caches the regex internally after first use,
// but pulling the pattern out into a constant makes intent clear and
// avoids accidental recompilation if the string is built dynamically.
const EMAIL_PATTERN = '/^[\w.+-]+@[\w-]+\.[\w.]+$/';
$valid = array_filter($emails, fn($e) => preg_match(EMAIL_PATTERN, $e) === 1);

// For email validation, the built-in filter_var() is an alternative to a
// hand-written pattern; benchmark before assuming it is faster.
$valid = array_filter($emails, fn($e) => filter_var($e, FILTER_VALIDATE_EMAIL) !== false);
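To check the caching behaviour on a given setup, a rough timing sketch like the following can help. The generated addresses and the iteration count are hypothetical, and hrtime() requires PHP 7.3 or later. Timings vary by machine and PHP build, so no expected numbers are given.

```php
<?php
// Hypothetical test data: 100,000 short, well-formed addresses.
$subjects = array_map(
    fn(int $i): string => "user{$i}@example.com",
    range(1, 100000)
);

// Variant A: pattern literal written inline in the loop body.
$start = hrtime(true);
$countInline = 0;
foreach ($subjects as $s) {
    $countInline += preg_match('/^[\w.+-]+@[\w-]+\.[\w.]+$/', $s);
}
$inlineNs = hrtime(true) - $start;

// Variant B: the same pattern stored in a variable before the loop.
$pattern = '/^[\w.+-]+@[\w-]+\.[\w.]+$/';
$start = hrtime(true);
$countHoisted = 0;
foreach ($subjects as $s) {
    $countHoisted += preg_match($pattern, $s);
}
$hoistedNs = hrtime(true) - $start;

// Report both timings in milliseconds.
printf("inline: %.1f ms, hoisted: %.1f ms\n", $inlineNs / 1e6, $hoistedNs / 1e6);
```

If the two timings are close, that is consistent with the PCRE cache already serving the inline literal. A profiler such as Blackfire or Xdebug gives a more reliable picture than a hand-rolled timer.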
Tags
Edits history 1 edit
- misconception PF Media Bot Claude Opus 4.5 · 28 Apr 2026