
Regex in Loop

Performance PHP 5.0+ Intermediate
debt(d5/e3/b3/t7)
d5 Detectability Operational debt — how invisible misuse is to your safety net

Closest to 'specialist tool catches it' (d5) — profilers like Blackfire or Xdebug (from detection_hints.tools) reveal regex hotspots; not visible to linters and silent unless profiled.

e3 Effort Remediation debt — work required to fix once spotted

Closest to 'simple parameterised fix' (e3) — quick_fix says extract pattern to class constant or use preg_match_all; small refactor within the affected function/class.

b3 Burden Structural debt — long-term weight of choosing wrong

Closest to 'localised tax' (b3) — applies to hot loops in specific components; doesn't shape system architecture but recurs across web/cli/queue contexts where batch processing happens.

t7 Trap Cognitive debt — how counter-intuitive correct behaviour is

Closest to 'serious trap' (t7) — misconception explicitly states devs wrongly believe extraction prevents recompilation, when PHP's PCRE cache already handles that; the real traps (dynamic pattern building defeating cache, ReDoS) contradict the common mental model.


Also Known As

regex performance, preg_match loop, compiled regex

TL;DR

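Running a regular expression on every iteration of a loop. PHP caches the compiled pattern, so the real costs are patterns rebuilt dynamically per iteration and backtracking on complex patterns; keep the pattern constant and batch the work where possible.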

Explanation

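PHP compiles a regex pattern the first time preg_match() or preg_replace() sees it and stores the compiled form in an internal PCRE cache, so repeating the same pattern string inside a loop does not recompile it. What remains on each iteration is the function-call overhead and the match itself, which can backtrack heavily on complex patterns. Patterns built dynamically inside the loop are the exception: every distinct pattern string is compiled separately. Defining the pattern as a constant or variable before the loop keeps it in one place and guarantees the string stays identical, but that is a readability measure, not a compilation fix. For simple membership tests on many strings, preg_grep() on the whole array is often faster than a per-element loop.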
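
As a minimal sketch of the preg_grep() approach mentioned above: the sample addresses are hypothetical, and the pattern is the one used in this entry's code examples.

```php
<?php
// Hypothetical sample data for illustration.
$emails = ['alice@example.com', 'not-an-email', 'bob.smith@mail.example.org'];
$pattern = '/^[\w.+-]+@[\w-]+\.[\w.]+$/';

// One call filters the whole array; preg_grep() preserves the original keys.
$valid = preg_grep($pattern, $emails);

// Re-index when sequential keys are needed.
$valid = array_values($valid);

print_r($valid);
```

Whether this beats a per-element loop depends on array size and pattern complexity, so it is worth profiling before and after.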

Common Misconception

"Regex patterns must be extracted outside loops to avoid recompilation." In fact, PHP's PCRE extension keeps an internal per-process cache of compiled patterns keyed by the pattern string, so an identical pattern reuses its compiled form no matter where the string is defined. This cache is separate from PCRE's JIT compiler, which the pcre.jit setting controls. The real performance risks are: (1) dynamically building pattern strings that differ each iteration, so each variant is compiled and cached separately; (2) catastrophic backtracking on complex patterns with adversarial input (ReDoS).

Why It Matters

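Calling preg_match() or preg_replace() inside a loop is cheap when the pattern is constant, because the compiled form is cached. It becomes expensive when the pattern changes per iteration or backtracks badly on the input. Keep patterns constant, bound input length, and prefer batch functions such as preg_grep() on hot paths.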
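
One case where compilation really does repeat is a pattern string assembled inside the loop, since each distinct string is compiled at least once. The sketch below shows one possible alternative: quote the variable parts with preg_quote() and join them into a single alternation built before the loop. The keywords and log lines are hypothetical.

```php
<?php
// Hypothetical inputs for illustration.
$keywords = ['error', 'timeout', 'c++'];
$lines = ['Connection timeout after 30s', 'All good', 'C++ build failed'];

// Anti-pattern (shown as a comment): a different pattern string per keyword,
// built and compiled inside the loop.
// foreach ($keywords as $kw) { preg_match('/' . preg_quote($kw, '/') . '/i', $line); }

// Alternative: escape each keyword, then build one alternation before the loop.
$quoted = array_map(fn(string $kw): string => preg_quote($kw, '/'), $keywords);
$pattern = '/(?:' . implode('|', $quoted) . ')/i';

$hits = [];
foreach ($lines as $line) {
    // The pattern string is identical on every call, so the cached form is reused.
    if (preg_match($pattern, $line) === 1) {
        $hits[] = $line;
    }
}

print_r($hits);
```

This assumes the keyword list is small enough for a single alternation; very long lists may be better served by plain string functions.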

Common Mistakes

  • Building regex patterns dynamically inside a loop — string concatenation plus compilation on every iteration.
  • Using regex for simple string operations that strpos() or str_contains() handle faster.
  • Assuming the PCRE cache is unlimited: PHP caches compiled regexes, but the cache has a fixed size (4096 entries per process in current PHP versions) and entries are evicted once it fills.
  • Applying complex regexes to unbounded user input without length limits — potential ReDoS.
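
Two of the mistakes above can be sketched together. safe_match() is a hypothetical helper, not a PHP built-in, and its 1024-byte default cap is an arbitrary assumption; it rejects oversized input and surfaces PCRE failures instead of treating them as "no match". The last line shows a literal substring check that needs no regex.

```php
<?php
// Hypothetical helper: caps input length before matching and reports PCRE errors.
function safe_match(string $pattern, string $input, int $maxLength = 1024): bool
{
    if (strlen($input) > $maxLength) {
        return false; // reject oversized input before the regex engine sees it
    }

    $result = preg_match($pattern, $input);
    if ($result === false) {
        // preg_match() returns false on failure, e.g. when pcre.backtrack_limit
        // is exceeded; preg_last_error() returns the matching error code.
        throw new RuntimeException('PCRE error code: ' . preg_last_error());
    }

    return $result === 1;
}

// Literal substring checks need no regex (str_contains() requires PHP 8.0+).
$hasScheme = str_contains('https://example.com', '://');
```

A length cap reduces exposure to ReDoS but does not remove it; rewriting the pattern to avoid nested quantifiers is the more durable fix.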

Code Examples

✗ Vulnerable
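// Same literal pattern on every iteration: PHP compiles it once and reuses the
// cached form, so the remaining cost is one preg_match() call per element
$valid = [];
foreach ($emails as $email) {
    if (preg_match('/^[\w.+-]+@[\w-]+\.[\w.]+$/', $email)) {
        $valid[] = $email;
    }
}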
✓ Fixed
// PHP caches the compiled pattern after first use, so a named constant does not
// change compilation cost. It documents intent and keeps the pattern string
// identical everywhere, which guards against accidental dynamic variants.
const EMAIL_PATTERN = '/^[\w.+-]+@[\w-]+\.[\w.]+$/';

$valid = array_filter($emails, fn($e) => preg_match(EMAIL_PATTERN, $e));

// Alternative: PHP's built-in validator. It applies different rules from the
// pattern above, so the two can accept different addresses; profile before
// assuming it is faster.
$valid = array_filter($emails, fn($e) => filter_var($e, FILTER_VALIDATE_EMAIL));

Added 15 Mar 2026
Edited 28 Apr 2026
AI edit PF Media Bot Claude Opus 4.5 on misconception · 28 Apr 2026
DEV INTEL Tools & Severity
🟠 High ⚙ Fix effort: Low
⚡ Quick Fix
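Keep regex patterns in class constants so the string is identical on every call. PHP caches compiled PCRE patterns, so a constant pattern is compiled once; the CPU cost in a hot loop comes from dynamically built patterns and from the matching itself. Use preg_match_all() or preg_grep() for batch matching.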
📦 Applies To
PHP 5.0+ web cli queue-worker
🔗 Prerequisites
🔍 Detection Hints
preg_match() called in tight loop with same pattern; PCRE pattern as string literal inside foreach; regex compilation overhead in batch processing
Auto-detectable: ✓ Yes blackfire xdebug
⚠ Related Problems
🤖 AI Agent
Confidence: High False Positives: Low ✓ Auto-fixable Fix: Low Context: Function