Regex in Loop
Also Known As
regex performance
preg_match loop
compiled regex
TL;DR
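Running the same regular expression on every iteration of a loop. PHP caches the compiled pattern, so the cost comes from repeated call overhead, pattern strings rebuilt each iteration, and backtracking. Keep the pattern constant and define it outside the loop.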
Explanation
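PHP compiles a regex pattern the first time preg_match() or preg_replace() sees it and keeps the compiled form in an internal PCRE pattern cache keyed by the pattern string, so repeating an identical pattern in a loop does not recompile it. The cost of a regex in a loop comes from elsewhere: function-call and matching overhead repeated thousands of times per request, pattern strings that are rebuilt on each iteration (every distinct string is compiled separately), and backtracking on complex patterns. The fix is to define the pattern as a constant or variable before the loop, then reference it inside. For simple membership tests on many strings, preg_grep() on the whole array is often faster than a per-element loop.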
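As a rough illustration of the preg_grep() suggestion, the sketch below filters the same array twice: once with a per-element loop and once with a single batch call. The sample data and the variable names $validLoop and $validBatch are made up for this example; the pattern is the one used in the code examples of this entry.

```php
<?php
// Pattern defined once, outside any loop.
const EMAIL_PATTERN = '/^[\w.+-]+@[\w-]+\.[\w.]+$/';

// Hypothetical sample data for illustration.
$emails = ['alice@example.com', 'not-an-email', 'bob@example.org'];

// Per-element loop: one preg_match() call per string.
$validLoop = [];
foreach ($emails as $email) {
    if (preg_match(EMAIL_PATTERN, $email) === 1) {
        $validLoop[] = $email;
    }
}

// Batch form: a single preg_grep() call over the whole array.
// preg_grep() preserves the original keys, so array_values() reindexes.
$validBatch = array_values(preg_grep(EMAIL_PATTERN, $emails));
```

Both forms select the same elements. Whether the batch call is measurably faster depends on the array size and the pattern, so it is worth profiling before rewriting existing loops.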
Common Misconception
✗ "Regex patterns must be extracted outside loops to avoid recompilation." PHP automatically caches compiled regexes in its internal PCRE pattern cache, so the same pattern string reuses the compiled form. This cache is separate from the pcre.jit setting, which only controls JIT compilation of patterns. The real performance risks are: (1) dynamically building pattern strings that differ each iteration, defeating the cache; (2) catastrophic backtracking on complex patterns with adversarial input (ReDoS).
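One way to avoid risk (1) is to build a single pattern before the loop instead of a new pattern per iteration. The sketch below assumes a keyword-matching task; the $keywords and $lines data are hypothetical, and combining the keywords into one alternation is one possible approach rather than the only one.

```php
<?php
// Hypothetical inputs for illustration.
$keywords = ['error', 'warn', 'fail.'];
$lines = ['disk error on sda', 'all good', 'job fail. retrying'];

// Build ONE pattern before the loop. preg_quote() escapes regex
// metacharacters, plus the '/' delimiter passed as the second argument.
$alternatives = array_map(fn($k) => preg_quote($k, '/'), $keywords);
$pattern = '/(?:' . implode('|', $alternatives) . ')/i';

$flagged = [];
foreach ($lines as $line) {
    // The pattern string is identical on every iteration,
    // so PHP can reuse the cached compiled form.
    if (preg_match($pattern, $line) === 1) {
        $flagged[] = $line;
    }
}
```

The anti-pattern this replaces is a nested loop that concatenates a fresh pattern string for each keyword and each line, which produces many distinct strings for the cache to hold.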
Why It Matters
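Calling preg_match() or preg_replace() inside a hot loop multiplies the per-call overhead and any backtracking cost by the number of iterations, and a pattern string that changes on each iteration is compiled afresh every time. Keep the pattern constant, define it outside the loop, and prefer batch functions such as preg_grep() on hot paths.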
Common Mistakes
- Building regex patterns dynamically inside a loop: this costs a string concatenation plus a fresh compilation on every iteration.
- Using regex for simple string operations that strpos() or str_contains() (PHP 8.0+) handle faster.
- Assuming the PCRE pattern cache is unlimited: PHP does cache compiled regexes, but the cache has a limited size and entries can be evicted.
- Applying complex regexes to unbounded user input without length limits, which opens the door to ReDoS.
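The last three points can be combined in one small validation helper. This is a minimal sketch, not a recommended email validator: the function name looksLikeEmail() and the MAX_INPUT_LENGTH value are assumptions made for illustration, and str_contains() requires PHP 8.0 or later.

```php
<?php
// Hypothetical limit; choose one that fits the field being validated.
const MAX_INPUT_LENGTH = 254;

function looksLikeEmail(string $input): bool
{
    // Cheap checks first: reject oversized input and strings without '@'
    // before the regex engine is involved at all.
    if (strlen($input) > MAX_INPUT_LENGTH || !str_contains($input, '@')) {
        return false;
    }

    $result = preg_match('/^[\w.+-]+@[\w-]+\.[\w.]+$/', $input);

    // preg_match() returns false on failure, for example when
    // pcre.backtrack_limit is exceeded. Treat that as "no match".
    if ($result === false) {
        return false;
    }

    return $result === 1;
}
```

The length check bounds how much work the regex engine can be asked to do, and the explicit false check keeps an engine failure from being mistaken for a successful match.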
Code Examples
✗ Vulnerable
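// Same literal pattern on every iteration: PHP reuses the cached compiled
// form, but this still makes one preg_match() call per element
$valid = [];
foreach ($emails as $email) {
    if (preg_match('/^[\w.+-]+@[\w-]+\.[\w.]+$/', $email)) {
        $valid[] = $email;
    }
}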
✓ Fixed
// PHP compiles and caches the regex internally after first use,
// but pulling the constant pattern out makes intent clear and
// avoids accidental recompilation if the string is built dynamically
const EMAIL_PATTERN = '/^[\w.+-]+@[\w-]+\.[\w.]+$/';
$valid = array_filter($emails, fn($e) => preg_match(EMAIL_PATTERN, $e) === 1);
// Alternative: filter_var() needs no hand-written pattern, but it applies
// different rules, so results can differ; benchmark before assuming it is faster
$valid = array_filter($emails, fn($e) => filter_var($e, FILTER_VALIDATE_EMAIL) !== false);
Added
15 Mar 2026
Edited
28 Apr 2026
AI edit
PF Media Bot
Claude Opus 4.5 on misconception · 28 Apr 2026
Edits history 1 edit
- misconception PF Media Bot Claude Opus 4.5 · 28 Apr 2026
⚡
DEV INTEL
Tools & Severity
🟠 High
⚙ Fix effort: Low
⚡ Quick Fix
Extract regex patterns to class constants so the pattern string stays identical across calls and hits PHP's compiled-pattern cache. Patterns rebuilt inside a loop can miss the cache and waste CPU on recompilation. Use preg_match_all() for batch matching.
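A minimal sketch of the preg_match_all() suggestion: instead of calling preg_match() repeatedly with an advancing offset, one call collects every match in a text. The sample text and the simplified, unanchored address pattern are assumptions chosen for illustration.

```php
<?php
// Hypothetical sample text for illustration.
$text = 'Contact alice@example.com or bob@example.org for details.';

// One call returns the number of matches and fills $matches[0]
// with every full match found in the text.
$count = preg_match_all('/[\w.+-]+@[\w-]+\.\w+/', $text, $matches);

$addresses = $matches[0];
```

Here $addresses holds both addresses found in the text, in order of appearance.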
📦 Applies To
PHP 5.0+
web
cli
queue-worker
🔗 Prerequisites
🔍 Detection Hints
preg_match() called in tight loop with same pattern; PCRE pattern as string literal inside foreach; regex compilation overhead in batch processing
Auto-detectable:
✓ Yes
blackfire
xdebug
⚠ Related Problems
🤖 AI Agent
Confidence: High
False Positives: Low
✓ Auto-fixable
Fix: Low
Context: Function