Path Normalisation Bypass
Also Known As
path canonicalization
URL normalization
path canonicalisation
TL;DR
Using ../, URL encoding (%2f), or OS-specific separators to escape intended directory boundaries and access files outside an allowlisted path.
Explanation
Path normalisation attacks exploit the gap between how an application validates a path and how the OS resolves it. Common techniques include directory traversal (../../etc/passwd), URL-encoded separators (%2F, and %5C on Windows), double encoding (%252F), null bytes (file.php%00.jpg in PHP before 5.3.4), and Windows UNC paths. PHP's realpath() resolves symlinks and traversal sequences to a canonical absolute path, and returns false when the path does not exist. Use it to validate that the resolved path sits inside the intended base directory, comparing against the base plus a trailing directory separator. Use basename() when you only need the filename component. Never build a file path by concatenating user input and then trusting a string filter: a filter on ../ is bypassable, whereas a check on the realpath() result operates on the path the OS will actually open.
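The resolve-then-compare approach described above can be sketched as follows. This is a minimal illustration, assuming PHP 8.0+ (for str_starts_with()); the helper name resolveInsideBase() and the temporary directory layout are hypothetical and not part of any library.

```php
<?php
/**
 * Hypothetical helper: resolve $userPath under $baseDir, or return null
 * if the result does not exist or falls outside the base directory.
 */
function resolveInsideBase(string $baseDir, string $userPath): ?string
{
    $base = realpath($baseDir);
    if ($base === false) {
        return null; // the base directory itself is missing
    }

    // realpath() returns false for paths that do not exist.
    $resolved = realpath($base . DIRECTORY_SEPARATOR . $userPath);
    if ($resolved === false) {
        return null;
    }

    // Compare against the base plus a trailing separator so that a sibling
    // such as "/srv/files_private" does not match "/srv/files".
    $prefix = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    return str_starts_with($resolved, $prefix) ? $resolved : null;
}

// Demo against a throwaway directory tree (names are illustrative).
$root = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'pnb_demo_' . bin2hex(random_bytes(4));
mkdir($root . '/files', 0700, true);
file_put_contents($root . '/files/report.txt', 'ok');
file_put_contents($root . '/secret.txt', 'nope');

var_dump(resolveInsideBase($root . '/files', 'report.txt') !== null);    // bool(true): inside the base
var_dump(resolveInsideBase($root . '/files', '../secret.txt') === null); // bool(true): escape is rejected
```

Because realpath() only succeeds for existing paths, a helper like this suits reads and downloads; validating a path for a file that is about to be created needs a different check, for example resolving the parent directory instead.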
How It's Exploited
GET /download?file=../../etc/passwd
GET /download?file=..%2F..%2Fetc%2Fpasswd # URL-encoded
GET /download?file=....//....//etc/passwd # doubled-dot bypass
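To see why requests like these defeat string filtering, here is a small sketch of a naive filter and two inputs that slip past it. The function naiveStrip() is a hypothetical stand-in for the kind of str_replace() filter often found in practice, and the first urldecode() call simulates the single decoding pass PHP applies to $_GET values.

```php
<?php
// Hypothetical naive filter: strip "../" once and hope for the best.
function naiveStrip(string $input): string
{
    return str_replace('../', '', $input);
}

// str_replace() makes a single pass, so removing the inner "../" of each
// "....//" group leaves a fresh "../../" behind.
echo naiveStrip('....//....//etc/passwd'), PHP_EOL; // ../../etc/passwd

// A double-encoded value still contains "%2F" after PHP's automatic decoding
// of $_GET. The filter sees no "../", and a later urldecode() restores it.
$afterAutoDecode = urldecode('..%252F..%252Fetc%252Fpasswd'); // simulates $_GET decoding
$filtered = naiveStrip($afterAutoDecode);
echo urldecode($filtered), PHP_EOL; // ../../etc/passwd
```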
Common Misconception
✗ Checking whether a path contains ../ is sufficient to prevent traversal. Encoded variants (%2e%2e%2f), double encoding, and OS-specific separators survive naive string checks. Always resolve the full canonical path with realpath() and verify that it starts with the allowed base directory followed by a directory separator.
Why It Matters
Comparing or restricting paths before normalisation allows bypass via sequences like /var/www/../../etc/passwd, which passes a string check for the /var/www/ prefix but resolves to /etc/passwd.
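The difference between checking before and after normalisation can be shown without touching the filesystem. The sketch below assumes PHP 8.0+ and uses a hypothetical lexical helper, collapseDots(), in place of realpath() so that it runs on machines where /var/www does not exist. Unlike realpath(), it does not resolve symlinks, so it is for illustration only.

```php
<?php
// Hypothetical helper: collapse "." and ".." segments in an absolute
// POSIX-style path purely lexically (no filesystem access, no symlinks).
function collapseDots(string $path): string
{
    $out = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '' || $segment === '.') {
            continue;
        }
        if ($segment === '..') {
            array_pop($out); // popping an empty stack stays at the root
            continue;
        }
        $out[] = $segment;
    }
    return '/' . implode('/', $out);
}

$requested = '/var/www/../../etc/passwd';

var_dump(str_starts_with($requested, '/var/www/'));               // bool(true): the raw string passes
var_dump(collapseDots($requested));                               // string(11) "/etc/passwd"
var_dump(str_starts_with(collapseDots($requested), '/var/www/')); // bool(false): the normalised path fails
```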
Common Mistakes
- Checking if a path starts with an allowed prefix before calling realpath() — the check is against the un-normalised string.
- Filtering ../ sequences with str_replace() but missing URL-encoded variants %2e%2e%2f.
- Not verifying that realpath() output is still within the intended base directory after resolution.
- Relying on basename() alone for security: it strips directory components, but the remaining filename (for example .htaccess or an uploaded .php file) may still be dangerous.
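For the basename() case in the list above, one possible mitigation is to pair it with an explicit filename allowlist. The sketch below is an assumption about what such a policy might look like; the function name isSafeDownloadName(), the character set, and the extension list are all hypothetical and should be adapted to the application.

```php
<?php
// Hypothetical policy: lowercase letters, digits, "_" and "-", then one
// extension from a short allowlist. Adjust to your own naming rules.
function isSafeDownloadName(string $name): bool
{
    if ($name !== basename($name)) {
        return false; // the input contained a directory component
    }
    return preg_match('/\A[a-z0-9_-]{1,64}\.(?:pdf|png|txt)\z/', $name) === 1;
}

var_dump(isSafeDownloadName('report.pdf'));       // bool(true)
var_dump(isSafeDownloadName('../../etc/passwd')); // bool(false): has directory components
var_dump(isSafeDownloadName('.htaccess'));        // bool(false): nothing before the dot
var_dump(isSafeDownloadName('shell.php'));        // bool(false): extension not allowlisted
```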
Code Examples
✗ Vulnerable
$file = $_GET['file'];
readfile('/var/www/files/' . $file); // ../../etc/passwd works
✓ Fixed
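$base = realpath('/var/www/files');
$path = $base === false ? false : realpath($base . '/' . $_GET['file']);
// Require the trailing separator so /var/www/files_old does not match.
if ($path === false || !str_starts_with($path, $base . DIRECTORY_SEPARATOR)) {
    http_response_code(403);
    exit;
}
readfile($path);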
Added
15 Mar 2026
Edited
22 Mar 2026
⚡
DEV INTEL
Tools & Severity
🟠 High
⚙ Fix effort: Low
⚡ Quick Fix
After resolving with realpath(), verify that the result starts with your base directory plus a trailing slash, using str_starts_with($resolved, $baseDir.'/') (PHP 8.0+). The trailing slash prevents prefix attacks.
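A short illustration of the prefix attack the trailing slash guards against, assuming PHP 8.0+. Both paths are hypothetical, and $sibling stands in for a value already returned by realpath().

```php
<?php
$baseDir = '/var/www/files';
$sibling = '/var/www/files_private/secret.txt'; // hypothetical resolved path

var_dump(str_starts_with($sibling, $baseDir));       // bool(true): sibling directory slips through
var_dump(str_starts_with($sibling, $baseDir . '/')); // bool(false): trailing slash rejects it
```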
📦 Applies To
PHP 5.0+
web
cli
🔗 Prerequisites
🔍 Detection Hints
Manual ../ stripping instead of realpath() canonicalisation; no str_starts_with base dir check after realpath
Auto-detectable:
✓ Yes
semgrep
psalm
⚠ Related Problems
🤖 AI Agent
Confidence: High
False Positives: Medium
✓ Auto-fixable
Fix: Low
Context: Line
CWE-22
CWE-23