
Path Normalisation Bypass

security CWE-22 OWASP A01:2021 CVSS 7.5 PHP 5.0+ Intermediate

Also Known As

path canonicalization, URL normalization, path canonicalisation

TL;DR

Attackers use ../ sequences, URL-encoded separators (%2f), or OS-specific separators to escape an intended directory boundary and reach files outside the allowed path.

Explanation

Path normalisation attacks exploit the gap between how an application validates a path and how the OS resolves it. Common techniques include directory traversal (../../etc/passwd), URL-encoded separators (%2F, and %5C for the backslash on Windows), double encoding (%252F), null bytes (file.php%00.jpg, which truncated paths in PHP versions before 5.3.4), and Windows UNC paths. PHP's realpath() resolves symlinks and traversal sequences to a canonical absolute path, and returns false when the path does not exist. Use it to validate input: resolve first, then check that the result starts with the intended base directory followed by a directory separator. Use basename() when you only need the filename component. Never build file paths by concatenating user input directly, even after filtering: a filter on ../ is bypassable, whereas a prefix check on the realpath() result that includes the trailing separator is far harder to evade.
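The resolve-then-check approach described above can be sketched as a small helper. The function name resolveInsideBase() is hypothetical and not part of PHP; the sketch assumes PHP 7.1+ for the type declarations and uses strncmp() so that it does not depend on str_starts_with() from PHP 8.0.

```php
<?php
// Hypothetical helper (not a PHP built-in): returns the canonical path when
// $userPath resolves inside $baseDir, or null otherwise.
function resolveInsideBase(string $baseDir, string $userPath): ?string
{
    // Reject embedded null bytes up front.
    if (strpos($userPath, "\0") !== false) {
        return null;
    }

    $base = realpath($baseDir);
    if ($base === false) {
        return null; // base directory is missing or unreadable
    }

    // realpath() collapses ../ sequences and symlinks; it returns false
    // when the target does not exist.
    $resolved = realpath($base . DIRECTORY_SEPARATOR . $userPath);
    if ($resolved === false) {
        return null;
    }

    // Compare against the base plus a trailing separator so that a sibling
    // directory such as /var/www/files_private does not pass the prefix test.
    $prefix = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    if (strncmp($resolved, $prefix, strlen($prefix)) !== 0) {
        return null;
    }

    return $resolved;
}
```

A caller would treat a null result as "forbidden or not found" and only pass a non-null result to readfile(). Because realpath() needs the target to exist, this sketch suits reads and downloads rather than creating new files.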

How It's Exploited

GET /download?file=../../etc/passwd
GET /download?file=..%2F..%2Fetc%2Fpasswd # URL-encoded
GET /download?file=....//....//etc/passwd # becomes ../../etc/passwd after a single-pass strip of ../
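To see why these payloads get past string filters, the sketch below runs them through a naive single-pass filter and through URL decoding. The function stripTraversalOnce() is a hypothetical stand-in for the kind of filter an application might use; the payload strings are the ones listed above, plus a double-encoded variant.

```php
<?php
// Hypothetical naive filter: removes every literal "../" in a single pass.
function stripTraversalOnce(string $input): string
{
    return str_replace('../', '', $input);
}

// Doubled-dot payload: removing the inner "../" leaves a working "../" behind.
$doubled = stripTraversalOnce('....//....//etc/passwd');

// Encoded payload: there is no literal "../" to strip, so the filter changes
// nothing, and a later decode step restores the separators.
$encoded = rawurldecode(stripTraversalOnce('..%2F..%2Fetc%2Fpasswd'));

// Double-encoded payload: one decode yields %2F, a second decode yields "/".
$onceDecoded  = rawurldecode('..%252F..%252Fetc%252Fpasswd');
$twiceDecoded = rawurldecode($onceDecoded);
```

In each case the filter sees a harmless-looking string while the filesystem call eventually receives ../../etc/passwd. PHP decodes $_GET values once on its own, so the encoded cases matter mainly when the application decodes a second time after validating.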

Common Misconception

Misconception: checking whether a path contains ../ is sufficient to prevent traversal. In practice, encoded variants (%2e%2e%2f), double encoding, and OS-specific separators survive naive string checks. Resolve the full canonical path with realpath() and verify that it starts with the allowed base directory plus a trailing separator.

Why It Matters

Comparing or restricting paths before normalisation allows bypass: a path such as /var/www/../../etc/passwd starts with the allowed prefix /var/www/ as a string, yet resolves to /etc/passwd.
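The effect of checking before normalising can be shown without touching the filesystem. The sketch below uses a hypothetical normaliseLexically() helper that collapses . and .. segments in an absolute POSIX-style path; unlike realpath() it does not resolve symlinks, so it is meant as an illustration rather than a replacement.

```php
<?php
// Hypothetical helper: collapses "." and ".." segments lexically.
// Assumes an absolute, "/"-separated path and does not resolve symlinks.
function normaliseLexically(string $path): string
{
    $segments = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '' || $segment === '.') {
            continue; // skip empty and current-directory segments
        }
        if ($segment === '..') {
            array_pop($segments); // step up; at the root this is a no-op
            continue;
        }
        $segments[] = $segment;
    }
    return '/' . implode('/', $segments);
}

$allowed = '/var/www/';
$raw     = '/var/www/../../etc/passwd';

// Check on the raw string: the prefix matches, so the path is accepted.
$passesRawCheck = strncmp($raw, $allowed, strlen($allowed)) === 0;

// Check after normalisation: the path is now /etc/passwd and is rejected.
$normalised            = normaliseLexically($raw);
$passesNormalisedCheck = strncmp($normalised, $allowed, strlen($allowed)) === 0;
```

The same input passes or fails depending only on whether the comparison happens before or after normalisation.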

Common Mistakes

  • Checking if a path starts with an allowed prefix before calling realpath() — the check is against the un-normalised string.
  • Filtering ../ sequences with str_replace() but missing URL-encoded variants %2e%2e%2f.
  • Not verifying that realpath() output is still within the intended base directory after resolution.
  • Relying on basename() alone for security: it strips the directory part, but the remaining filename (for example .htaccess or an uploaded .php file) may still be dangerous.
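For the last point, one possible complement to basename() is an allowlist on the filename itself. The sketch below is illustrative only: isSafeDownloadName() is a hypothetical name, and the permitted characters and extensions (pdf, png, txt) are assumptions that a real application would replace with its own rules.

```php
<?php
// Hypothetical validator: accepts only plain file names with an allowed
// extension. The character set and extension list are assumptions.
function isSafeDownloadName(string $name): bool
{
    // Reject anything that still carries a directory component.
    if ($name !== basename($name)) {
        return false;
    }

    // Letters, digits, underscore or hyphen, then exactly one allowed extension.
    return preg_match('/\A[A-Za-z0-9_-]+\.(?:pdf|png|txt)\z/', $name) === 1;
}
```

With these rules, report.pdf is accepted, while ../../etc/passwd, .htaccess, shell.php, and report.pdf.php are all rejected. The result of realpath() should still be checked against the base directory before the file is opened.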

Code Examples

✗ Vulnerable
$file = $_GET['file'];
readfile('/var/www/files/' . $file); // ../../../etc/passwd escapes the base directory
✓ Fixed
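// Requires PHP 8.0+ for str_starts_with()
$base = realpath('/var/www/files');
// realpath() returns false if the requested file does not exist
$path = realpath($base . '/' . (string) ($_GET['file'] ?? ''));
// The trailing separator stops siblings such as /var/www/files_private from matching
if ($base === false || $path === false || !str_starts_with($path, $base . DIRECTORY_SEPARATOR)) {
    http_response_code(403);
    exit;
}
readfile($path);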

Added 15 Mar 2026
Edited 22 Mar 2026
DEV INTEL Tools & Severity
🟠 High ⚙ Fix effort: Low
⚡ Quick Fix
After resolving with realpath(), verify that the result starts with your base directory using str_starts_with($resolved, $baseDir.'/') (PHP 8.0+). The trailing slash prevents prefix attacks.
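A prefix attack here means a sibling directory whose name merely begins with the base directory's name. The sketch below uses made-up paths and assumes both have already been canonicalised by realpath(); startsWith() is a hypothetical stand-in for str_starts_with() so that the example also runs before PHP 8.0.

```php
<?php
// Hypothetical stand-in for str_starts_with() (built in from PHP 8.0).
function startsWith(string $haystack, string $prefix): bool
{
    return strncmp($haystack, $prefix, strlen($prefix)) === 0;
}

// Illustrative paths, assumed to be realpath() output already.
$baseDir  = '/var/www/files';
$resolved = '/var/www/files_private/secret.txt';

// Without the trailing slash the sibling directory matches the base.
$withoutSlash = startsWith($resolved, $baseDir);

// With the trailing slash the sibling directory is rejected.
$withSlash = startsWith($resolved, $baseDir . '/');
```

Here $withoutSlash is true and $withSlash is false, which is why the separator belongs in the comparison.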
📦 Applies To
PHP 5.0+ web cli
🔗 Prerequisites
🔍 Detection Hints
Manual ../ stripping instead of realpath() canonicalisation; no str_starts_with base dir check after realpath
Auto-detectable: ✓ Yes semgrep psalm
⚠ Related Problems
🤖 AI Agent
Confidence: High False Positives: Medium ✓ Auto-fixable Fix: Low Context: Line
CWE-22 CWE-23
