Path Traversal
debt(d5/e3/b3/t7)
Closest to 'specialist tool catches it' (d5). The detection_hints list semgrep and psalm — both specialist SAST tools that can identify unsanitised user input flowing into file functions. A default linter won't catch it, and it won't surface as a compile error, but targeted static analysis rules (e.g. semgrep patterns matching fopen/readfile with $_GET/$_POST) can flag the common case. Not quite d7 because automated detection is explicitly marked 'yes' in the metadata.
Closest to 'simple parameterised fix' (e3). The quick_fix is a small, focused change: resolve the user-supplied path with realpath() and check, with str_starts_with() (PHP 8.0+), that the result begins with the allowed base directory followed by a directory separator. This is more than a one-line swap because each vulnerable call site needs the validation wrapper and a defined base path, but the change stays within a single component or utility function per file, with no cross-cutting architectural rework.
Closest to 'localised tax' (b3). The applies_to scope covers web and cli contexts in PHP, but the fix is a localised validation pattern at file-access call sites. Once a safe file-access helper is extracted, the rest of the codebase is largely unaffected. It does not impose a strong gravitational pull on unrelated components.
Closest to 'serious trap — contradicts how a similar concept works elsewhere' (t7). The misconception field directly describes the trap: developers believe naive string-stripping (str_replace('../', '')) prevents path traversal, but attackers use encoded variants (..%2F, %2e%2e%2f, ....//), unicode sequences, and double-encoding that survive the replacement. This is a well-documented gotcha that contradicts the intuition that 'remove the bad string = safe', making it a serious cognitive trap one step below catastrophic.
Also Known As
TL;DR
Explanation
Path traversal (also called directory traversal) lets attackers read or include arbitrary files on the server by supplying sequences like ../../etc/passwd in a parameter used to build a file path. In PHP, include/require with user-supplied filenames is the classic vector. Mitigation requires resolving the canonical path with realpath(), which returns false if the file does not exist, and asserting that the result starts with the intended base directory. Where possible, whitelist known-good filenames instead of accepting paths at all.
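The whitelist approach can be sketched as a lookup table, so that user input selects a file but never becomes part of the path. This is a minimal sketch, not a definitive implementation: the constant ALLOWED_PAGES, the function page_file(), and the page names are all hypothetical and chosen only for illustration.

```php
<?php
// Hypothetical allow-list: public page keys mapped to files the app is willing to include.
const ALLOWED_PAGES = [
    'home'    => 'home.php',
    'about'   => 'about.php',
    'contact' => 'contact.php',
];

/**
 * Returns the file name for a requested page key, or null when the key
 * is not on the allow-list. The user-supplied value is only ever used
 * as an array key, never concatenated into a path.
 */
function page_file(string $key): ?string
{
    return ALLOWED_PAGES[$key] ?? null;
}

var_dump(page_file('about'));            // string(9) "about.php"
var_dump(page_file('../../etc/passwd')); // NULL
```

Because unknown keys simply miss the lookup, there is no traversal sequence to filter or decode in the first place.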
How It's Exploited
# Reads /etc/passwd if not validated
Diagram
flowchart TD
INPUT2[User input: ../../../etc/passwd] --> APP_READ[App reads<br/>uploads/../../../etc/passwd]
APP_READ --> SECRET[Reads /etc/passwd!]
subgraph Fix2
REALPATH2[realpath resolves all .. and symlinks]
BASEDIR[Check resolved path starts with allowed dir]
REALPATH2 --> BASEDIR
BASEDIR -->|outside allowed| REJECT3[Reject - path traversal]
BASEDIR -->|inside allowed| ALLOW2[Allow access]
end
style APP_READ fill:#f85149,color:#fff
style SECRET fill:#f85149,color:#fff
style REJECT3 fill:#238636,color:#fff
Common Misconception
Why It Matters
Common Mistakes
- Passing user-supplied filenames to file_get_contents(), readfile(), or fopen() without normalisation.
- Filtering traversal sequences with str_replace('../', ''), which is bypassed with ....// because removing the inner ../ leaves a new ../ behind.
- Not verifying that the realpath() of the requested file is within the intended directory.
- Allowing the full filesystem path to be specified via a 'file' or 'path' query parameter.
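The str_replace() mistake in the list above can be reproduced in a few lines. The sketch below assumes a hypothetical filter function, naive_strip(), and a code path that URL-decodes the value after filtering; neither is taken from a real application.

```php
<?php
// Hypothetical naive filter of the kind described above: remove "../" in a single pass.
function naive_strip(string $input): string
{
    return str_replace('../', '', $input);
}

// Removing the inner "../" from each "....//" leaves a fresh "../" behind.
$nested = naive_strip('....//....//etc/passwd');

// An encoded separator contains no literal "../", so the filter changes nothing;
// decoding afterwards (assumed here) restores the traversal sequence.
$encoded = urldecode(naive_strip('..%2F..%2Fetc/passwd'));

echo $nested, PHP_EOL;  // ../../etc/passwd
echo $encoded, PHP_EOL; // ../../etc/passwd
```

Both inputs pass through the filter and still end up as a working traversal path, which is why validation should happen on the resolved path and not on the raw string.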
Code Examples
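// Vulnerable: user input is concatenated straight into the path
$file = $_GET['file'];
readfile('/var/www/uploads/' . $file);

// Fixed: resolve the canonical path and confirm it stays inside the base directory
$base = realpath('/var/www/uploads');
$requested = realpath($base . '/' . $_GET['file']);
// realpath() returns false for missing files; the trailing separator stops /var/www/uploads_evil from matching
if ($requested === false || !str_starts_with($requested, $base . DIRECTORY_SEPARATOR)) {
    http_response_code(403);
    exit;
}
readfile($requested);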
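If the same check is needed at several call sites, it can be extracted into a helper. The following is a sketch under stated assumptions: the function name resolve_within_base() is hypothetical, str_starts_with() requires PHP 8.0 or later, and the demo builds a throwaway directory under the system temp directory instead of touching /var/www/uploads.

```php
<?php
/**
 * Hypothetical helper: resolves $userPath inside $baseDir and returns the
 * canonical path, or null if the target does not exist or escapes the base.
 */
function resolve_within_base(string $baseDir, string $userPath): ?string
{
    $base = realpath($baseDir);
    if ($base === false) {
        return null; // base directory missing or unreadable
    }

    $resolved = realpath($base . DIRECTORY_SEPARATOR . $userPath);
    if ($resolved === false) {
        return null; // target does not exist
    }

    // The trailing separator stops a sibling such as "uploads_evil" from matching "uploads".
    $prefix = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;

    return str_starts_with($resolved, $prefix) ? $resolved : null;
}

// Demo fixture in a temporary directory (names are illustrative only).
$root = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'pt_demo_' . uniqid();
$uploads = $root . DIRECTORY_SEPARATOR . 'uploads';
mkdir($uploads, 0700, true);
file_put_contents($uploads . DIRECTORY_SEPARATOR . 'report.txt', 'ok');
file_put_contents($root . DIRECTORY_SEPARATOR . 'secret.txt', 'top secret');

var_dump(resolve_within_base($uploads, 'report.txt') !== null); // bool(true)
var_dump(resolve_within_base($uploads, '../secret.txt'));       // NULL
```

Returning null for every failure case keeps the caller's decision simple: respond with an error unless a path comes back.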