filter_var()
debt(d5/e3/b3/t7)
Closest to 'specialist tool catches it' (d5). The detection_hints list phpstan and semgrep — both specialist static analysis tools — as the means to catch misuse patterns like manual regex email validation or treating sanitised output as validated. The misuse is not caught by the compiler or a default linter, requiring deliberate tooling setup.
Closest to 'simple parameterised fix' (e3). The quick_fix indicates replacing misuse with the appropriate filter_var($input, FILTER_VALIDATE_*) call, possibly combined with separate sanitisation for output context. This is a small, localised change — replacing a pattern within one component — not a single one-liner swap because multiple call sites and the sanitisation/validation distinction must be addressed.
Closest to 'localised tax' (b3). The choice applies to web and CLI contexts in PHP, but misuse of filter_var is localised to input-handling code rather than being load-bearing across the entire codebase. Each misuse is independent and doesn't create a gravitational pull on unrelated components.
Closest to 'serious trap' (t7). The misconception field states that FILTER_VALIDATE_EMAIL is widely believed to confirm deliverability, when it only checks address syntax (per the PHP manual, broadly the RFC 822 addr-spec with some exceptions). In addition, common_mistakes show that FILTER_SANITIZE_* is confused with validation and that FILTER_VALIDATE_URL silently accepts non-HTTP schemes, including javascript: and data: style URLs. These are several serious gaps between what the constant names imply and what the filters actually do, which puts the score near 't7' for contradicting reasonable developer expectations about similarly named concepts.
Also Known As
TL;DR
Explanation
filter_var($value, $filter) applies a single filter to a value. Validation filters (FILTER_VALIDATE_*) return the (possibly converted) value on success and false on failure: FILTER_VALIDATE_URL checks URL structure, FILTER_VALIDATE_EMAIL checks email format, and FILTER_VALIDATE_IP validates IP addresses. Sanitise filters (FILTER_SANITIZE_*) remove or encode unwanted characters and make no claim that the result is valid. Note that FILTER_VALIDATE_URL does not restrict the scheme, so it can accept javascript: and data: style URIs; an explicit scheme check is needed when the URL will be used in a redirect or a src attribute.
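The scheme caveat above can be handled with a second, explicit check. The following is a minimal sketch, not a definitive implementation: the helper name safeHttpUrl and the default allow-list of http and https are assumptions made for illustration, and a real application may need additional host or path rules.

```php
<?php

/**
 * Hypothetical helper (not part of PHP): returns the URL only if it is
 * structurally valid AND uses an allow-listed scheme; otherwise null.
 */
function safeHttpUrl(string $candidate, array $allowedSchemes = ['http', 'https']): ?string
{
    // Step 1: structural check only. This does not restrict the scheme.
    $url = filter_var($candidate, FILTER_VALIDATE_URL);
    if ($url === false) {
        return null;
    }

    // Step 2: explicit scheme allow-list. parse_url() can return null or
    // false for the component, so confirm it is a string first.
    $scheme = parse_url($url, PHP_URL_SCHEME);
    if (!is_string($scheme) || !in_array(strtolower($scheme), $allowedSchemes, true)) {
        return null;
    }

    return $url;
}
```

A caller would treat a null result as a rejected URL, for example falling back to a fixed default redirect target instead of echoing the user-supplied value.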
Common Misconception
Why It Matters
Common Mistakes
- Using FILTER_SANITIZE_* and treating the output as validated input — sanitisation removes characters, it does not validate semantics.
- Using FILTER_VALIDATE_EMAIL and treating a valid result as deliverable — it validates format, not existence.
- Not passing min_range/max_range options to FILTER_VALIDATE_INT, so the value validates as an integer but can still be negative or far larger than intended.
- Relying on FILTER_VALIDATE_URL alone in security contexts, even though it accepts javascript: and data: style URLs, which are dangerous in redirects and attributes.
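The integer-range mistake can be sketched as follows. This is an illustrative example only: the helper name parseBoundedInt is hypothetical, and the bounds are whatever the calling code considers sensible. It also shows why the result should be compared strictly with false, since a valid 0 is falsy.

```php
<?php

/**
 * Hypothetical helper (not part of PHP): parse an integer within
 * [$min, $max], returning null when the input is not acceptable.
 *
 * @param mixed $raw Untrusted input, typically a string from a request.
 */
function parseBoundedInt($raw, int $min, int $max): ?int
{
    $value = filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => $min, 'max_range' => $max],
    ]);

    // Compare strictly with false: a valid 0 is falsy and would be
    // wrongly rejected by a loose check such as `if (!$value)`.
    return $value === false ? null : $value;
}
```

With this shape, a call such as parseBoundedInt($_GET['qty'] ?? null, 1, 100) yields either an int inside the range or null, so the caller has a single failure case to handle.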
Code Examples
// Mistake: sanitise, then use without validating
$email = filter_var($_POST['email'] ?? '', FILTER_SANITIZE_EMAIL);
sendEmail($email); // Sanitised, but may still not be a valid address
// Better: validate before use
if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
    throw new InvalidArgumentException('...');
}
// Validate
$email = filter_var($_POST['email'], FILTER_VALIDATE_EMAIL);
if ($email === false) { throw new \InvalidArgumentException('Invalid email'); }
$url = filter_var($_POST['url'], FILTER_VALIDATE_URL);
$ip = filter_var($_SERVER['REMOTE_ADDR'], FILTER_VALIDATE_IP, FILTER_FLAG_IPV4);
$int = filter_var($_GET['page'], FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]);
// Sanitise (removes dangerous chars — less reliable than allow-listing)
$clean = filter_var($input, FILTER_SANITIZE_SPECIAL_CHARS);
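// filter_input reads the raw request input; it returns null if the key is missing
// and false if validation fails, so ?? alone would let false through
$page = filter_input(INPUT_GET, 'page', FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]) ?: 1;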