XML External Entity (XXE)
debt(d5/e3/b3/t7)
Closest to 'specialist tool catches it' (d5). The detection_hints list semgrep and psalm as tools, both specialist SAST tools that catch the pattern of simplexml_load_string or DOMDocument->loadXML without proper entity disabling. This is not caught by the compiler or a default linter, but is reachable by deliberate SAST integration.
Closest to 'simple parameterised fix' (e3). The quick_fix is 'Set libxml_disable_entity_loader(true) and LIBXML_NOENT off; use JSON instead of XML where possible'. While the core fix is a one-liner, the common_mistakes note that every XML parsing site across the codebase needs the same treatment — REST endpoints, SimpleXML calls, DOMDocument calls — making it a small but recurring pattern-replacement fix rather than a single-line patch.
Closest to 'localised tax' (b3). The applies_to scope is web and CLI PHP contexts, but the burden is limited to XML parsing sites. It imposes a configuration tax wherever XML is parsed, but it does not shape the broader architecture. Teams that adopt a centralised XML parsing wrapper reduce this further, so it stays at b3.
Closest to 'serious trap' (t7). The misconception field explicitly states that disabling DOCTYPE declarations is believed to fully prevent XXE but does not — parameter entities can bypass this, and many XML libraries enable external entities by default. A common_mistake compounds this: LIBXML_NOENT is widely misread as 'no entities' when it actually substitutes them. These two counter-intuitive behaviours together mean a competent developer following apparent best practice can still be vulnerable, warranting t7.
Also Known As
TL;DR
Explanation
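XXE exploits XML parsers that resolve external entity declarations: specially crafted XML that references system files (<!ENTITY xxe SYSTEM "file:///etc/passwd">) or internal network resources. Consequences include arbitrary file read, SSRF, and in some configurations denial of service via billion-laughs recursive entity expansion. In PHP before 8.0, call libxml_disable_entity_loader(true) before parsing; the function is deprecated from PHP 8.0, which requires libxml 2.9+ where external entity loading is off by default. On every version, do not pass LIBXML_NOENT or LIBXML_DTDLOAD when parsing untrusted input. LIBXML_NONET only blocks network fetches, so it limits SSRF but does not by itself stop file:// reads.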
How It's Exploited
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<data>&xxe;</data>
# Server returns /etc/passwd contents in the response
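To see what the parser is actually handed, the payload above can be loaded with PHP's default flags and its DOCTYPE inspected. This is a minimal sketch, assuming PHP 8.0+ with the DOM extension; the variable names are illustrative. With default flags the entity is declared in the document but its target is not fetched, so nothing is read from disk here.

```php
<?php
// The payload from above, as an attacker would submit it.
$payload = <<<'XML'
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<data>&xxe;</data>
XML;

$doc = new DOMDocument();
// Default flags: the internal DTD subset is parsed, but LIBXML_NOENT is
// not passed, so the &xxe; reference is left unsubstituted.
$doc->loadXML($payload);

// The declared entity and the URI it points at are visible on the doctype.
$entity = $doc->doctype->entities->getNamedItem('xxe');
$systemId = $entity->systemId;

echo $systemId, PHP_EOL;
```

The exploit only succeeds when the parser goes one step further and substitutes the reference with the contents of that URI, which is what the settings discussed in this entry control.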
Common Misconception
Why It Matters
Common Mistakes
- Not disabling external entity loading before parsing XML on PHP < 8.0, where libxml_disable_entity_loader(true) is required.
- Using SimpleXML or DOMDocument::loadXML() on untrusted input without checking which entity-processing flags are passed.
- Overlooking API endpoints: REST endpoints that accept application/xml need the same protections.
- Passing LIBXML_NOENT in the belief that it means "no entities". The flag turns entity substitution on, which is exactly what makes XXE exploitable.
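The LIBXML_NOENT mistake in the list above is easy to check for yourself. The sketch below uses a harmless internal entity so it touches no files or network; firstChildClass() is a hypothetical helper written for this illustration, and PHP with the DOM extension is assumed.

```php
<?php
// A harmless internal entity stands in for an attacker-controlled external one.
$payload = '<!DOCTYPE d [<!ENTITY greet "hello">]><d>&greet;</d>';

// Hypothetical helper: report the class of the root element's first child.
function firstChildClass(string $xml, int $flags): string
{
    $doc = new DOMDocument();
    $doc->loadXML($xml, $flags);
    return get_class($doc->documentElement->firstChild);
}

// No flags: the reference stays in the tree as an unexpanded reference node.
$withoutFlag = firstChildClass($payload, 0);

// LIBXML_NOENT: the reference is replaced by the entity's content.
$withFlag = firstChildClass($payload, LIBXML_NOENT);

echo "default:      {$withoutFlag}", PHP_EOL;
echo "LIBXML_NOENT: {$withFlag}", PHP_EOL;
```

If the second call yields a plain text node where the first yields an entity reference node, the flag has substituted the entity. With an external entity in place of the internal one, that substitution is the file read or outbound request.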
Avoid When
- Never parse user-supplied XML with external entity loading enabled. Even internal XML from APIs can be compromised.
- Do not use SimpleXML on untrusted input without first confirming that external entities are disabled.
When To Use
- Disable external entity loading whenever parsing XML from any untrusted source.
- Call libxml_disable_entity_loader(true) on PHP < 8.0, and on every version audit each XML parsing call to confirm that LIBXML_NOENT is not passed.
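One way to apply these rules consistently is to route all untrusted XML through a single function. The following is a sketch only: parseUntrustedXml() is a hypothetical name, and the choice to throw InvalidArgumentException on malformed input is an assumption rather than an established convention.

```php
<?php
/**
 * Hypothetical helper: parse untrusted XML with conservative libxml settings.
 */
function parseUntrustedXml(string $xml): DOMDocument
{
    if ($xml === '') {
        throw new InvalidArgumentException('Empty XML input');
    }

    // libxml_disable_entity_loader() is deprecated as of PHP 8.0,
    // so it is only called on older runtimes.
    $previousLoader = null;
    if (PHP_VERSION_ID < 80000) {
        $previousLoader = libxml_disable_entity_loader(true);
    }

    // Collect parse errors internally instead of emitting warnings.
    $previousErrors = libxml_use_internal_errors(true);

    try {
        $doc = new DOMDocument();
        // LIBXML_NONET blocks network fetches. LIBXML_NOENT and
        // LIBXML_DTDLOAD are deliberately absent.
        if (!$doc->loadXML($xml, LIBXML_NONET)) {
            throw new InvalidArgumentException('Malformed XML');
        }
        return $doc;
    } finally {
        // Restore global libxml state so other code is unaffected.
        libxml_clear_errors();
        libxml_use_internal_errors($previousErrors);
        if ($previousLoader !== null) {
            libxml_disable_entity_loader($previousLoader);
        }
    }
}

$doc = parseUntrustedXml('<order><id>42</id></order>');
echo $doc->documentElement->nodeName, PHP_EOL;
```

Keeping the flags in one place means a code review or SAST rule only has to flag direct calls to loadXML() or simplexml_load_string() outside this helper.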
Code Examples
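// No explicit protection: external entities load on old libxml (< 2.9)
// or whenever LIBXML_NOENT is passed, giving file disclosure or SSRF
$xml = simplexml_load_string($userInput);
// Disable external entity loading before parsing (PHP < 8.0 only;
// the function is deprecated in 8.0+, where loading is off by default)
if (PHP_VERSION_ID < 80000) {
    libxml_disable_entity_loader(true);
}
$xml = new DOMDocument();
// LIBXML_NONET blocks network access; do not add LIBXML_NOENT,
// which enables entity substitution
$xml->loadXML($userInput, LIBXML_NONET);
// Or use SimpleXML with the same flag
$xml = simplexml_load_string($userInput, 'SimpleXMLElement', LIBXML_NONET);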