
htmlspecialchars()

PHP PHP 5.0+ Beginner
debt(d5/e1/b3/t5)
d5 Detectability Operational debt — how invisible misuse is to your safety net

Closest to 'specialist tool catches it' (d5). The detection_hints list semgrep, psalm, and phpstan — all specialist SAST/static-analysis tools. The code_pattern explicitly identifies missing ENT_QUOTES and missing charset as detectable patterns, but these are not caught by a default linter or compiler; they require dedicated static analysis configuration.

e1 Effort Remediation debt — work required to fix once spotted

Closest to 'one-line patch or single-call swap' (e1). The quick_fix is literally a single-call replacement: swap htmlspecialchars($var) with htmlspecialchars($var, ENT_QUOTES | ENT_HTML5, 'UTF-8'). Each misuse site is an independent one-line fix.

b3 Burden Structural debt — long-term weight of choosing wrong

Closest to 'localised tax' (b3). The concept applies to web PHP contexts only (applies_to: web) and each call site is independent. While it must be applied consistently across all output points, it does not impose cross-cutting architectural weight — it is a per-output-call discipline rather than a structural commitment that shapes the codebase.

t5 Trap Cognitive debt — how counter-intuitive correct behaviour is

Closest to 'notable trap — a documented gotcha most devs eventually learn' (t5). The misconception field directly states the trap: calling htmlspecialchars() without ENT_QUOTES feels complete but leaves single-quoted attributes exploitable. This is a documented, well-known gotcha that many PHP developers encounter and learn the hard way, but it does not fully contradict behaviour from another ecosystem — it is an under-specification trap rather than a contradiction.


Also Known As

htmlspecialchars() HTML escaping PHP XSS output encoding

TL;DR

Converts HTML special characters to entities — the primary defence against XSS in HTML output contexts.

Explanation

htmlspecialchars($string, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') converts <, >, &, ", and ' to their HTML entity equivalents, preventing injected text from being interpreted as HTML or JavaScript. ENT_QUOTES encodes both single and double quotes. ENT_SUBSTITUTE (available since PHP 5.4, and part of the default flags since PHP 8.1) replaces invalid UTF-8 sequences with the Unicode replacement character U+FFFD instead of returning an empty string. Always specify the charset explicitly. This function is for HTML body and quoted attribute contexts only; different escaping is needed for JavaScript, CSS, and URLs.
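As a minimal sketch of the behaviour described above, the following shows the five characters being converted and the effect of ENT_SUBSTITUTE on an invalid byte. The variable names and sample strings are illustrative assumptions, not part of the original entry.

```php
<?php
declare(strict_types=1);

$flags = ENT_QUOTES | ENT_SUBSTITUTE;

// All five special characters appear in this sample string.
$raw = '<a href="x" title=\'y\'>Tom & Jerry</a>';
$escaped = htmlspecialchars($raw, $flags, 'UTF-8');
echo $escaped, PHP_EOL;
// &lt;a href=&quot;x&quot; title=&#039;y&#039;&gt;Tom &amp; Jerry&lt;/a&gt;

// "\xE9" is "é" in ISO-8859-1 but an incomplete sequence in UTF-8.
$invalid = "caf\xE9";

// Without ENT_SUBSTITUTE the whole result collapses to an empty string.
$withoutSubstitute = htmlspecialchars($invalid, ENT_QUOTES, 'UTF-8');

// With ENT_SUBSTITUTE the bad byte is replaced by U+FFFD and the rest survives.
$withSubstitute = htmlspecialchars($invalid, $flags, 'UTF-8');

var_dump($withoutSubstitute === '');
var_dump($withSubstitute === "caf\u{FFFD}");
```

The single quote is emitted as &#039; here because no document-type flag is given, so the HTML 4.01 default applies; with ENT_HTML5 it would be &apos; instead.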

Common Misconception

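The misconception is that htmlspecialchars() with no flags is safe for all HTML contexts. Before PHP 8.1 the default flag was ENT_COMPAT, which leaves single quotes unescaped, so an attacker can break out of single-quoted HTML attributes. PHP 8.1 changed the default to ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401, but code that may run on older versions still needs the flags spelled out. Always use htmlspecialchars($val, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8').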
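The breakout can be reproduced on any PHP version by passing ENT_COMPAT explicitly, which escapes double quotes only. This is a sketch with a hypothetical payload and variable names chosen for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical attacker-controlled value.
$payload = "x' onmouseover='alert(1)";

// ENT_COMPAT escapes double quotes but leaves single quotes alone.
$weak = htmlspecialchars($payload, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES escapes both kinds of quote.
$strong = htmlspecialchars($payload, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

// The attribute below is delimited by single quotes.
$weakHtml = "<input value='" . $weak . "'>";
$strongHtml = "<input value='" . $strong . "'>";

echo $weakHtml, PHP_EOL;   // <input value='x' onmouseover='alert(1)'>
echo $strongHtml, PHP_EOL; // <input value='x&#039; onmouseover=&#039;alert(1)'>
```

In the first line of output the payload has closed the value attribute and injected an event handler; in the second it stays inside the attribute as inert text.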

Why It Matters

With ENT_QUOTES, htmlspecialchars() converts the five HTML special characters to entities. It is the primary defence against XSS, whether reflected or stored, when outputting user-controlled data into an HTML context.

Common Mistakes

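  • Forgetting the ENT_QUOTES flag: before PHP 8.1 the default leaves single quotes unescaped, enabling injection in single-quoted attributes.
  • Not specifying the charset: before PHP 5.4 the default was ISO-8859-1, and since PHP 5.6 it follows the default_charset ini setting, so a mismatch with the page encoding can let multi-byte sequences slip past escaping.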
  • Using htmlspecialchars() in non-HTML contexts (JavaScript, CSS, URLs) — each context requires different escaping.
  • Using strip_tags() instead — it removes tags but attribute-based XSS (onerror=) survives in allowed tags.
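For the non-HTML contexts mentioned above, a rough sketch of what context-appropriate encoding can look like follows. The choice of json_encode() with the JSON_HEX_* flags for inline JavaScript and rawurlencode() for URL components is one common approach assumed here, not something the entry prescribes; the sample values are hypothetical.

```php
<?php
declare(strict_types=1);

// Hypothetical hostile value destined for an inline <script> block.
$name = "</script><script>alert('x')</script>";

// JavaScript context: JSON-encode with the JSON_HEX_* flags so that
// <, >, &, ' and " are emitted as \uXXXX escapes instead of literally.
$js = json_encode(
    $name,
    JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_THROW_ON_ERROR
);

// URL component context: percent-encode the value first,
// then HTML-escape the finished URL for the attribute it sits in.
$query = 'a&b=c d';
$href = '/search?q=' . rawurlencode($query);
$link = '<a href="' . htmlspecialchars($href, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') . '">search</a>';

echo '<script>var name = ' . $js . ';</script>', PHP_EOL;
echo $link, PHP_EOL; // <a href="/search?q=a%26b%3Dc%20d">search</a>
```

The two layers in the URL case are deliberate: rawurlencode() protects the query-string syntax, and htmlspecialchars() protects the surrounding HTML attribute.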

Code Examples

✗ Vulnerable
echo '<p>' . $userInput . '</p>'; // XSS if input contains <script>
✓ Fixed
// Always specify ENT_QUOTES and charset
echo '<p>' . htmlspecialchars($userInput, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') . '</p>';

// Helper function — use everywhere user data touches HTML
function e(string $s): string {
    return htmlspecialchars($s, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

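echo '<input value="' . e($_GET['q'] ?? '') . '">';  // safe inside a quoted attribute
echo '<a href="' . e($url) . '">' . e($label) . '</a>'; // escaped, but validate the scheme of $url separately (e.g. reject javascript:)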

// htmlspecialchars_decode() reverses it — use only for internal data, never user input
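
Escaping an href value stops attribute breakout, but it does not stop a javascript: URL from being a valid link target. One possible guard, sketched here with a hypothetical helper named safeHref(), is to allow only absolute http and https URLs before escaping. The allow-list and the decision to reject relative URLs are assumptions made for this illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns an escaped href value for absolute
 * http(s) URLs, or null for anything else.
 */
function safeHref(string $url): ?string
{
    $scheme = parse_url($url, PHP_URL_SCHEME);

    // parse_url() yields null when there is no scheme and false on malformed input.
    if (!is_string($scheme) || !in_array(strtolower($scheme), ['http', 'https'], true)) {
        return null; // relative URLs, javascript:, data: and malformed input are rejected
    }

    return htmlspecialchars($url, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

$good = safeHref('https://example.com/?a=1&b=2');
$bad = safeHref('javascript:alert(1)');

var_dump($good); // string: https://example.com/?a=1&amp;b=2
var_dump($bad);  // NULL
```

A caller would emit the link only when the helper returns a string, and fall back to plain text otherwise.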

Added 15 Mar 2026
Edited 22 Mar 2026
DEV INTEL Tools & Severity
🟠 High ⚙ Fix effort: Low
⚡ Quick Fix
Always use htmlspecialchars($var, ENT_QUOTES | ENT_HTML5, 'UTF-8'). ENT_QUOTES escapes both single and double quotes, and the explicit charset makes the function treat input as UTF-8 regardless of PHP version or ini settings.
📦 Applies To
PHP 5.0+ web
🔍 Detection Hints
echo $var without htmlspecialchars; htmlspecialchars without ENT_QUOTES; htmlspecialchars without charset parameter
Auto-detectable: ✓ Yes semgrep psalm phpstan
🤖 AI Agent
Confidence: Medium False Positives: Medium ✓ Auto-fixable Fix: Low Context: Line
CWE-79

