Entropy
debt(d5/e3/b3/t7)
Closest to 'specialist tool catches it' (d5). The detection_hints list semgrep and psalm, and the code_pattern targets specific misuses like mt_srand(time()) or rand() for cryptographic material — these are not caught by default linting but require configured SAST rules or specialist static analysis tooling.
Closest to 'simple parameterised fix' (e3). The quick_fix is a direct swap — replace mt_rand/time-based seeds with random_bytes(32). It's more than a one-line patch because all call sites generating tokens/secrets must be located and replaced, but it doesn't span architectural boundaries.
Closest to 'localised tax' (b3). The concept applies to web and CLI contexts where security-sensitive token generation occurs, but it's a bounded concern — only code paths that generate secrets, tokens, or passwords are affected. The rest of the codebase is unaffected.
Closest to 'serious trap' (t7). The misconception field states the canonical wrong belief directly: 'a long password always has high entropy.' The intuition that length equals strength fails here in a security-critical way. The common mistakes compound it: developers measure entropy in characters rather than bits, and URL-safe encoding silently reduces entropy per character, so several misconceptions cut against how developers usually reason about string length and randomness.
Also Known As
TL;DR
Explanation
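In the context of security tokens, entropy measures how unpredictable a value is, expressed in bits: a token chosen uniformly from N possible values has log2(N) bits of entropy, and an attacker may need up to N guesses to find it. bin2hex(random_bytes(32)) has 256 bits of entropy, i.e. 2²⁵⁶ possible values. md5(time()) has only about 17 bits if the attacker knows the day the token was created (86,400 seconds per day, rounded up to a power of 2), which is trivially enumerable; hashing the timestamp does not add entropy. Each additional bit doubles the number of guesses required. For security tokens, at least 128 bits of entropy from a CSPRNG is the commonly recommended minimum today.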
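The arithmetic above can be checked with a short sketch. The helper name entropyBits is hypothetical (not a PHP built-in), and it assumes every candidate value is equally likely, which is the best case for the defender.

```php
<?php
/**
 * Entropy in bits of a value drawn uniformly from $possibleValues options.
 * Hypothetical helper for illustration; assumes a uniform distribution.
 */
function entropyBits(float $possibleValues): float
{
    return log($possibleValues, 2);
}

// A timestamp known to fall within one day: 86,400 candidate values.
$timestampBits = entropyBits(86400);

// 32 bytes from the CSPRNG carry 8 bits each.
$tokenBits = 32 * 8;

printf("timestamp within a day: %.1f bits\n", $timestampBits); // 16.4 bits
printf("random_bytes(32):       %d bits\n", $tokenBits);       // 256 bits
```

Doubling the candidate set, for example widening the window to two days, adds exactly one bit, which is why guessable inputs never reach token-grade entropy.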
Common Misconception
Why It Matters
Common Mistakes
- Using time-based seeds for token generation: timestamps are guessable within a small window.
- Short tokens that reduce entropy even with a CSPRNG: 4 hex chars = 16 bits = 65,536 possibilities.
- Not understanding that URL-safe encoding reduces entropy per character: a hex character carries 4 bits and a base64url character 6, versus 8 bits per raw byte, so size the token in raw bytes rather than output characters.
- Measuring entropy in characters rather than bits: 16 hex chars = only 64 bits of entropy.
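One way to avoid the last three mistakes is to size tokens in bits and only then encode them. The following is a minimal sketch; urlSafeToken and maxBitsForLength are hypothetical helper names, and the base64url conversion shown is one common approach rather than a prescribed one.

```php
<?php
/**
 * Hypothetical helper: URL-safe token carrying $bits bits of CSPRNG entropy.
 */
function urlSafeToken(int $bits = 128): string
{
    if ($bits < 8 || $bits % 8 !== 0) {
        throw new InvalidArgumentException('bits must be a positive multiple of 8');
    }
    $raw = random_bytes(intdiv($bits, 8));
    // base64url: swap '+' and '/' for '-' and '_', then drop '=' padding.
    return rtrim(strtr(base64_encode($raw), '+/', '-_'), '=');
}

/**
 * Upper bound on entropy for $length characters over $alphabetSize symbols.
 */
function maxBitsForLength(int $length, int $alphabetSize): float
{
    return $length * log($alphabetSize, 2);
}

$token = urlSafeToken(128);               // 16 raw bytes -> 22 characters
$hex16 = maxBitsForLength(16, 16);        // 16 hex chars hold at most 64 bits
$hex4  = maxBitsForLength(4, 16);         // 4 hex chars hold at most 16 bits
```

The entropy comes from the raw bytes; the encoding only changes how many characters are needed to carry it.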
Code Examples
// Low entropy token — guessable:
$token = dechex(time()); // Timestamp in hex — attacker knows approximate value
$token = substr(md5(rand()), 0, 8); // 8 hex chars = 32 bits — brute-forceable
// High entropy:
$token = bin2hex(random_bytes(32)); // 256 bits — not feasibly guessable
// High entropy = unpredictable; low entropy = guessable
// Low entropy: bad for secrets
$token = mt_rand(); // At most 31 bits of output, derived from a 32-bit seed: predictable
$token = time() . rand(); // Guessable timestamp plus a non-cryptographic PRNG: still predictable
// High entropy: cryptographically secure
$token = random_bytes(32); // 256 bits from the OS CSPRNG (raw binary string)
$hex = bin2hex($token); // 64-character hex string, safe to print or store as text
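// Measure password strength by entropy (figures assume every character or word is chosen at random):
// 8-char lowercase only: ~37.6 bits, weak
// 12-char mixed case+digits+symbols (94 printable ASCII): ~78.7 bits, strong
// 4-word passphrase (diceware, 7,776-word list): ~51.7 bits, usable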
// Entropy in UUIDs:
// UUID v4: 122 bits — suitable for public IDs
// UUID v7: 74 bits random + time-ordered — good for DB primary keys
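The password figures above follow from length × log2(alphabet size). A small sketch to reproduce them, assuming each character or word is picked independently and uniformly at random (human-chosen passwords will score lower); randomPasswordBits is a hypothetical helper name.

```php
<?php
/**
 * Entropy in bits of a secret made of $length symbols, each drawn
 * uniformly at random from $alphabetSize options. Hypothetical helper.
 */
function randomPasswordBits(int $length, int $alphabetSize): float
{
    return $length * log($alphabetSize, 2);
}

printf("%.1f\n", randomPasswordBits(8, 26));   // 8 lowercase letters: 37.6
printf("%.1f\n", randomPasswordBits(12, 94));  // 12 printable ASCII chars: 78.7
printf("%.1f\n", randomPasswordBits(4, 7776)); // 4 diceware words: 51.7
```

The same formula shows why a long but patterned password can still be weak: length only counts when every symbol is an independent random choice.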