
Key String Functions (str_contains, str_starts_with, str_ends_with…)

php PHP 5.0+ Beginner
debt(d5/e3/b5/t7)
d5 Detectability Operational debt — how invisible misuse is to your safety net

Closest to 'specialist tool catches it' (d5). The detection_hints list phpstan and phpcs as the tools that catch multibyte misuse patterns (strlen() on user text, strtolower() on international names). These are not default linters but specialist static analysis tools that must be configured and run deliberately — so d5 fits squarely.

e3 Effort Remediation debt — work required to fix once spotted

Closest to 'simple parameterised fix' (e3). The quick_fix is a direct function-name swap pattern: replace strlen() with mb_strlen(), strtolower() with mb_strtolower(), substr() with mb_substr(). Each replacement is mechanical but must be applied across multiple call sites in a codebase wherever user-facing text is handled — slightly more than a single one-line patch but well within one component or a search-replace pass.

b5 Burden Structural debt — long-term weight of choosing wrong

Closest to 'persistent productivity tax' (b5). The misconception affects all three contexts listed (web, cli, queue-worker) and the functions are used pervasively throughout any PHP application. Every developer touching string handling must remember the mb_ discipline, and any code review touching user-facing text carries this ongoing cognitive tax. It doesn't reshape architecture (not b7/b9) but it does slow down multiple work streams.

t7 Trap Cognitive debt — how counter-intuitive correct behaviour is

Closest to 'serious trap' (t7). The misconception field states the canonical wrong belief explicitly: developers assume built-in str_ functions handle multibyte strings correctly, but strlen('café') returns 5 not 4. This contradicts how string functions work in many other languages where string length means character count. The strpos() returning 0 gotcha (check !== false not truthiness) is an additional well-known trap layered on top, compounding the cognitive debt.


Also Known As

PHP string functions str_contains str_starts_with mb_string

TL;DR

PHP 8.0 introduced readable string helpers that replace strpos() !== false idioms for common substring checks.

Explanation

PHP 8.0 added str_contains(), str_starts_with(), and str_ends_with(): boolean functions that replace the error-prone strpos($hay, $needle) !== false pattern. strpos() returns 0 for a match at position 0, and 0 is falsy, so a loose check such as if (strpos($hay, $needle)) wrongly treats that match as "not found". Other essential string functions include the mb_* variants for multibyte safety (mb_strlen, mb_strtolower), sprintf() for formatted output, trim/ltrim/rtrim for whitespace, explode/implode for splitting and joining, and str_replace/preg_replace for substitution. Use the mb_* functions for length, case conversion, and slicing whenever user input may contain multibyte characters.
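The position-0 pitfall is easiest to see side by side with the PHP 8.0 helpers. The following is a minimal sketch, assuming PHP 8.0 or later; the URL and variable names are illustrative only.

```php
<?php
$url = 'https://example.com/page';

// strpos() returns int 0 for a match at the very start, and 0 is falsy.
$pos = strpos($url, 'https');

// Loose check: the match at offset 0 is misread as "not found".
$looseCheck = (bool) $pos;

// Strict check: the correct pre-8.0 idiom.
$strictCheck = $pos !== false;

// PHP 8.0 helpers return a plain bool, so there is nothing to misread.
$contains = str_contains($url, 'https');
$prefix   = str_starts_with($url, 'https://');
$suffix   = str_ends_with($url, '/page');

var_dump($looseCheck, $strictCheck, $contains, $prefix, $suffix);
```

Only the loose check reports false here; the strict check and all three helpers report true for the same input.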

Common Misconception

Misconception: PHP's built-in string functions handle multibyte strings correctly. In fact, most of them operate on bytes, not characters: strlen("café") returns 5, not 4, on UTF-8 input. Use the mb_ equivalents (mb_strlen, mb_substr) for any user-facing string handling.

Why It Matters

PHP has over 100 string functions with subtle differences in argument order, return values, and multibyte behaviour — using the wrong function or ignoring multibyte variants causes data corruption with non-ASCII content.

Common Mistakes

  • Using strlen() instead of mb_strlen() for multibyte strings — strlen() counts bytes, not characters.
  • Using strtolower() on Unicode strings — use mb_strtolower() with explicit encoding.
  • Forgetting that strpos() returns 0 for a match at the start — check with !== false, not if($pos).
  • Using substr() to split multibyte strings — use mb_substr() to avoid splitting multi-byte characters.
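The byte-versus-character mistakes in this list can be reproduced in a few lines. This is a sketch assuming PHP 7.0 or later with the mbstring extension loaded; the sample name is illustrative and is written with a \u{...} escape so the result does not depend on how the source file is saved.

```php
<?php
$name = "\u{00DC}nal"; // 'Ünal'; 'Ü' occupies 2 bytes in UTF-8

$bytes = strlen($name);             // counts bytes
$chars = mb_strlen($name, 'UTF-8'); // counts characters

// substr() cuts after the first byte, leaving half of 'Ü' behind.
$broken        = substr($name, 0, 1);
$brokenIsValid = mb_check_encoding($broken, 'UTF-8');

// mb_substr() cuts on character boundaries.
$first = mb_substr($name, 0, 1, 'UTF-8');

// mb_strtolower() with an explicit encoding lowercases 'Ü' to 'ü'.
$lower = mb_strtolower($name, 'UTF-8');

var_dump($bytes, $chars, $brokenIsValid, $first, $lower);
```

The substr() result is not valid UTF-8 on its own, which is how truncated previews and database fields end up with replacement characters or rejected writes.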

Code Examples

✗ Vulnerable
// Multibyte string handled with byte functions:
$str = 'Héllo';
echo strlen($str);   // 6 — counts bytes, not chars ('é' is 2 bytes in UTF-8)
echo strtoupper($str); // 'HéLLO' — é not uppercased

// Correct:
echo mb_strlen($str, 'UTF-8');    // 5 — character count
echo mb_strtoupper($str, 'UTF-8'); // 'HÉLLO'
✓ Fixed
// PHP 8.0 str_contains / str_starts_with / str_ends_with
if (str_contains($haystack, 'needle')) {}
if (str_starts_with($url, 'https://')) {}
if (str_ends_with($file, '.php')) {}

// Multi-byte safe functions (always use for user text)
$len  = mb_strlen($str, 'UTF-8');
$up   = mb_strtoupper($str, 'UTF-8');
$sub  = mb_substr($str, 0, 100, 'UTF-8');
$pos  = mb_strpos($str, 'needle', 0, 'UTF-8');

// Padding, trimming
$padded  = str_pad('42', 5, '0', STR_PAD_LEFT); // '00042'
$trimmed = trim($str, " \t\n\r\0\x0B"); // same as the default trim chars (note the leading space)

// sprintf for formatted strings (for SQL, prefer prepared statements with bound parameters)
$sql  = sprintf('WHERE id = %d AND name = %s', $id, $pdo->quote($name));
$msg  = sprintf('Welcome, %s! You have %d messages.', $name, $count);

Added 15 Mar 2026
Edited 22 Mar 2026
DEV INTEL Tools & Severity
🟡 Medium ⚙ Fix effort: Low
⚡ Quick Fix
Use mb_* variants (mb_strlen, mb_strtolower, mb_substr) for any user-facing text — plain strlen() counts bytes not characters and breaks silently on multibyte UTF-8 input
📦 Applies To
PHP 5.0+ web cli queue-worker
🔗 Prerequisites
🔍 Detection Hints
strlen() on user-submitted text; strtolower() on international names; substr() cutting in the middle of a multibyte character
Auto-detectable: ✓ Yes phpstan phpcs
⚠ Related Problems
🤖 AI Agent
Confidence: High False Positives: Medium ✓ Auto-fixable Fix: Low Context: Line Tests: Update
