Key String Functions (str_contains, str_starts_with, str_ends_with…)

php PHP 5.0+ Beginner

debt(d5/e3/b5/t7)

d5 Detectability Operational debt — how invisible misuse is to your safety net

Closest to 'specialist tool catches it' (d5). The detection_hints list phpstan and phpcs as the tools that catch multibyte misuse patterns (strlen() on user text, strtolower() on international names). These are not default linters but specialist static analysis tools that must be configured and run deliberately — so d5 fits squarely.

e3 Effort Remediation debt — work required to fix once spotted

Closest to 'simple parameterised fix' (e3). The quick_fix is a direct function-name swap pattern: replace strlen() with mb_strlen(), strtolower() with mb_strtolower(), substr() with mb_substr(). Each replacement is mechanical but must be applied across multiple call sites in a codebase wherever user-facing text is handled — slightly more than a single one-line patch but well within one component or a search-replace pass.

b5 Burden Structural debt — long-term weight of choosing wrong

Closest to 'persistent productivity tax' (b5). The misconception affects all three contexts listed (web, cli, queue-worker) and the functions are used pervasively throughout any PHP application. Every developer touching string handling must remember the mb_ discipline, and any code review touching user-facing text carries this ongoing cognitive tax. It doesn't reshape architecture (not b7/b9) but it does slow down multiple work streams.

t7 Trap Cognitive debt — how counter-intuitive correct behaviour is

Closest to 'serious trap' (t7). The misconception field states the canonical wrong belief explicitly: developers assume built-in str_ functions handle multibyte strings correctly, but strlen('café') returns 5 not 4. This contradicts how string functions work in many other languages where string length means character count. The strpos() returning 0 gotcha (check !== false not truthiness) is an additional well-known trap layered on top, compounding the cognitive debt.

Scored by claude-sonnet-4-6 · 2026-05-11 · reviewed by human

Also Known As

PHP string functions, str_contains, str_starts_with, mb_string

TL;DR

PHP 8.0 introduced readable string helpers that replace strpos() !== false idioms for common substring checks.

Explanation

PHP 8.0 added str_contains(), str_starts_with(), and str_ends_with(): boolean functions that replace the error-prone strpos($hay, $needle) !== false pattern and its substr()-based relatives. The pitfall with strpos() is that a match at position 0 returns the integer 0, which is falsy, so a loose check such as if (strpos($hay, $needle)) treats a leading match as no match. Other essential string functions include the mb_* variants for multibyte safety (mb_strlen, mb_strtolower; these require the mbstring extension), sprintf() for formatted output, trim/ltrim/rtrim for whitespace, explode/implode for splitting and joining, and str_replace/preg_replace for substitution. Use mb_* functions whenever you count, slice, or change the case of user input that may contain multibyte characters.
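
The position-0 pitfall and the PHP 8.0 helpers can be compared side by side. This is a minimal sketch, assuming PHP 8.0 or later; the $url value is a made-up example, not taken from any real codebase.

```php
<?php
// Hypothetical input: the needle sits at position 0 of the haystack.
$url = 'https://example.com/index.php';

$pos = strpos($url, 'https://');   // int(0): a match at the very start

// Loose check: 0 is falsy, so the match is wrongly treated as "not found".
$looseFound = (bool) $pos;

// Strict check: only a literal false means "not found".
$strictFound = ($pos !== false);

// PHP 8.0+ helpers return a plain bool, so no comparison is needed.
$startsWith = str_starts_with($url, 'https://');
$contains   = str_contains($url, 'example.com');
$endsWith   = str_ends_with($url, '.php');

var_dump($looseFound);   // bool(false)
var_dump($strictFound);  // bool(true)
var_dump($startsWith, $contains, $endsWith); // bool(true) three times
```

All three helpers compare bytes, which is sufficient for substring checks on valid UTF-8 because a needle either matches byte for byte or not at all.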

Common Misconception

✗ PHP's built-in string functions handle multibyte strings correctly. Most str_ functions operate on bytes, not characters — strlen("café") returns 5 not 4 on UTF-8 input. Always use the mb_ equivalents (mb_strlen, mb_substr) for any user-facing string handling.

Why It Matters

PHP has over 100 string functions with subtle differences in argument order, return values, and multibyte behaviour — using the wrong function or ignoring multibyte variants causes data corruption with non-ASCII content.

Common Mistakes

Using strlen() instead of mb_strlen() for multibyte strings — strlen() counts bytes, not characters.
Using strtolower() on Unicode strings — use mb_strtolower() with explicit encoding.
Forgetting that strpos() returns 0 for a match at the start — check with !== false, not if($pos).
Using substr() to split multibyte strings — use mb_substr() to avoid splitting multi-byte characters.
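
The substr() mistake in the list above can be made visible by checking whether the slice is still valid UTF-8. A small sketch, assuming the mbstring extension is loaded and the source file is saved as UTF-8; $name is a hypothetical input value.

```php
<?php
// Hypothetical user input; 'ë' and 'ü' each take 2 bytes in UTF-8.
$name = 'Zoë Müller';

// Byte-based slice: the third byte is only the first half of 'ë'.
$byteCut = substr($name, 0, 3);

// Character-based slice: three whole characters.
$charCut = mb_substr($name, 0, 3, 'UTF-8');

var_dump(mb_check_encoding($byteCut, 'UTF-8')); // bool(false): broken sequence
var_dump(mb_check_encoding($charCut, 'UTF-8')); // bool(true)
var_dump($charCut === 'Zoë');                   // bool(true)

// Byte count versus character count for the same string.
var_dump(strlen($name));             // int(12)
var_dump(mb_strlen($name, 'UTF-8')); // int(10)
```

A broken slice like $byteCut is what typically ends up as a replacement character in HTML output or as an encoding error when the value is passed to json_encode().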

Code Examples

✗ Vulnerable

// Multibyte string handled with byte functions:
$str = 'Héllo';
echo strlen($str), PHP_EOL;     // 6: counts bytes, not chars ('é' is 2 bytes in UTF-8)
echo strtoupper($str), PHP_EOL; // 'HéLLO': é is not uppercased (PHP 8.2+ converts ASCII only; earlier versions depend on the locale)

// Correct (requires the mbstring extension):
echo mb_strlen($str, 'UTF-8'), PHP_EOL;     // 5: character count
echo mb_strtoupper($str, 'UTF-8'), PHP_EOL; // 'HÉLLO'

✓ Fixed

// PHP 8.0 str_contains / str_starts_with / str_ends_with
if (str_contains($haystack, 'needle')) {}
if (str_starts_with($url, 'https://')) {}
if (str_ends_with($file, '.php')) {}

// Multi-byte safe functions (always use for user text)
$len  = mb_strlen($str, 'UTF-8');
$up   = mb_strtoupper($str, 'UTF-8');
$sub  = mb_substr($str, 0, 100, 'UTF-8');
$pos  = mb_strpos($str, 'needle', 0, 'UTF-8');

// Padding, trimming
$padded  = str_pad('42', 5, '0', STR_PAD_LEFT); // '00042'
$trimmed = trim($str, " \t\n\r\0\x0B"); // same as the default character list (note the leading space)

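// sprintf for formatted strings (%d coerces to an integer, %s inserts the value as-is)
$msg  = sprintf('Welcome, %s! You have %d messages.', $name, $count);
// SQL built this way relies entirely on %d and $pdo->quote() for escaping; prefer prepared statements where possible
$sql  = sprintf('WHERE id = %d AND name = %s', $id, $pdo->quote($name));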
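
For code that still has to run on PHP versions before 8.0, the three helpers can be approximated in userland. The following is only a sketch under stated assumptions: it needs PHP 7.0+ for the scalar type declarations, it is guarded with function_exists() so the built-ins win on PHP 8.0+, and it aims to mirror the built-in behaviour of treating an empty needle as a match. A maintained polyfill package is usually a better choice than hand-rolled copies.

```php
<?php
// Fallback definitions; skipped entirely when PHP already provides the functions.
if (!function_exists('str_contains')) {
    function str_contains(string $haystack, string $needle): bool
    {
        // An empty needle counts as contained, matching PHP 8.0 behaviour.
        return $needle === '' || strpos($haystack, $needle) !== false;
    }
}

if (!function_exists('str_starts_with')) {
    function str_starts_with(string $haystack, string $needle): bool
    {
        // Compare only the first strlen($needle) bytes.
        return strncmp($haystack, $needle, strlen($needle)) === 0;
    }
}

if (!function_exists('str_ends_with')) {
    function str_ends_with(string $haystack, string $needle): bool
    {
        if ($needle === '') {
            return true;
        }
        $len = strlen($needle);
        // A negative offset makes substr_compare() start $len bytes from the end.
        return $len <= strlen($haystack)
            && substr_compare($haystack, $needle, -$len) === 0;
    }
}

var_dump(str_contains('invoice-2024.pdf', '2024'));    // bool(true)
var_dump(str_starts_with('invoice-2024.pdf', 'inv'));  // bool(true)
var_dump(str_ends_with('invoice-2024.pdf', '.php'));   // bool(false)
```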

Added 15 Mar 2026

Edited 22 Mar 2026

Curated in Warsaw under one editorial standard. 1,448 terms, single voice.

⚡ DEV INTEL Tools & Severity

🟡 Medium ⚙ Fix effort: Low

⚡ Quick Fix

Use mb_* variants (mb_strlen, mb_strtolower, mb_substr) for any user-facing text — plain strlen() counts bytes not characters and breaks silently on multibyte UTF-8 input

📦 Applies To

PHP 5.0+ web cli queue-worker

🔗 Prerequisites

php string functions, Unicode Fundamentals, PCRE in PHP

🔍 Detection Hints

strlen() on user-submitted text; strtolower() on international names; substr() cutting in the middle of a multibyte character

Auto-detectable: ✓ Yes phpstan phpcs
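
As a rough illustration of what such a check looks for, the sketch below scans PHP source for calls to a handful of byte-oriented functions using the built-in tokenizer. It is not how phpstan or phpcs work internally; the function name findByteStringCalls() and the watch list are hypothetical, and the scan is deliberately crude (it cannot tell user text from ASCII-only data, and it would also flag a method that happens to share one of these names).

```php
<?php
/**
 * Hypothetical helper: list calls to byte-oriented string functions.
 * Returns an array of ['function' => name, 'line' => line number].
 */
function findByteStringCalls(string $source): array
{
    $watched = ['strlen', 'strtolower', 'strtoupper', 'substr', 'strpos'];
    $tokens  = token_get_all($source);
    $hits    = [];

    foreach ($tokens as $i => $token) {
        // Only identifier tokens can be plain function names.
        if (!is_array($token) || $token[0] !== T_STRING) {
            continue;
        }
        $name = strtolower($token[1]);
        if (!in_array($name, $watched, true)) {
            continue;
        }
        // Skip whitespace, then require an opening parenthesis (a call).
        $j = $i + 1;
        while (isset($tokens[$j]) && is_array($tokens[$j]) && $tokens[$j][0] === T_WHITESPACE) {
            $j++;
        }
        if (($tokens[$j] ?? null) === '(') {
            $hits[] = ['function' => $name, 'line' => $token[2]];
        }
    }

    return $hits;
}

// Usage with a made-up snippet: only the plain strlen() call is reported.
$sample = <<<'PHP'
<?php
$len = strlen($name);
$ok  = mb_strlen($name, 'UTF-8');
PHP;

foreach (findByteStringCalls($sample) as $hit) {
    printf("%s() on line %d\n", $hit['function'], $hit['line']); // strlen() on line 2
}
```

Every hit still needs a human decision: strlen() is the right call when a byte count is actually wanted, for example for a Content-Length header.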

⚠ Related Problems

Unicode Fundamentals, Character Encoding, php string functions

🤖 AI Agent

Confidence: High False Positives: Medium ✓ Auto-fixable Fix: Low Context: Line Tests: Update