MySQL charset=utf8mb4
Also Known As
utf8mb4
MySQL UTF-8
emoji MySQL
4-byte unicode MySQL
TL;DR
The correct MySQL character set for full Unicode support — including emoji and supplementary characters that the older utf8 charset cannot store.
Explanation
MySQL's 'utf8' charset (an alias for 'utf8mb3') stores at most 3 bytes per character, so it cannot hold 4-byte UTF-8 sequences: any code point above U+FFFF, which includes emoji, some CJK characters, and some mathematical symbols. 'utf8mb4' is the complete implementation of UTF-8 and covers the full Unicode range. Inserting a 4-byte character into a 'utf8' column fails with an error in strict SQL mode; with strict mode off, the value is truncated at that character and only a warning is raised. The DSN should specify charset=utf8mb4, and the database, table, and column collation should be utf8mb4_unicode_ci or, on MySQL 8+, utf8mb4_0900_ai_ci.
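The 3-byte versus 4-byte distinction can be checked without a database. The following is a minimal sketch, assuming PHP 7.0 or later (for the "\u{...}" escapes and scalar type hints); needsUtf8mb4() is a hypothetical helper name, not part of PHP or MySQL.

```php
<?php
// Hypothetical helper: returns true when $text contains a code point above
// U+FFFF, i.e. a character whose UTF-8 encoding is 4 bytes long.
function needsUtf8mb4(string $text): bool
{
    // The /u modifier makes PCRE treat the pattern and subject as UTF-8.
    return preg_match('/[\x{10000}-\x{10FFFF}]/u', $text) === 1;
}

$samples = [
    'e-acute' => "\u{00E9}",  // U+00E9, 2 bytes in UTF-8
    'euro'    => "\u{20AC}",  // U+20AC, 3 bytes in UTF-8
    'emoji'   => "\u{1F600}", // U+1F600, 4 bytes in UTF-8
];

foreach ($samples as $label => $char) {
    // strlen() counts bytes, not characters.
    printf(
        "%s: %d bytes, needs utf8mb4: %s\n",
        $label,
        strlen($char),
        needsUtf8mb4($char) ? 'yes' : 'no'
    );
}
```

Only the last sample exceeds what a 'utf8' column can store; the first two fit in either charset.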
Watch Out
⚠ MySQL's 'utf8' charset is NOT real UTF-8 — it is a 3-byte subset. Only 'utf8mb4' is full UTF-8.
Common Misconception
✗ MySQL's utf8 charset is the same as UTF-8. It is not — MySQL utf8 is a 3-byte subset. Only utf8mb4 is true UTF-8.
Why It Matters
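Storing emoji or any other 4-byte Unicode character in a utf8 column either fails with an error (strict SQL mode on) or truncates the string at that character (strict mode off). In the second case MySQL raises only a warning, which most applications never check, so the data loss goes unnoticed.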
Common Mistakes
- Specifying charset=utf8 in the DSN — silent truncation of emoji and supplementary characters.
- Mixing utf8 and utf8mb4 columns in the same schema — comparisons and joins between them may force implicit charset conversion (which can prevent index use) or fail with an "Illegal mix of collations" error.
- Forgetting to set utf8mb4 at the connection level even when the table columns are utf8mb4.
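The last two mistakes can be spotted by asking the server what it actually sees. This is a rough sketch, assuming an already open PDO connection to MySQL with exception error mode (the default in PHP 8); connectionCharsets() and findLegacyUtf8Columns() are hypothetical helper names.

```php
<?php
// Hypothetical helpers; $pdo is assumed to be an open PDO connection to MySQL
// that throws exceptions on errors (PDO::ERRMODE_EXCEPTION).

// Returns the three session charset variables that the connection charset controls.
function connectionCharsets(PDO $pdo): array
{
    $sql = 'SELECT @@character_set_client     AS cs_client,
                   @@character_set_connection AS cs_connection,
                   @@character_set_results    AS cs_results';
    return $pdo->query($sql)->fetch(PDO::FETCH_ASSOC);
}

// Lists columns in the current schema still declared with the 3-byte charset.
// Depending on the MySQL version it is reported as 'utf8' or 'utf8mb3'.
function findLegacyUtf8Columns(PDO $pdo): array
{
    $sql = "SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME
              FROM information_schema.COLUMNS
             WHERE TABLE_SCHEMA = DATABASE()
               AND CHARACTER_SET_NAME IN ('utf8', 'utf8mb3')
             ORDER BY TABLE_NAME, ORDINAL_POSITION";
    return $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);
}

// Usage (not executed here):
//   $pdo = new PDO('mysql:host=localhost;dbname=app;charset=utf8mb4', $user, $pass);
//   print_r(connectionCharsets($pdo));    // all three values should be utf8mb4
//   print_r(findLegacyUtf8Columns($pdo)); // ideally an empty array
```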
Avoid When
- Do not use utf8 — it silently truncates or errors on emoji and supplementary Unicode characters.
When To Use
- Always use utf8mb4 for any table that may store user-generated content, names, or multilingual text.
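- Set charset=utf8mb4 in the DSN rather than running SET NAMES afterwards. The DSN setting is applied when the connection is established, so the client library also knows the charset (which can matter for escaping in PDO::quote()), whereas SET NAMES only changes the server-side session.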
Code Examples
✗ Vulnerable
// Wrong: utf8 truncates emoji silently
$pdo = new PDO('mysql:host=localhost;dbname=app;charset=utf8', $user, $pass);
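// Strict SQL mode off: INSERT 'Hello 😀' → stored as 'Hello ' (truncated at the emoji, warning only)
// Strict SQL mode on: the INSERT fails with an "Incorrect string value" error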
✓ Fixed
// Correct: utf8mb4 set in the DSN (no separate SET NAMES query needed)
$pdo = new PDO('mysql:host=localhost;dbname=app;charset=utf8mb4', $user, $pass);
-- SQL: table with correct charset
CREATE TABLE posts (
    id   INT AUTO_INCREMENT PRIMARY KEY,
    body TEXT
) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
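For a table that already exists with the older charset, the fix is a conversion rather than a new CREATE TABLE. The sketch below only builds the statement text; buildConvertStatement() and quoteIdentifier() are hypothetical helper names, and the table name 'posts' is taken from the example above.

```php
<?php
// Quotes a MySQL identifier with backticks, doubling any embedded backtick.
function quoteIdentifier(string $name): string
{
    return '`' . str_replace('`', '``', $name) . '`';
}

// Hypothetical helper: builds the statement that converts one table,
// including its existing character columns, to utf8mb4.
function buildConvertStatement(string $table, string $collation = 'utf8mb4_unicode_ci'): string
{
    // The collation is interpolated unquoted, so accept only utf8mb4_* names.
    if (preg_match('/^utf8mb4_[a-z0-9_]+$/', $collation) !== 1) {
        throw new InvalidArgumentException("Unexpected collation: $collation");
    }
    return sprintf(
        'ALTER TABLE %s CONVERT TO CHARACTER SET utf8mb4 COLLATE %s',
        quoteIdentifier($table),
        $collation
    );
}

echo buildConvertStatement('posts'), ";\n";
// prints: ALTER TABLE `posts` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
```

Running the generated statement rewrites the table, so it is worth trying on a copy first. CONVERT TO may also widen column types where needed (for example TEXT to MEDIUMTEXT) so that the converted data still fits.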
Added
31 Mar 2026
⚡
DEV INTEL
Tools & Severity
🟡 Medium
⚙ Fix effort: Low
⚡ Quick Fix
Use charset=utf8mb4 in the DSN and convert existing tables with ALTER TABLE ... CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci
📦 Applies To
PHP 5.3.6+ (earlier versions silently ignore the DSN charset parameter)
web
cli
🔗 Prerequisites
🔍 Detection Hints
charset=utf8 in DSN or SET NAMES utf8 without mb4
Auto-detectable:
✓ Yes
semgrep
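The detection hint can be approximated with a plain regular expression. This is a heuristic sketch in PHP, not a semgrep rule; usesLegacyUtf8() is a hypothetical helper name, and the pattern only covers the two forms named in the hint.

```php
<?php
// Hypothetical helper: flags source text that sets the connection charset to
// the 3-byte 'utf8' (or its alias 'utf8mb3') via a DSN or a SET NAMES query.
function usesLegacyUtf8(string $source): bool
{
    // Matches "charset=utf8" or "SET NAMES utf8" (optionally quoted).
    // The trailing \b rejects "utf8mb4" because "8m" is not a word boundary.
    $pattern = '/(?:charset\s*=\s*|SET\s+NAMES\s+[\'"]?)utf8(?:mb3)?\b/i';
    return preg_match($pattern, $source) === 1;
}

var_dump(usesLegacyUtf8("mysql:host=localhost;dbname=app;charset=utf8"));    // bool(true)
var_dump(usesLegacyUtf8("mysql:host=localhost;dbname=app;charset=utf8mb4")); // bool(false)
var_dump(usesLegacyUtf8("SET NAMES utf8"));                                  // bool(true)
```

A line-based scan like this can miss DSNs assembled from variables, so it complements a reviewer or a dedicated static-analysis rule rather than replacing one.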
⚠ Related Problems
🤖 AI Agent
Confidence: High
False Positives: Low
✓ Auto-fixable
Fix: Low
Context: Line