Fuzzy Search
Also Known As
typo tolerance
Levenshtein distance
approximate matching
edit distance
TL;DR
Matching strings that are similar but not identical — tolerating typos, transpositions, and misspellings using edit distance algorithms.
Explanation
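Fuzzy search ranks candidates by edit distance. Levenshtein distance is the minimum number of single-character insertions, deletions, or substitutions needed to turn one string into another; the Damerau-Levenshtein variant also counts a swap of two adjacent characters as a single edit. A maximum distance of 1 tolerates one typo; a maximum of 2 tolerates two. Elasticsearch's fuzzy query and the built-in typo tolerance in Meilisearch and Typesense handle this automatically. In PHP, levenshtein() returns the edit distance, while similar_text() returns a similarity score (the number of matching characters, optionally as a percentage) rather than a distance. Trigram indexes (PostgreSQL pg_trgm) make fuzzy matching index-backed inside the database.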
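To make the difference between the two PHP functions concrete, here is a minimal sketch. The helper name compareStrings() is hypothetical; levenshtein() and similar_text() are the standard PHP built-ins.

```php
<?php
declare(strict_types=1);

/**
 * Returns both measures for a pair of strings.
 * Hypothetical helper, for illustration only.
 */
function compareStrings(string $typed, string $intended): array
{
    // similar_text() writes the percentage into its third, by-reference argument.
    $matching = similar_text($typed, $intended, $percent);

    return [
        'distance' => levenshtein($typed, $intended), // edits needed: lower is closer
        'matching' => $matching,                      // matching characters: higher is closer
        'percent'  => $percent,                       // similarity as a percentage
    ];
}

$pairs = [['seach', 'search'], ['recieve', 'receive'], ['phyton', 'python']];
foreach ($pairs as [$typed, $intended]) {
    $r = compareStrings($typed, $intended);
    printf(
        "%-8s -> %-8s distance=%d similarity=%.1f%%\n",
        $typed,
        $intended,
        $r['distance'],
        $r['percent']
    );
}
```

The three pairs have Levenshtein distances of 1, 2, and 2. 'recieve' is an adjacent swap, so plain Levenshtein charges two substitutions where a Damerau-Levenshtein implementation would charge one.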
Diagram
flowchart LR
QUERY[User types phyton] --> FUZZY{Fuzzy matching}
FUZZY -->|Levenshtein distance| EDIT[Edit distance = 2<br/>h moved two positions]
EDIT --> MATCH[Matches: python]
subgraph Trigram Similarity
TRI[Split into trigrams<br/>php = __p _ph php hp_]
OVER[Overlap score<br/>phyton vs python = 0.17]
TRI --> OVER --> RESULT[Ranked matches]
end
subgraph Phonetic
SOUND[Soundex Metaphone<br/>similar sounding words]
end
subgraph Tools
MEIL[Meilisearch - built-in typo tolerance]
PG[PostgreSQL pg_trgm extension]
ES[Elasticsearch fuzzy query]
end
style MATCH fill:#238636,color:#fff
style RESULT fill:#238636,color:#fff
style MEIL fill:#1f6feb,color:#fff
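The trigram branch of the diagram can be sketched in a few lines of PHP. This is a single-word approximation of how pg_trgm builds trigrams (lowercase, two spaces of padding in front, one behind) and scores overlap as shared trigrams divided by the union. The function names are hypothetical, and the real extension also splits on non-alphanumeric characters, which this sketch does not.

```php
<?php
declare(strict_types=1);

/**
 * Approximates the pg_trgm trigram set for a single word.
 * Hypothetical helper: the real extension also handles multi-word input.
 */
function trigrams(string $word): array
{
    // Two leading spaces and one trailing space, as pg_trgm pads each word.
    $padded = '  ' . strtolower($word) . ' ';
    $list = [];
    for ($i = 0; $i <= strlen($padded) - 3; $i++) {
        $list[] = substr($padded, $i, 3);
    }

    // Keep each distinct trigram once.
    return array_values(array_unique($list));
}

/**
 * Shared trigrams divided by the size of the union (a Jaccard-style score).
 */
function trigramSimilarity(string $a, string $b): float
{
    $ta = trigrams($a);
    $tb = trigrams($b);
    $shared = count(array_intersect($ta, $tb));
    $union = count($ta) + count($tb) - $shared;

    return $shared / $union;
}

printf("phyton vs python: %.2f\n", trigramSimilarity('phyton', 'python'));
printf("seach vs search:  %.2f\n", trigramSimilarity('seach', 'search'));
```

Under this approximation 'phyton' and 'python' share only 2 of 12 distinct trigrams, while 'seach' and 'search' share 4 of 9. Trigram similarity and edit distance can therefore rank the same typo quite differently, which is worth checking against whatever similarity threshold the database is configured with.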
Common Misconception
✗ "Fuzzy search matches everything loosely." In practice, good fuzzy search is capped at distance 1-2, which catches real typos without pulling in unrelated words.
Why It Matters
Users mistype queries: 'seach' for 'search', 'recieve' for 'receive'. Without fuzzy matching they see zero results for a query you could have served; with it, failed searches become successful ones.
Common Mistakes
- Fuzzy distance too high: distance 3+ matches too many unrelated terms, reducing relevance.
- Fuzzy matching on every field: apply fuzzy only to free-text fields, not IDs or structured data.
- Not using AUTO fuzziness: Elasticsearch's AUTO (equivalent to AUTO:3,6) requires an exact match for terms of 0-2 characters, allows distance 1 for 3-5 characters, and distance 2 for 6 or more.
- Levenshtein in PHP application code on every row: each query is compared against all n documents, so cost grows linearly with the table and no index can help; use indexed fuzzy search.
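The AUTO rule from the list above is simple enough to express as a function. This is only a sketch of the length-to-distance mapping, not Elasticsearch code; the name autoFuzziness() and its parameters are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Maps a term length to the maximum edit distance allowed under an
 * AUTO:low,high style rule. Hypothetical helper, for illustration only.
 */
function autoFuzziness(int $termLength, int $low = 3, int $high = 6): int
{
    if ($termLength < $low) {
        return 0; // very short terms must match exactly
    }
    if ($termLength < $high) {
        return 1; // mid-length terms tolerate one edit
    }

    return 2;     // longer terms tolerate two edits
}

foreach (['of', 'php', 'seach', 'phyton'] as $term) {
    printf("%-7s -> max distance %d\n", $term, autoFuzziness(strlen($term)));
}
```

Scaling the allowed distance with term length is what keeps short terms from matching nearly everything: two edits on a three-letter word can turn it into almost any other three-letter word.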
Code Examples
✗ Vulnerable
// PHP Levenshtein on all rows — O(n), unusable at scale:
$query = 'seach';
$results = $db->query('SELECT * FROM products')->fetchAll();
$fuzzyResults = array_filter($results, function($product) use ($query) {
return levenshtein($query, strtolower($product['name'])) <= 2;
});
// Scans all products in PHP — not viable for large datasets
✓ Fixed
// Elasticsearch fuzzy query — indexed, fast:
$query = [
'query' => [
'match' => [
'name' => [
'query' => $searchTerm,
'fuzziness' => 'AUTO', // AUTO:3,6 — sensible defaults
'prefix_length' => 2, // First 2 chars must match exactly
]
]
]
];
// PostgreSQL pg_trgm for simpler setups (enable the extension once per database):
// CREATE EXTENSION IF NOT EXISTS pg_trgm;
// CREATE INDEX idx_products_name_trgm ON products USING gin(name gin_trgm_ops);
// SELECT * FROM products WHERE name % 'seach' ORDER BY name <-> 'seach' LIMIT 10;
Added
15 Mar 2026
Edited
22 Mar 2026
⚡
DEV INTEL
Tools & Severity
🟡 Medium
⚙ Fix effort: Medium
⚡ Quick Fix
Use Meilisearch's typo tolerance (it's on by default), or use levenshtein() for small in-memory lists in PHP. Don't rely on LIKE '%term%': it only finds exact substrings, so a misspelled term returns nothing.
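For the small-list case, a "did you mean" lookup might look like the sketch below. The function name closestTerm() and the sample vocabulary are hypothetical, and the approach only suits a vocabulary small enough to hold in memory.

```php
<?php
declare(strict_types=1);

/**
 * Returns the vocabulary entry closest to $query, or null when nothing
 * is within $maxDistance edits. Hypothetical helper for small lists only.
 */
function closestTerm(string $query, array $vocabulary, int $maxDistance = 2): ?string
{
    $best = null;
    $bestDistance = $maxDistance + 1; // anything above the cap is rejected
    $needle = strtolower($query);

    foreach ($vocabulary as $term) {
        $distance = levenshtein($needle, strtolower($term));
        if ($distance < $bestDistance) {
            $best = $term;
            $bestDistance = $distance;
        }
    }

    return $best;
}

$vocabulary = ['search', 'python', 'receive'];
var_dump(closestTerm('seach', $vocabulary));
var_dump(closestTerm('xyzzy', $vocabulary));
```

With this vocabulary, 'seach' resolves to 'search' and 'xyzzy' resolves to null because no entry is within two edits. On ties the first entry in the list wins, so ordering the vocabulary by popularity is one reasonable choice.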
📦 Applies To
any
web
api
🔍 Detection Hints
Exact-match-only search with no typo tolerance; users getting no results for misspelled queries; SOUNDEX() or a user-defined levenshtein() function evaluated per row in MySQL, forcing a full scan
Auto-detectable:
✗ No
meilisearch
elasticsearch
typesense
🤖 AI Agent
Confidence: Low
False Positives: Medium
✗ Manual fix
Fix: Medium
Context: File
Tests: Update