Elasticsearch Fundamentals

search PHP 7.0+ Intermediate

Also Known As

Elasticsearch, ES, elastic search, ELK stack, OpenSearch

TL;DR

A distributed search and analytics engine built on Apache Lucene. It stores documents as JSON, indexes them automatically, and exposes a REST API for full-text search, aggregations, and near-real-time analytics.

Explanation

Elasticsearch stores data as JSON documents in indexes (loosely analogous to database tables). Each index is split into primary shards that are distributed across the nodes of a cluster, and each primary shard can have one or more replica shards for fault tolerance. Every field of a document is indexed by default; field types are either inferred dynamically or declared up front in a mapping. Queries are expressed in JSON using the Query DSL: match queries for full-text search, term queries for exact values, bool queries for combining conditions, and aggregations for analytics.

PHP applications usually integrate in one of two ways: a synchronous index write on every database mutation (simple, but it adds latency to each request) or an asynchronous, queue-based indexing pipeline (more moving parts, but more reliable at scale). The official PHP client is elasticsearch/elasticsearch; ruflin/elastica is a simpler, community-maintained alternative.
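As a rough illustration of the Query DSL pieces named above, the sketch below builds a request that combines a match clause, a term filter, and a terms aggregation inside a bool query. The helper name `buildUserSearch`, the `status` field, and the `by_skill` aggregation name are assumptions made for this example; the terms aggregation also assumes `skills` is mapped as a keyword field.

```php
<?php
declare(strict_types=1);

/**
 * Build search parameters in the array shape the official client accepts.
 * Hypothetical helper: field names are illustrative, not from a real schema.
 */
function buildUserSearch(string $text, string $status, int $size = 10): array
{
    return [
        'index' => 'users',
        'body'  => [
            'size'  => $size,
            'query' => [
                'bool' => [
                    // must: full-text clause, contributes to the relevance score
                    'must'   => [
                        ['match' => ['bio' => $text]],
                    ],
                    // filter: exact-value clause, no scoring
                    'filter' => [
                        ['term' => ['status' => $status]],
                    ],
                ],
            ],
            // aggregation: document counts per skill (facet-style)
            'aggs' => [
                'by_skill' => [
                    'terms' => ['field' => 'skills', 'size' => 5],
                ],
            ],
        ],
    ];
}

$params = buildUserSearch('distributed systems', 'active');

// Print the JSON body that would be sent to the search endpoint.
echo json_encode($params['body'], JSON_PRETTY_PRINT), PHP_EOL;
```

With the official client, an array like this would be passed to `$client->search($params)`. Putting the exact-value condition under `filter` rather than `must` keeps it out of relevance scoring.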

Common Misconception

"Elasticsearch is a database and can replace the primary data store." In practice, Elasticsearch is a search index: newly written documents become searchable only after a short delay, there are no multi-document transactions, and it prioritises search performance over the guarantees expected of a system of record. Primary data should live in a relational or document database; Elasticsearch should contain a search-optimised projection of that data, kept in sync via events or a queue. Using Elasticsearch as the system of record risks data loss when the cluster has problems, and leaves nothing to rebuild the index from.
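The "database first, index via a queue" arrangement can be sketched as follows. This is a minimal illustration, not a production pipeline: `$db` and `$queue` are plain arrays standing in for a real database and message queue, and `saveUser`, `drainQueue`, and `SEARCH_FIELDS` are hypothetical names introduced here.

```php
<?php
declare(strict_types=1);

// Hypothetical allow-list of fields that may be copied into the search index.
const SEARCH_FIELDS = ['name', 'bio', 'skills'];

/**
 * Write to the primary store first, then enqueue an indexing job.
 * $db and $queue are in-memory stand-ins for a database and a message queue.
 */
function saveUser(array $user, array &$db, array &$queue)
{
    // 1. The primary store remains the system of record.
    $db[$user['id']] = $user;
    // 2. Enqueue only the id; the worker re-reads the row when it runs.
    $queue[] = ['op' => 'index', 'id' => $user['id']];
}

/**
 * Drain the queue and return the index requests a worker would send.
 */
function drainQueue(array $db, array &$queue): array
{
    $requests = [];
    while ($job = array_shift($queue)) {
        if (!isset($db[$job['id']])) {
            continue; // row vanished since it was enqueued; nothing to index
        }
        $row = $db[$job['id']];
        $requests[] = [
            'index' => 'users',
            'id'    => $row['id'],
            // Search-optimised projection: only allow-listed fields
            'body'  => array_intersect_key($row, array_flip(SEARCH_FIELDS)),
        ];
    }
    return $requests;
}

$db = [];
$queue = [];
saveUser(
    ['id' => 7, 'name' => 'Ada', 'bio' => 'Compiler engineer', 'skills' => ['php'], 'password_hash' => 'secret'],
    $db,
    $queue
);
$requests = drainQueue($db, $queue);
```

In a real worker, each element of `$requests` would be passed to `$client->index(...)`. Because the database holds the full record, the index can be rebuilt from it at any time.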

Why It Matters

Elasticsearch solves search problems that SQL databases handle poorly — multi-field full-text search with relevance ranking, faceted filtering with counts, autocomplete with typo tolerance, and analytics aggregations over millions of documents. For PHP applications that have outgrown LIKE queries or MySQL FULLTEXT, Elasticsearch provides a step-change in search quality and performance. The REST API means no PHP extension is required — any HTTP client works, making it straightforward to integrate.

Common Mistakes

  • Relying on dynamic mapping in production: Elasticsearch infers each field's type from the first value it sees for that field, which often produces the wrong type (for example, a numeric ID mapped as a number when it should be an exact-match keyword). Always define explicit mappings.
  • Not handling index refresh lag: newly indexed documents are not immediately searchable (the default refresh interval is 1 second); account for this in real-time applications.
  • Indexing entire database rows, including sensitive fields: index only the fields needed for search and display, not passwords, tokens, or PII.
  • Using Elasticsearch as the source of truth and skipping the primary database: always write to the database first, then index asynchronously.
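An explicit mapping for the first point might look like the sketch below. It assumes Elasticsearch 7 or later (mappings without a type name); the field types are guesses based on the `users` fields used in the code examples, the shard and replica counts are placeholders, and `usersIndexDefinition` is a hypothetical helper name.

```php
<?php
declare(strict_types=1);

/**
 * Parameters for creating a "users" index with an explicit mapping.
 * Assumption: Elasticsearch 7+ typeless mappings; field types are illustrative.
 */
function usersIndexDefinition(): array
{
    return [
        'index' => 'users',
        'body'  => [
            'settings' => [
                'number_of_shards'   => 1, // placeholder value
                'number_of_replicas' => 1, // placeholder value
            ],
            'mappings' => [
                // 'strict' rejects documents that contain unmapped fields
                'dynamic'    => 'strict',
                'properties' => [
                    'name'   => ['type' => 'text'],    // analysed for full-text search
                    'email'  => ['type' => 'keyword'], // exact-match only
                    'bio'    => ['type' => 'text'],
                    'skills' => ['type' => 'keyword'], // usable in filters and aggregations
                    'joined' => ['type' => 'date'],
                ],
            ],
        ],
    ];
}

$definition = usersIndexDefinition();

// Print the JSON body that would be sent when creating the index.
echo json_encode($definition['body'], JSON_PRETTY_PRINT), PHP_EOL;
```

With the official client this array would be passed to `$client->indices()->create($definition)` before any documents are indexed. Setting `dynamic` to `strict` also acts as a guard against accidentally indexing sensitive fields, since any unmapped field causes the write to be rejected.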

Code Examples

✗ Vulnerable
// Indexing entire user row including sensitive fields
$client->index([
    'index' => 'users',
    'id'    => $user['id'],
    'body'  => $user, // includes password_hash, api_token, 2fa_secret
]);
✓ Fixed
// Index only search-relevant fields
$client->index([
    'index' => 'users',
    'id'    => $user['id'],
    'body'  => [
        'name'     => $user['name'],
        'email'    => $user['email'], // only if searching by email
        'bio'      => $user['bio'],
        'skills'   => $user['skills'],
        'joined'   => $user['created_at'],
    ],
]);

// multi_match query: full-text search across several fields, ranked by relevance
$results = $client->search([
    'index' => 'users',
    'body'  => [
        'query' => [
            'multi_match' => [
                'query'  => $searchTerm,
                'fields' => ['name^2', 'bio', 'skills'], // name boosted
            ]
        ]
    ]
]);
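For the refresh lag listed under Common Mistakes, one option is the index API's `refresh` parameter, which accepts `wait_for` to hold the response until the document is visible to search. The sketch below only builds the request array; `buildIndexRequest` and its `$waitForRefresh` flag are hypothetical names for this example.

```php
<?php
declare(strict_types=1);

/**
 * Build index parameters, optionally waiting for the next refresh.
 * Hypothetical helper: only the request array is built, nothing is sent.
 */
function buildIndexRequest(array $doc, bool $waitForRefresh = false): array
{
    $params = [
        'index' => 'users',
        'id'    => $doc['id'],
        'body'  => [
            'name' => $doc['name'],
            'bio'  => $doc['bio'],
        ],
    ];

    if ($waitForRefresh) {
        // The response is held until a refresh makes the document searchable.
        $params['refresh'] = 'wait_for';
    }

    return $params;
}

// Read-your-own-write case: the user expects to see the edit in search results.
$params = buildIndexRequest(['id' => 42, 'name' => 'Ada', 'bio' => 'Compiler engineer'], true);
```

Waiting for a refresh adds latency to the write, so it is probably best reserved for flows where a user must immediately see their own change; bulk or background indexing can leave the parameter out.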

Added 23 Mar 2026
Edited 5 Apr 2026
DEV INTEL Tools & Severity
🔵 Info ⚙ Fix effort: High
⚡ Quick Fix
Define explicit mappings before indexing; write to your database first, then index asynchronously via a queue; use match queries for full-text search and term queries for exact values.
📦 Applies To
PHP 7.0+ web cli