Vector Databases
Also Known As
pgvector
Pinecone
Weaviate
Qdrant
ANN search
TL;DR
Databases specialised for storing and querying high-dimensional vectors — enabling fast approximate nearest-neighbour search across millions of embeddings.
Explanation
Traditional databases cannot efficiently query 'find the 10 most similar vectors to this query vector' across millions of rows. Vector databases use specialised index structures (HNSW, IVF) for approximate nearest-neighbour (ANN) search. Options: Pinecone (managed), Weaviate (self-hosted or managed), Qdrant (Rust, self-hosted), pgvector (PostgreSQL extension — good starting point). For PHP applications, pgvector requires no new infrastructure and supports hybrid search (vector + SQL filters).
Common Misconception
✗ You need a dedicated vector database to use embeddings — pgvector adds vector search to PostgreSQL; for most applications it is sufficient without adding infrastructure complexity.
Why It Matters
Vector databases are the storage layer for RAG systems — without one, semantic search requires computing similarity against every stored embedding on every query, which is O(n) and unusable at scale.
Common Mistakes
- Not creating an index on the vector column — without HNSW or IVF index, queries are O(n) exact search.
- Storing vectors as JSON strings — use the native vector type (pgvector's vector type) for efficient indexing.
- Not filtering by metadata before vector search — combining SQL WHERE clauses with vector search (hybrid search) dramatically reduces the search space.
- Choosing a hosted vector DB before trying pgvector — pgvector handles millions of vectors adequately and eliminates an extra service.
Code Examples
✗ Vulnerable
// Exact search over all vectors — O(n), unusable at scale:
SELECT id, content,
embedding <=> $1 AS distance
FROM documents
ORDER BY distance
LIMIT 10;
-- No index: scans all rows, 1M docs = seconds per query
✓ Fixed
-- pgvector with HNSW index — O(log n) approximate search:
CREATE EXTENSION IF NOT EXISTS vector;
ALTER TABLE documents ADD COLUMN embedding vector(1536);
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
-- Hybrid search: filter then vector search:
SELECT id, content, embedding <=> $1 AS distance
FROM documents
WHERE category = 'security' -- SQL filter reduces search space
ORDER BY distance
LIMIT 10;
Tags
🤝 Adopt this term
£79/year · your link shown here
Added
15 Mar 2026
Edited
22 Mar 2026
Views
30
🤖 AI Guestbook educational data only
|
|
Last 30 days
Agents 1
No pings yesterday
Amazonbot 9
Perplexity 6
Unknown AI 2
Google 2
Ahrefs 2
ChatGPT 2
SEMrush 2
Also referenced
How they use it
crawler 23
crawler_json 2
Related categories
⚡
DEV INTEL
Tools & Severity
🟡 Medium
⚙ Fix effort: Medium
⚡ Quick Fix
Start with pgvector (PostgreSQL extension) if you already run Postgres — it avoids an extra service; move to a dedicated vector DB (Pinecone, Qdrant) only when you need ANN at scale
📦 Applies To
any
web
cli
🔗 Prerequisites
🔍 Detection Hints
Brute-force cosine similarity over entire embedding table without ANN index; pgvector without HNSW or IVFFlat index
Auto-detectable:
✗ No
⚠ Related Problems
🤖 AI Agent
Confidence: Low
False Positives: Medium
✗ Manual fix
Fix: High
Context: File
Tests: Update