
Sparse Matrix Representations

data_structures Advanced

Also Known As

sparse matrix, CSR, COO, compressed sparse row, DOK

TL;DR

COO, CSR, and DOK formats store only the non-zero entries of matrices where most values are zero, so recommendation systems and graphs do not have to hold terabytes of zeros in memory.

Explanation

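A dense 2D array takes O(m*n) space, which is wasteful when fewer than 1% of values are non-zero. COO (Coordinate) stores a (row, col, value) triple for each non-zero and is simple to build incrementally. CSR (Compressed Sparse Row) uses three arrays (the non-zero values, their column indices, and row pointers marking where each row starts), which gives fast row iteration and matrix-vector products. DOK (Dictionary of Keys) is a hash map of (row,col)→value, which gives fast access to individual elements. Applications: recommendation systems (user×item ratings), graph adjacency, NLP term-document matrices. A dense 1M users × 100K products matrix of 4-byte values requires 400GB; as CSR at 1% density it needs roughly 8GB (a 4-byte value plus a 4-byte column index for each of the 1 billion non-zeros, plus about 4MB of row pointers).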
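To make the three CSR arrays concrete, here is a minimal sketch that converts COO triples to CSR and runs a matrix-vector product over them. The function names `cooToCsr` and `csrMatVec` are hypothetical, chosen for illustration, and the sketch assumes PHP 7.4+ and that each (row, col) pair appears at most once.

```php
<?php
declare(strict_types=1);

/**
 * Convert COO triples [row, col, value] into CSR arrays.
 * Returns [values, colIdx, rowPtr]; rowPtr has numRows + 1 entries.
 */
function cooToCsr(array $triples, int $numRows): array
{
    // Sort by row, then column, so each row's entries are contiguous.
    usort($triples, fn(array $a, array $b) => [$a[0], $a[1]] <=> [$b[0], $b[1]]);

    $values = [];
    $colIdx = [];
    $rowCounts = array_fill(0, $numRows, 0);
    foreach ($triples as [$r, $c, $v]) {
        $values[] = $v;
        $colIdx[] = $c;
        $rowCounts[$r]++;
    }

    // rowPtr[i] is the offset where row i starts; the last entry is the total count.
    $rowPtr = [0];
    for ($i = 0; $i < $numRows; $i++) {
        $rowPtr[] = $rowPtr[$i] + $rowCounts[$i];
    }
    return [$values, $colIdx, $rowPtr];
}

/** Multiply a CSR matrix by a dense vector $x, touching only the non-zeros. */
function csrMatVec(array $values, array $colIdx, array $rowPtr, array $x): array
{
    $y = [];
    $numRows = count($rowPtr) - 1;
    for ($i = 0; $i < $numRows; $i++) {
        $sum = 0.0;
        for ($k = $rowPtr[$i]; $k < $rowPtr[$i + 1]; $k++) {
            $sum += $values[$k] * $x[$colIdx[$k]];
        }
        $y[] = $sum;
    }
    return $y;
}

// 3x3 example matrix: [[5, 0, 0], [0, 0, 2], [1, 0, 3]], given as unordered COO triples
$coo = [[2, 2, 3.0], [0, 0, 5.0], [1, 2, 2.0], [2, 0, 1.0]];
[$values, $colIdx, $rowPtr] = cooToCsr($coo, 3);
// values = [5.0, 2.0, 1.0, 3.0], colIdx = [0, 2, 0, 2], rowPtr = [0, 1, 2, 4]

$y = csrMatVec($values, $colIdx, $rowPtr, [1.0, 1.0, 1.0]);
// y = [5.0, 2.0, 4.0]
```

Row i occupies positions rowPtr[i] up to (but not including) rowPtr[i + 1], so iterating a row never scans entries belonging to other rows.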

Common Misconception

Sparse matrices are only for scientific computing. In fact, any large-scale recommendation system, social graph, or NLP model benefits from sparse representation.

Why It Matters

Dense storage for sparse data becomes infeasible at scale; sparse formats make recommendation systems and graph algorithms practical on normal hardware.

Common Mistakes

  • Using a dense matrix for sparse data, which makes memory use infeasible
  • Using the wrong format for the operation: COO (or DOK) suits building, CSR suits computation
  • Not checking sparsity before choosing a format
  • Building CSR element by element (each insert shifts the arrays) instead of building COO and converting once
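Checking sparsity before choosing a format can be as simple as the sketch below. The helper names (`density`, `denseBytes`, `csrBytes`) are hypothetical. The byte estimates assume 4-byte values and 4-byte indices in a compact native layout on 64-bit PHP 7.4+; PHP arrays themselves use more per element. Unlike a values-only count, the CSR estimate includes the column indices and row pointers.

```php
<?php
declare(strict_types=1);

/** Fraction of non-zero cells: nnz / (rows * cols). */
function density(int $nonZeroCount, int $rows, int $cols): float
{
    return $nonZeroCount / ($rows * $cols);
}

/** Bytes for a dense layout (assumed 4-byte values by default). */
function denseBytes(int $rows, int $cols, int $valueBytes = 4): int
{
    return $rows * $cols * $valueBytes;
}

/** Bytes for a CSR layout (assumed 4-byte values and 4-byte indices by default). */
function csrBytes(int $nonZeroCount, int $rows, int $valueBytes = 4, int $indexBytes = 4): int
{
    // One value and one column index per non-zero, plus rows + 1 row pointers.
    return $nonZeroCount * ($valueBytes + $indexBytes) + ($rows + 1) * $indexBytes;
}

$rows = 1_000_000;      // users
$cols = 100_000;        // products
$nnz  = 1_000_000_000;  // 1% of the 10^11 cells

$d     = density($nnz, $rows, $cols); // 0.01
$dense = denseBytes($rows, $cols);    // 400_000_000_000 bytes (400 GB)
$csr   = csrBytes($nnz, $rows);       // 8_004_000_004 bytes (about 8 GB)
```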

Code Examples

✗ Vulnerable
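// Dense matrix for user-item ratings, infeasible at this size:
$ratings = array_fill(0, 1000000, array_fill(0, 100000, 0.0));
// 1M * 100K cells * 8 bytes per float = 800 GB of raw values,
// before PHP's per-element array overhead. Copy-on-write only
// delays the cost until each row is first written.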
✓ Fixed
// DOK: only store non-zero ratings:
$ratings = [];
$ratings["42:1337"] = 4.5; // User 42 rated item 1337: 4.5 stars
// 1M users * 200 ratings avg = 200M entries
// Memory: 200M * ~30 bytes = 6 GB as a rough floor; PHP string-keyed
// entries typically cost more, but still far less than dense storage
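The per-entry byte figure above is an estimate, and the real cost depends on the PHP version and key length. A small sketch like the following measures it on the build at hand, using the built-in `memory_get_usage()`. The sample size and the synthetic keys are assumptions made for illustration.

```php
<?php
declare(strict_types=1);

// Measure the real cost of string-keyed DOK entries on this PHP build.
$n = 100_000; // sample size (hypothetical; pick what fits in memory)

$before = memory_get_usage();
$ratings = [];
for ($i = 0; $i < $n; $i++) {
    // Synthetic "user:item" keys, one rating each; keys are unique because of $i.
    $ratings[$i . ':' . ($i * 7 % 1000)] = 4.5;
}
$after = memory_get_usage();

$bytesPerEntry = ($after - $before) / $n;
printf("%d entries, ~%.0f bytes per entry\n", count($ratings), $bytesPerEntry);
```

Multiplying the measured bytes per entry by the expected number of non-zero ratings gives a more trustworthy capacity estimate than a fixed constant.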

Added 16 Mar 2026
Edited 22 Mar 2026
DEV INTEL Tools & Severity
🔵 Info ⚙ Fix effort: Medium
⚡ Quick Fix
Represent sparse matrices as associative arrays indexed by [row][col], storing only the non-zero values: a 1000x1000 matrix with 100 non-zero values needs only 100 entries, not 1,000,000
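One way the quick fix could look is sketched below: a small wrapper around nested associative arrays that reads missing cells as zero and never stores zeros. The class name `SparseMatrix` and its methods are hypothetical, and the sketch assumes PHP 7.4+ for typed properties.

```php
<?php
declare(strict_types=1);

/** Minimal sparse matrix keyed by [row][col]; missing cells read as 0.0. */
final class SparseMatrix
{
    /** @var array<int, array<int, float>> */
    private array $rows = [];

    public function set(int $row, int $col, float $value): void
    {
        if ($value === 0.0) {
            // Never store zeros: drop the cell, and the row once it is empty.
            if (isset($this->rows[$row])) {
                unset($this->rows[$row][$col]);
                if ($this->rows[$row] === []) {
                    unset($this->rows[$row]);
                }
            }
            return;
        }
        $this->rows[$row][$col] = $value;
    }

    public function get(int $row, int $col): float
    {
        return $this->rows[$row][$col] ?? 0.0;
    }

    /** Non-zero cells of one row as col => value. */
    public function row(int $row): array
    {
        return $this->rows[$row] ?? [];
    }

    public function nonZeroCount(): int
    {
        return array_sum(array_map('count', $this->rows));
    }
}

$m = new SparseMatrix();
$m->set(42, 1337, 4.5); // user 42 rated item 1337
$m->set(42, 7, 3.0);
$m->set(42, 7, 0.0);    // writing a zero deletes the entry
```

Nesting by row keeps all of one user's ratings together, so reading a whole row does not require scanning every key.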
📦 Applies To
any web cli
🔍 Detection Hints
Large 2D array mostly zeros wasting memory; graph adjacency matrix for sparse graph wasting O(V²) space
Auto-detectable: ✗ No · Tools: blackfire, php-meminfo