Memory-Mapped Files

Linux Advanced
debt(d9/e5/b5/t7)
d9 Detectability Operational debt — how invisible misuse is to your safety net

Closest to 'silent in production until users hit it' (d9). No automated detection tools are listed for this term. mmap misuse, such as random single-byte reads causing page fault storms or unflushed dirty pages being lost in a crash, produces no compile-time or linter warnings. Problems surface only as performance degradation or data integrity issues observed by end users under load.

e5 Effort Remediation debt — work required to fix once spotted

Closest to 'touches multiple files / significant refactor in one component' (e5). No quick fix is available. Correcting mmap misuse (e.g., switching from mmap to buffered read(), adjusting opcache.memory_consumption, or forcing writeback with msync()) requires understanding the I/O access pattern, changing how files are opened and read, and potentially changing configuration across deployments. This is more than a one-liner but is typically contained to one subsystem.

b5 Burden Structural debt — long-term weight of choosing wrong

Closest to 'persistent productivity tax' (b5). mmap choices affect how processes share memory (IPC), how opcache behaves across all FPM workers, and how large-file I/O is architected. This is not localised to a single call site — incorrect sizing or usage patterns impose a tax on every future developer who must reason about page faults, TLB pressure, and writeback semantics. However, it does not fully define the system's shape.

t7 Trap Cognitive debt — how counter-intuitive correct behaviour is

Closest to 'serious trap (contradicts how a similar concept works elsewhere)' (t7). The canonical misconception is explicit: developers assume mmap is always faster than read() because it avoids a syscall per read, but the page fault overhead and TLB pressure for random single reads can make it slower. This directly contradicts the widely-held mental model that memory access is inherently faster than I/O syscalls, making it a serious cognitive trap for competent developers unfamiliar with the nuance.

TL;DR

A file mapped directly into a process's virtual address space — reads and writes go through the OS page cache rather than read()/write() syscalls, enabling fast access to large files and shared memory between processes.

Explanation

Memory mapping (mmap() on Linux) creates a direct mapping between a file on disk and a region of virtual memory. When a process reads from that address range, the OS transparently loads the relevant page from disk via a page fault. Writes modify the page in memory, and the OS flushes them to disk asynchronously. This eliminates the copy between the kernel page cache and a user-space buffer that standard file I/O requires, and it removes the per-call syscall overhead of read() for large sequential reads. Multiple processes mapping the same file share the underlying pages, enabling efficient IPC (inter-process communication) without explicit message passing. PHP itself does not expose mmap directly, but opcache uses it internally to share the compiled opcode cache between FPM workers: all workers read from the same shared memory region rather than each maintaining a private copy. SQLite maps its WAL index (the -shm file) into shared memory in WAL mode and can optionally memory-map the database file via PRAGMA mmap_size. PHP code can reach mmap through FFI or C extensions for performance-critical work.
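As a minimal sketch of the read path described above, the following uses Python's standard `mmap` module (chosen for illustration because PHP does not expose mmap directly). The function name `count_lines_mapped` is hypothetical. It scans a read-only mapping without issuing a read() call per chunk; pages are faulted in by the OS as the scan touches them.

```python
import mmap


def count_lines_mapped(path: str) -> int:
    """Count newline bytes in a file by scanning a read-only mapping of it."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file. Mapping an empty file raises ValueError.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            count = 0
            pos = mm.find(b"\n")
            while pos != -1:
                count += 1
                pos = mm.find(b"\n", pos + 1)
            return count
```

Calling `count_lines_mapped("/var/log/syslog")` (an illustrative path) returns the number of newline bytes in that file. Whether this beats a buffered read loop depends on file size and cache state, so measure before committing to it.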

Watch Out

If another process truncates a mapped file, the next access to a page beyond the new end of file raises SIGBUS (bus error). Deleting (unlinking) the file does not: the mapping keeps the data alive until it is unmapped. Production systems that use mmap must handle SIGBUS or guard against concurrent truncation.
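One way to guard against concurrent truncation is to compare the file's current size with the range about to be touched. The sketch below assumes Python's standard `mmap` module; the helper name `read_if_still_backed` is hypothetical. The check narrows the window but does not close it, because the file can still shrink between the `fstat()` call and the access, so it complements advisory locking or a SIGBUS handler rather than replacing them.

```python
import mmap
import os
from typing import Optional


def read_if_still_backed(mm: mmap.mmap, fd: int, start: int, length: int) -> Optional[bytes]:
    """Return mm[start:start+length], or None if the file no longer covers that range."""
    end = start + length
    # Compare against both the mapping length and the file's current size on disk.
    if end > len(mm) or os.fstat(fd).st_size < end:
        return None  # touching pages past the new end of file could raise SIGBUS
    return mm[start:end]
```

Callers treat `None` as "the file changed underneath us" and remap or abort instead of dereferencing the stale range.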

Common Misconception

Memory-mapped files are not always faster than read() — for small files or random single reads, the page fault overhead and TLB pressure can make mmap slower than a simple file_get_contents().

Why It Matters

PHP opcache uses mmap to share compiled bytecode across all FPM workers. This explains why opcache.memory_consumption must be sized to fit the entire codebase: when the shared segment fills up, scripts that do not fit are silently recompiled on every request.

Common Mistakes

  • Assuming mmap is always the fastest I/O method — random single-byte reads across a large mapping cause repeated page faults that are slower than buffered read().
  • Mapping very large files on 32-bit systems — virtual address space is limited to ~3 GB, making large mappings impractical or impossible.
  • Not accounting for dirty page writeback latency — a write to a mapped page is not on disk until the OS flushes it, which is asynchronous by default.
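The writeback point in the last bullet can be sketched as follows, again assuming Python's standard `mmap` module; the function name `overwrite_header` is hypothetical. On Unix, `mmap.flush()` calls msync() and blocks until the dirty pages have been written back, so durability is requested explicitly instead of being left to the kernel's schedule.

```python
import mmap


def overwrite_header(path: str, header: bytes) -> bytes:
    """Overwrite the first len(header) bytes of a file through a mapping, then flush."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE) as mm:
            # Slice assignment must keep the same length; this only dirties the page.
            mm[:len(header)] = header
            # Without this call, the data reaches disk whenever the kernel decides.
            mm.flush()
    with open(path, "rb") as f:
        return f.read()
```

The header must not be longer than the file, since a mapping cannot grow through slice assignment.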

Avoid When

  • Avoid mmap for small files — the setup overhead and page table entries outweigh the benefit vs file_get_contents().
  • Do not use mmap for append-heavy workloads — extending a mapping requires re-mapping, which is expensive.

When To Use

  • Use mmap for large sequential reads of files that exceed available RAM — the OS streams pages in and evicts them transparently.
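  • Use shared anonymous mappings for IPC between related processes (a parent and the children it forks) on the same host where a message queue would add unnecessary overhead. Unrelated processes need a file-backed or named shared-memory mapping instead.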
  • Tune opcache.memory_consumption based on your codebase size — it uses mmap internally and insufficient size causes re-compilation.
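The shared anonymous mapping case can be illustrated with a short sketch, assuming a Unix system and Python's standard `mmap`, `os`, and `struct` modules; the function name `sum_in_child` is hypothetical. A child created by fork() writes its result directly into memory the parent can read, with no pipe or message queue involved.

```python
import mmap
import os
import struct


def sum_in_child(values: list) -> int:
    """Fork a child that writes a result into a shared anonymous mapping."""
    # fileno -1 requests an anonymous mapping; on Unix the default flag is
    # MAP_SHARED, so the pages stay shared with children created by fork().
    shared = mmap.mmap(-1, mmap.PAGESIZE)
    pid = os.fork()
    if pid == 0:
        # Child: store the sum as a signed 64-bit integer at offset 0, then
        # exit immediately without running the parent's cleanup handlers.
        struct.pack_into("q", shared, 0, sum(values))
        os._exit(0)
    os.waitpid(pid, 0)  # wait so the child's write is complete before reading
    (result,) = struct.unpack_from("q", shared, 0)
    shared.close()
    return result
```

Here `waitpid()` is the only synchronization. Processes that read and write concurrently would need a lock or semaphore on top of the mapping.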

Code Examples

💡 Note
The bad opcache config is too small: once the cache is full, PHP silently recompiles any script that does not fit on each request, which negates the cache for those scripts. The good config sizes memory to the codebase and disables per-request timestamp checking in production, so the cache must be reset (for example by restarting PHP-FPM) on every deploy.
✗ Vulnerable
; opcache too small — workers silently recompile scripts on every request:
opcache.memory_consumption = 64   ; MB — too small for a 500-file app
opcache.max_accelerated_files = 2000
✓ Fixed
; Size to fit entire compiled codebase + headroom:
; Check current usage: opcache_get_status()['memory_usage']
opcache.memory_consumption = 256  ; MB
opcache.max_accelerated_files = 10000
opcache.validate_timestamps = 0   ; production: disable stat() per request

Added 31 Mar 2026