
stat() System Call

Linux Intermediate
debt(d8/e3/b3/t7)
d8 Detectability Operational debt — how invisible misuse is to your safety net

Closest to 'silent in production until users hit it' (d9), pulled to d8 because detection_hints.automated is 'no' and only a regex code_pattern exists. Confusing stat() with lstat(), reading from a stale stat cache, and TOCTOU races raise no error: the calls succeed silently and the problem surfaces as wrong metadata or race bugs in production.

e3 Effort Remediation debt — work required to fix once spotted

Closest to 'simple parameterised fix' (e3), because quick_fix is a targeted swap — use lstat()/fstat() in the right place and add clearstatcache() before re-reading. It's slightly more than a one-line patch since several call sites and cache invalidation points may need touching.

b3 Burden Structural debt — long-term weight of choosing wrong

Closest to 'localised tax' (b3), since stat() usage is typically confined to filesystem-handling components even though applies_to spans web/cli/queue/library. It does not shape the whole system, but file-metadata logic carries a persistent correctness tax where used.

t7 Trap Cognitive debt — how counter-intuitive correct behaviour is

Closest to 'serious trap' (t7), grounded in the misconception that stat() opens/reads the file — it only reads inode metadata and succeeds without read permission. Combined with stat()-vs-lstat() symlink confusion, stale-cache filemtime, and ctime-as-creation-time mistakes, the 'obvious' assumptions are reliably wrong.


Also Known As

stat lstat fstat statx

TL;DR

The stat family of syscalls retrieves file metadata — size, permissions, timestamps, owner, inode — without reading the file's contents.

Explanation

stat() and its relatives (lstat(), fstat(), and the modern statx()) ask the kernel for a file's metadata: its size in bytes, its permission bits and type (st_mode), owner and group IDs (st_uid/st_gid), link count, inode number (st_ino), device ID, and three timestamps: access (atime), modification (mtime), and inode change (ctime). None of these calls open or read file data; they return a struct filled in from the file's inode.

The difference between the variants matters: stat() follows symlinks and reports the target, lstat() reports the symlink itself, and fstat() works on an already-open file descriptor. statx() (Linux 4.11+) adds birth time and a request mask, so the caller can ask only for the fields it needs and the kernel can skip expensive ones on network filesystems.

In PHP, the stat() / lstat() / fstat() functions and the convenience wrappers filesize(), filemtime(), is_dir(), and is_file() all sit on top of these syscalls. PHP caches the result of the most recent path-based stat and lstat lookup, so repeated calls on the same path within one script run can return stale metadata after the file changes; clearstatcache() exists to discard that cached result.

Tools like 'ls -l', 'find', 'du', and 'rsync' lean heavily on stat to decide what to display, match, or skip. Understanding stat helps you reason about performance (a long listing of 100,000 files costs roughly 100,000 stat calls) and about correctness: checking metadata is not the same as checking content, and a stat result can be invalidated by another process the instant after it returns.
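
The stat()/lstat() distinction can be seen with a short sketch. It assumes a POSIX system where symlink() works; the helper fileTypeName() and the temporary paths are hypothetical names introduced only for this illustration.

```php
<?php
// Hypothetical helper: map the file-type bits of st_mode to a readable name.
function fileTypeName(int $mode): string
{
    switch ($mode & 0xF000) {                  // S_IFMT mask
        case 0x8000: return 'regular file';    // S_IFREG
        case 0x4000: return 'directory';       // S_IFDIR
        case 0xA000: return 'symlink';         // S_IFLNK
        default:     return 'other';
    }
}

// Build a throwaway directory holding one file and one symlink to it.
$dir = sys_get_temp_dir() . '/stat_demo_' . uniqid();
mkdir($dir);
$target = "$dir/target.txt";
$link   = "$dir/link";
file_put_contents($target, 'hello');           // 5 bytes
symlink($target, $link);

$viaStat  = stat($link);    // follows the link: metadata of target.txt
$viaLstat = lstat($link);   // metadata of the link itself

echo fileTypeName($viaStat['mode']), ', ', $viaStat['size'], " bytes\n";
echo fileTypeName($viaLstat['mode']), "\n";

// Clean up the demo files.
unlink($link);
unlink($target);
rmdir($dir);
```

The same path yields two different answers: stat() describes the 5-byte regular file behind the link, while lstat() describes the link.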

Common Misconception

Many assume stat() opens or reads the file — it does not; it only reads the inode metadata, so it succeeds even on files you cannot read the contents of, as long as you have execute/search permission on the parent directories.
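
A small sketch of that point, assuming a POSIX filesystem and a temporary file created just for the demonstration: the file has every permission bit removed, yet stat() still reports its size. The open at the end is expected to fail for an ordinary user; root bypasses permission checks, so the sketch reports either outcome instead of assuming one.

```php
<?php
$path = tempnam(sys_get_temp_dir(), 'stat_');
file_put_contents($path, 'secret');    // 6 bytes
chmod($path, 0000);                    // remove every permission bit
clearstatcache(true, $path);

// stat() needs only search permission on the parent directories.
$meta  = stat($path);
$size  = $meta['size'];
$perms = $meta['mode'] & 0777;
printf("size=%d perms=%04o\n", $size, $perms);

// Reading the contents is a separate permission check.
$fh = @fopen($path, 'rb');
echo $fh === false ? "open failed\n" : "open succeeded (running as root?)\n";
if ($fh !== false) {
    fclose($fh);
}

// Restore permissions and remove the demo file.
chmod($path, 0600);
unlink($path);
```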

Why It Matters

Metadata lookups dominate the cost of directory scans, builds, and sync tools, and a stale or misinterpreted stat result causes TOCTOU race bugs and incorrect cache invalidation in real applications.

Common Mistakes

  • Using stat() instead of lstat() when inspecting symlinks, so you read the target's metadata and miss broken or malicious links.
  • Relying on PHP's stat cache and getting stale filesize()/filemtime() values after the file changed without calling clearstatcache().
  • Treating a successful stat() as proof of readability — the file may exist but be unreadable, or change between the check and the open (TOCTOU).
  • Confusing ctime with file creation time. ctime is the inode change time; on Linux, only statx() exposes true birth time (btime), and only where the filesystem records it.
  • Doing one stat() per file in a tight loop over huge directories instead of using readdir with d_type or batching, causing severe I/O overhead.
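
The stale-cache mistake above can be sketched as follows. The temporary file is an assumption of the demo. The middle read may or may not return the old size depending on what PHP has cached, so the sketch prints it without relying on it; only the value read after clearstatcache() is guaranteed to be current.

```php
<?php
$path = tempnam(sys_get_temp_dir(), 'cache_');
file_put_contents($path, 'a');

$before = filesize($path);             // primes PHP's stat cache for $path
file_put_contents($path, 'abcdef');    // the file is now 6 bytes on disk

$maybeStale = filesize($path);         // may still report the cached size
clearstatcache(true, $path);           // discard the cached entry for this path
$fresh = filesize($path);              // forces a new stat lookup

echo "before=$before maybeStale=$maybeStale fresh=$fresh\n";
unlink($path);
```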

Avoid When

  • You need the file's actual contents — stat returns metadata only, so use read/file_get_contents instead.
  • You are scanning millions of entries and only need names or types — prefer readdir with d_type to avoid a stat per entry.
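
A names-only scan might look like the sketch below. PHP's directory functions return entry names rather than d_type, so this assumes that filtering by name is enough for the task. countBySuffix() is a hypothetical helper, and str_ends_with() requires PHP 8.0 or later.

```php
<?php
// Hypothetical helper: count entries whose name ends in $suffix, using names only.
function countBySuffix(string $dir, string $suffix): int
{
    $names = scandir($dir);
    if ($names === false) {
        throw new RuntimeException("cannot read directory $dir");
    }
    $count = 0;
    foreach ($names as $name) {
        // No filesize() or is_file() here: each would cost one stat per entry.
        if (str_ends_with($name, $suffix)) {
            $count++;
        }
    }
    return $count;
}

// Demo on a throwaway directory with three empty files.
$dir = sys_get_temp_dir() . '/scan_demo_' . uniqid();
mkdir($dir);
$files = ['a.log', 'b.log', 'c.txt'];
foreach ($files as $name) {
    touch("$dir/$name");
}

$logCount = countBySuffix($dir, '.log');
echo "log files: $logCount\n";

// Clean up the demo files.
foreach ($files as $name) {
    unlink("$dir/$name");
}
rmdir($dir);
```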

When To Use

  • You need size, timestamps, ownership, permissions, or inode number without opening or reading the file.
  • You must distinguish a symlink from its target — use lstat().
  • You already hold an open file descriptor and want race-free metadata — use fstat().
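
To illustrate why fstat() is race-free with respect to the path, the sketch below renames a file while a handle to it is open. It assumes a POSIX filesystem and uses a temporary file created for the demo. The handle keeps referring to the same inode, so fstat() reports the same device and inode before and after the rename.

```php
<?php
$old = tempnam(sys_get_temp_dir(), 'fstat_');
$new = $old . '.moved';
file_put_contents($old, 'payload');    // 7 bytes

$fh = fopen($old, 'rb');
if ($fh === false) {
    throw new RuntimeException("cannot open $old");
}
$first = fstat($fh);

rename($old, $new);                    // the path changes; the open file does not

$second = fstat($fh);                  // no path lookup: same underlying file
$sameFile = $first['ino'] === $second['ino']
    && $first['dev'] === $second['dev'];
echo $sameFile ? "same file\n" : "different file\n";
echo stream_get_contents($fh), "\n";

fclose($fh);
unlink($new);
```

A path-based stat($old) at the same point would fail, because the name no longer exists, even though the data is still readable through the handle.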

Code Examples

✗ Vulnerable
<?php
// Reads target metadata, not the symlink itself; cache may be stale
$size = filesize('/var/cache/report.json');
// ... another process truncates the file here ...
if ($size > 0) {
    // Stale: $size was measured before the file changed
    $data = file_get_contents('/var/cache/report.json');
    process($data); // may be empty now (TOCTOU)
}
✓ Fixed
<?php
// Inspect the link itself, refresh the cache, work on an open handle
$path = '/var/cache/report.json';
clearstatcache(true, $path);

$meta = lstat($path);
if ($meta === false) {
    throw new RuntimeException("cannot stat $path");
}
if (($meta['mode'] & 0xF000) === 0xA000) {
    throw new RuntimeException('refusing to follow symlink');
}

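// A symlink could still be swapped in between lstat() and fopen(), so
// re-check identity on the open handle: fstat() describes the exact file
// we will read, and its device/inode must match what lstat() saw.
$fh = fopen($path, 'rb');
if ($fh === false) {
    throw new RuntimeException("cannot open $path");
}
$fstat = fstat($fh);
if ($fstat === false || $fstat['dev'] !== $meta['dev'] || $fstat['ino'] !== $meta['ino']) {
    fclose($fh);
    throw new RuntimeException("$path changed between lstat() and fopen()");
}
if ($fstat['size'] > 0) {
    $data = stream_get_contents($fh);
    process($data);
}
fclose($fh);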

Added 18 Jun 2026
DEV INTEL Tools & Severity
🔵 Info ⚙ Fix effort: Low
⚡ Quick Fix
Use lstat() for symlinks, fstat() on open descriptors, and call clearstatcache() before re-reading PHP file metadata that may have changed.
📦 Applies To
any web cli queue-worker library
🔗 Prerequisites
🔍 Detection Hints
filesize\(|filemtime\(|\bstat\(|\blstat\(|\bfstat\(
Auto-detectable: ✗ No
⚠ Related Problems
🤖 AI Agent
Confidence: Medium False Positives: Medium ✗ Manual fix Fix: Medium Context: Function Tests: Update
CWE-367 CWE-59