PHP Generators
debt(d7/e3/b3/t5)
Closest to 'only careful code review or runtime testing' (d7). The term's detection_hints explicitly state automated detection is 'no'. Identifying functions that return large arrays from files/DBs that should be generators requires manual code review or runtime memory profiling. No standard linter or SAST tool flags 'this array-returning function should be a generator' — it's a design choice visible only through review or when memory issues manifest at runtime.
Closest to 'simple parameterised fix' (e3). The quick_fix states 'Replace array-returning functions that process large datasets with generators using yield'. This is a mechanical transformation within a single function — change return statements to yield statements and adjust the return type. However, callers expecting arrays may need updates (foreach works, but direct array access doesn't), which can touch a few files but remains a localized refactor.
Closest to 'localised tax' (b3). Generators apply across web/cli/queue contexts per applies_to, but the choice to use a generator is typically localized to specific data-processing functions. It doesn't impose a system-wide architectural constraint — it's a per-function decision that affects only the immediate consumers of that function. The rest of the codebase remains unaffected.
Closest to 'notable trap' (t5). The misconception field explicitly states developers wrongly believe 'Generators are only useful for infinite sequences.' Additionally, common_mistakes reveals developers expect array returns, misuse iterator_to_array() defeating memory benefits, and miss yield from composition. These are documented gotchas that most PHP developers eventually learn through experience, but they contradict intuitions from array-centric PHP patterns.
Also Known As
TL;DR
Explanation
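A generator function uses yield to produce values one at a time, pausing execution at each yield until the caller asks for the next value. Calling the function does not run its body; it returns a Generator object, which implements Iterator. Because only the current value and the function's local state are held, memory use stays constant regardless of sequence length (provided each yielded value is itself bounded in size), which makes generators a good fit for large datasets, file streaming, and infinite sequences. yield from delegates to another generator or any other iterable (an array or Traversable). send() passes a value back into a paused generator, where it becomes the result of the yield expression. Generators are available from PHP 5.5; PHP 7.0 added yield from and generator return values, which are read with getReturn().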
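The features described above can be sketched as follows, assuming PHP 7.0 or later. The function names countTo, combined, and runningTotal are made up for illustration and do not come from any library.

```php
<?php
// Hypothetical helper: yields 1..$n, then returns a summary value (PHP 7.0+).
function countTo(int $n): \Generator {
    for ($i = 1; $i <= $n; $i++) {
        yield $i;          // execution pauses here until the next value is requested
    }
    return $n;             // read later with getReturn()
}

// yield from delegates to an inner generator and evaluates to its return value.
function combined(): \Generator {
    $count = yield from countTo(3);
    yield "done after $count values";
}

// send() resumes the generator; the sent value becomes the result of `yield`.
function runningTotal(): \Generator {
    $total = 0;
    while (true) {
        $amount = yield $total;
        $total += $amount;
    }
}

$values = [];
foreach (combined() as $value) {
    $values[] = $value;
}

$gen = countTo(2);
foreach ($gen as $ignored) {
    // drain the generator so that its return value becomes available
}
$returned = $gen->getReturn();

$acc = runningTotal();
$acc->current();                 // run up to the first yield
$afterFive = $acc->send(5);      // total is now 5
$afterTwelve = $acc->send(7);    // total is now 12
```

Note that getReturn() is only valid once the generator has finished, which is why countTo(2) is drained first.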
Common Misconception
Generators are only useful for infinite sequences. In practice they are just as useful for finite but large inputs such as files, database result sets, and paginated APIs, where building a full array would waste memory.
Why It Matters
Common Mistakes
- Calling a generator function and expecting an array — it returns a Generator object; use foreach or iterator_to_array().
- Using iterator_to_array() on a large generator — defeats the memory benefit by materialising the full sequence.
- Not using yield from when composing generators — manually looping and yielding is verbose and slower.
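A small sketch of the first and third mistakes, using made-up functions letters, digits, allManual, and allDelegated. iterator_to_array() appears here only because the sequences are tiny; its second argument is set to false because yield from keeps the inner generators' keys, which would otherwise collide.

```php
<?php
function letters(): \Generator {
    yield 'a';
    yield 'b';
}

function digits(): \Generator {
    yield 1;
    yield 2;
}

// Verbose composition: loop over each inner generator and re-yield by hand.
function allManual(): \Generator {
    foreach (letters() as $letter) {
        yield $letter;
    }
    foreach (digits() as $digit) {
        yield $digit;
    }
}

// Same sequence of values, composed with yield from.
function allDelegated(): \Generator {
    yield from letters();
    yield from digits();
}

// Calling a generator function gives back a Generator object, not an array.
$result = letters();
$isArray = is_array($result);
$isGenerator = $result instanceof \Generator;

// preserve_keys = false: yield from preserves inner keys (0, 1, 0, 1),
// so keeping keys would let later values overwrite earlier ones.
$manual = iterator_to_array(allManual(), false);
$delegated = iterator_to_array(allDelegated(), false);
```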
Avoid When
- Do not use generators when you need random access to elements — generators are forward-only.
- Avoid calling iterator_to_array() on a generator for large datasets — it loads the full sequence into memory.
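The forward-only restriction can be seen in a short sketch, assuming PHP 7.0 or later. firstThree is a hypothetical function used only for this demonstration.

```php
<?php
function firstThree(): \Generator {
    yield 'x';
    yield 'y';
    yield 'z';
}

$gen = firstThree();

// First pass works as expected.
$seen = [];
foreach ($gen as $item) {
    $seen[] = $item;
}

// A finished generator cannot be traversed again; PHP throws an Exception.
$secondPassFailed = false;
try {
    foreach ($gen as $item) {
        // never reached
    }
} catch (\Exception $e) {
    $secondPassFailed = true;
}

// Random access is not supported either; indexing a Generator throws an Error.
$indexAccessFailed = false;
try {
    $first = $gen[0];
} catch (\Error $e) {
    $indexAccessFailed = true;
}

// To iterate again, call the generator function again for a fresh Generator.
$fresh = firstThree();
$firstAgain = $fresh->current();
```

If the same data must be read several times or by index, an array (or a database query with an index) is usually the better choice.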
When To Use
- Use generators for processing large files, database result sets, or API pagination where loading everything at once would exhaust memory.
- Use generators when you need a lazy sequence — producing values only when requested.
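The pagination case might look like the sketch below. The paginate function and the $fakeApi closure are assumptions made for illustration: $fakeApi stands in for a real HTTP client and records which pages were requested so the laziness is visible.

```php
<?php
// Hypothetical contract: $fetchPage(int $page) returns a list of items,
// or an empty array once there are no more pages.
function paginate(callable $fetchPage): \Generator {
    for ($page = 1; ; $page++) {
        $items = $fetchPage($page);
        if ($items === []) {
            return;            // no more pages: end the sequence
        }
        yield from $items;     // hand out the items of this page one by one
    }
}

// Stand-in for a remote API with three pages of data.
$requestedPages = [];
$fakeApi = function (int $page) use (&$requestedPages): array {
    $requestedPages[] = $page;
    $data = [1 => ['a', 'b'], 2 => ['c', 'd'], 3 => ['e']];
    return $data[$page] ?? [];
};

$taken = [];
foreach (paginate($fakeApi) as $item) {
    $taken[] = $item;
    if (count($taken) === 3) {
        break;                 // stop early: page 3 is never requested
    }
}
```

Because each page is fetched only when the consumer reaches it, stopping after three items costs two requests instead of four.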
Code Examples
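// Before: loads the entire CSV into memory; large files can exceed memory_limit
function readCsvRows(string $file): array {
    $rows = [];
    $handle = fopen($file, 'r');
    if ($handle === false) {
        throw new \RuntimeException("Cannot open $file");
    }
    while (($row = fgetcsv($handle)) !== false) {
        $rows[] = $row; // every row stays in memory; 1M rows can mean hundreds of MB
    }
    fclose($handle);
    return $rows;
}

// After (replaces the function above): reads one row at a time, so memory use stays roughly constant
function readCsvRows(string $file): \Generator {
    $handle = fopen($file, 'r');
    if ($handle === false) {
        throw new \RuntimeException("Cannot open $file");
    }
    try {
        while (($row = fgetcsv($handle)) !== false) {
            yield $row;
        }
    } finally {
        fclose($handle); // also runs if the caller stops iterating early
    }
}

foreach (readCsvRows('million_rows.csv') as $row) {
    processRow($row); // only the current row is held in memory
}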