Abstract Syntax Tree (AST)
debt(d5/e5/b3/t5)
Closest to 'specialist tool catches it' (d5). The misuse of ASTs — specifically parsing PHP with regex instead of using php-parser — can be detected by tools like php-parser, phpstan, and rector (listed in detection_hints.tools). These are specialist tools, not default linters. The code_pattern hint (preg_match parsing PHP syntax or string-based code transformation) is something a SAST or custom rule could flag, but it won't be caught by a compiler or basic linter.
Closest to 'touches multiple files / significant refactor in one component' (e5). The quick_fix says 'use nikic/php-parser and operate on AST nodes' instead of regex — but replacing regex-based PHP parsing with proper AST-based parsing is not a one-line swap. It requires adding a dependency, rewriting the parsing logic to use NodeVisitor patterns, and handling multiple node types. This is a significant refactor within one component, potentially touching multiple files if regex parsing was used in several places.
Closest to 'localised tax' (b3). ASTs apply primarily to CLI/tooling contexts (applies_to: cli). Understanding and using ASTs is a localised concern — it affects the code analysis or transformation tool you're building, but doesn't impose a persistent tax on the rest of the codebase. It's a tooling choice, not an architectural one that shapes the entire system.
Closest to 'notable trap' (t5). The misconception states 'ASTs are only for compiler authors' — this is a documented gotcha that many PHP developers eventually learn when they encounter PHPStan, Rector, or PHP CS Fixer. Additionally, common_mistakes include confusing AST with CST, handling only a subset of node types, and mutating nodes incorrectly during traversal. These are real traps but they are well-documented and most competent developers learn them through normal usage rather than being catastrophically surprised.
Also Known As
TL;DR
Explanation
An Abstract Syntax Tree (AST) represents source code as a hierarchical tree of syntactic constructs such as expressions, statements, and declarations. The 'abstract' aspect means that details with no effect on structure (whitespace, formatting, redundant parentheses, and usually comments) are omitted, leaving only the constructs and how they nest. ASTs are produced by a parser from the token stream that the lexer emits, and they are the foundation for compilers, static analysers, linters, and code transformation tools. In PHP, nikic/php-parser generates the ASTs that power tools such as PHPStan, Psalm, and Rector; PHP CS Fixer, by contrast, works directly on the token stream. Understanding ASTs enables building custom static analysis rules, automated refactoring (codemods), and code generation tools.
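To make the tokenise-then-parse pipeline concrete, here is a minimal sketch of a parser for a toy arithmetic grammar. It is not PHP's grammar and not how nikic/php-parser is implemented. The function names (tokenize, parseExpr, parseTerm, parseFactor, evaluate) and the nested-array node shape are hypothetical choices for illustration. Note that the parentheses in the input steer parsing but leave no node in the tree.

```php
<?php
declare(strict_types=1);

// Hypothetical toy grammar (not PHP's): integers, '+', '*', parentheses.
//   expr   := term ('+' term)*
//   term   := factor ('*' factor)*
//   factor := INT | '(' expr ')'

/** Lexer: split the source into a flat token list. Whitespace is dropped;
 *  characters outside the toy grammar are silently skipped. */
function tokenize(string $src): array
{
    preg_match_all('/\d+|[+*()]/', $src, $m);
    return $m[0];
}

/** Parser entry point: builds a nested-array AST; $pos advances by reference. */
function parseExpr(array $tokens, int &$pos): array
{
    $node = parseTerm($tokens, $pos);
    while (($tokens[$pos] ?? null) === '+') {
        $pos++;
        $node = ['type' => 'BinaryOp', 'op' => '+', 'left' => $node, 'right' => parseTerm($tokens, $pos)];
    }
    return $node;
}

function parseTerm(array $tokens, int &$pos): array
{
    $node = parseFactor($tokens, $pos);
    while (($tokens[$pos] ?? null) === '*') {
        $pos++;
        $node = ['type' => 'BinaryOp', 'op' => '*', 'left' => $node, 'right' => parseFactor($tokens, $pos)];
    }
    return $node;
}

function parseFactor(array $tokens, int &$pos): array
{
    $tok = $tokens[$pos] ?? null;
    if ($tok === '(') {
        $pos++;                               // consume '('
        $node = parseExpr($tokens, $pos);
        if (($tokens[$pos] ?? null) !== ')') {
            throw new RuntimeException('Expected ")"');
        }
        $pos++;                               // consume ')'; no node is created for the parentheses
        return $node;
    }
    if ($tok === null || !ctype_digit($tok)) {
        throw new RuntimeException('Expected an integer');
    }
    $pos++;
    return ['type' => 'Int', 'value' => (int) $tok];
}

/** Tree walk: evaluate the AST recursively. */
function evaluate(array $node): int
{
    if ($node['type'] === 'Int') {
        return $node['value'];
    }
    $left = evaluate($node['left']);
    $right = evaluate($node['right']);
    return $node['op'] === '+' ? $left + $right : $left * $right;
}

$pos = 0;
$ast = parseExpr(tokenize('(1 + 2)   * 3'), $pos);
echo json_encode($ast), PHP_EOL;   // nested BinaryOp/Int nodes, no whitespace or parentheses
echo evaluate($ast), PHP_EOL;
```

The grouping that the parentheses expressed survives only as nesting: the root is the `*` node and the `+` node is its left child.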
Diagram
flowchart TD
SRC[SourceCode] --> LEXER[Lexer]
LEXER --> TOKENS[TokenStream]
TOKENS --> PARSER[Parser]
PARSER --> AST[AbstractSyntaxTree]
AST --> FUNC[FunctionDeclaration<br/>name:add]
FUNC --> PARAMS[Parameters<br/>a:int,b:int]
FUNC --> BODY[ReturnStatement]
BODY --> BINOP[BinaryExpr:+]
BINOP --> LEFT[Identifier:a]
BINOP --> RIGHT[Identifier:b]
AST --> PHPSTAN[PHPStan type check]
AST --> RECTOR[Rector refactor]
AST --> COMPILER[Compiler bytecode]
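The tree in the diagram can be inspected for real code. The sketch below assumes nikic/php-parser v5 is installed via Composer and autoloadable from vendor/autoload.php; it parses the add function from the diagram and prints the resulting tree with the library's NodeDumper. The diagram uses generic labels, so expect the library's own node type names (for example Stmt_Function and Expr_BinaryOp_Plus) in place of FunctionDeclaration and BinaryExpr.

```php
<?php
declare(strict_types=1);

require __DIR__ . '/vendor/autoload.php'; // assumption: nikic/php-parser v5 installed via Composer

use PhpParser\NodeDumper;
use PhpParser\ParserFactory;

// The function from the diagram, written out as PHP source.
$code = <<<'SRC'
<?php
function add(int $a, int $b) {
    return $a + $b;
}
SRC;

$parser = (new ParserFactory())->createForNewestSupportedVersion();
$ast = $parser->parse($code) ?? [];

// NodeDumper renders the tree as indented text, nesting child nodes under their parent.
$dump = (new NodeDumper())->dump($ast);
echo $dump, PHP_EOL;
```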
Watch Out
Common Misconception
Why It Matters
Common Mistakes
- Parsing PHP with regex — fails on real syntax like nested expressions, strings, and comments.
- Confusing AST with CST (Concrete Syntax Tree) — AST omits formatting details.
- Handling only a subset of node types — missing StaticCall or NullsafeMethodCall leads to incomplete analysis.
- Mutating nodes incorrectly during traversal instead of using proper NodeVisitor patterns.
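The first mistake in the list can be reproduced with nothing but PHP's built-in tokenizer (PhpToken, PHP 8.0 or later). This is a sketch, not a replacement for a parser: it only shows that a regex sees text, while even the lexer already knows which characters belong to a comment or a string. The function names ghost, phantom, and real are made up for the demonstration.

```php
<?php
declare(strict_types=1);

// Sample input: one real function, plus look-alikes in a comment and a string.
$code = <<<'SRC'
<?php
// function ghost() {}
$s = 'function phantom() {}';
function real() {}
SRC;

// Regex approach: matches raw text with no notion of comments or strings.
preg_match_all('/function\s+(\w+)\s*\(/', $code, $m);
$regexNames = $m[1];

// Token approach: only a T_FUNCTION keyword followed by a T_STRING name counts.
$tokens = PhpToken::tokenize($code);
$tokenNames = [];
foreach ($tokens as $i => $token) {
    if (!$token->is(T_FUNCTION)) {
        continue;
    }
    // Skip whitespace and comments between the keyword and the name.
    $j = $i + 1;
    while (isset($tokens[$j]) && $tokens[$j]->isIgnorable()) {
        $j++;
    }
    if (isset($tokens[$j]) && $tokens[$j]->is(T_STRING)) {
        $tokenNames[] = $tokens[$j]->text;
    }
}

echo 'regex:  ', implode(', ', $regexNames), PHP_EOL;  // ghost, phantom, real
echo 'tokens: ', implode(', ', $tokenNames), PHP_EOL;  // real
```

Token scanning still knows nothing about nesting, scopes, or which class a method belongs to, which is where an AST becomes necessary.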
Avoid When
- You only need simple text search without understanding syntax.
- Working with non-code structured data like JSON or XML (use native parsers instead).
When To Use
- Building static analysis tools or linters.
- Automating refactoring or codemods.
- Generating or transforming code programmatically.
- Understanding compiler or interpreter internals.
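As an illustration of the refactoring use case, the following sketch renames calls to one function. It assumes nikic/php-parser v5 installed via Composer; old_helper and new_helper are hypothetical names. The visitor changes the node in leaveNode and returns it, and the library's standard pretty printer turns the modified tree back into source.

```php
<?php
declare(strict_types=1);

require __DIR__ . '/vendor/autoload.php'; // assumption: nikic/php-parser v5 installed via Composer

use PhpParser\Node;
use PhpParser\NodeTraverser;
use PhpParser\NodeVisitorAbstract;
use PhpParser\ParserFactory;
use PhpParser\PrettyPrinter;

// Hypothetical input: one real call, and the same text inside a string literal.
$code = <<<'SRC'
<?php
echo old_helper(1), 'old_helper(2)';
SRC;

$renamer = new class extends NodeVisitorAbstract {
    public function leaveNode(Node $node)
    {
        // Only plain function calls with a static name; dynamic calls ($fn()) have an Expr name.
        if ($node instanceof Node\Expr\FuncCall
            && $node->name instanceof Node\Name
            && $node->name->toString() === 'old_helper'
        ) {
            $node->name = new Node\Name('new_helper');
            return $node;
        }
        return null; // null keeps the node as it is
    }
};

$parser = (new ParserFactory())->createForNewestSupportedVersion();
$stmts = $parser->parse($code) ?? [];

$traverser = new NodeTraverser();
$traverser->addVisitor($renamer);
$stmts = $traverser->traverse($stmts);

$output = (new PrettyPrinter\Standard())->prettyPrintFile($stmts);
echo $output, PHP_EOL;
```

The string literal is left alone because it is a different node type, which a text-level search and replace could not guarantee. The standard pretty printer regenerates formatting from the tree, so original layout is not preserved in this sketch.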
Code Examples
// ❌ Regex-based parsing: unreliable
preg_match_all('/function\s+(\w+)\s*\(/', $code, $matches);
// False matches inside comments and strings; no awareness of closures, arrow functions, or nesting

// ✅ Proper AST parsing using nikic/php-parser
use PhpParser\Error;
use PhpParser\Node;
use PhpParser\NodeTraverser;
use PhpParser\NodeVisitorAbstract;
use PhpParser\ParserFactory;

$parser = (new ParserFactory())->createForNewestSupportedVersion();

try {
    // With the default error handler, parse() throws PhpParser\Error on invalid syntax
    $ast = $parser->parse($code) ?? [];
} catch (Error $e) {
    exit('Parse error: ' . $e->getMessage() . PHP_EOL);
}

$traverser = new NodeTraverser();
$traverser->addVisitor(new class extends NodeVisitorAbstract {
    // Called once for each node as the traverser descends into it
    public function enterNode(Node $node): void {
        if ($node instanceof Node\Stmt\Function_) {
            echo 'Function: ' . $node->name->toString() . PHP_EOL;
        }
    }
});
$traverser->traverse($ast);
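The visitor above reports only named function declarations. The same pattern extends to the case raised under Common Mistakes, where an analysis looks at one call node type and misses the others. The sketch below, again assuming nikic/php-parser v5 installed via Composer, collects every call to a method named save (a hypothetical name) across ordinary, nullsafe, and static calls.

```php
<?php
declare(strict_types=1);

require __DIR__ . '/vendor/autoload.php'; // assumption: nikic/php-parser v5 installed via Composer

use PhpParser\Node;
use PhpParser\Node\Expr;
use PhpParser\NodeTraverser;
use PhpParser\NodeVisitorAbstract;
use PhpParser\ParserFactory;

// Hypothetical input covering the three call forms.
$code = <<<'SRC'
<?php
$a->save();
$b?->save();
Repo::save();
SRC;

$collector = new class extends NodeVisitorAbstract {
    /** @var list<string> node types of the matching calls, in source order */
    public array $found = [];

    public function enterNode(Node $node)
    {
        // Checking MethodCall alone would miss the nullsafe and static forms.
        $isCall = $node instanceof Expr\MethodCall
            || $node instanceof Expr\NullsafeMethodCall
            || $node instanceof Expr\StaticCall;

        // Dynamic names such as $obj->$method() are Expr nodes, not Identifier; skip them.
        if ($isCall
            && $node->name instanceof Node\Identifier
            && $node->name->toString() === 'save'
        ) {
            $this->found[] = $node->getType();
        }
        return null;
    }
};

$parser = (new ParserFactory())->createForNewestSupportedVersion();
$stmts = $parser->parse($code) ?? [];

$traverser = new NodeTraverser();
$traverser->addVisitor($collector);
$traverser->traverse($stmts);

echo implode(PHP_EOL, $collector->found), PHP_EOL;
```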