Regex with grep, sed & awk
debt(d7/e3/b3/t5)
Closest to 'only careful code review or runtime testing' (d7), misuse of grep/sed/awk (wrong tool, missing -E, unsafe -i) isn't flagged by automated tools — shellcheck catches some shell issues but not regex flavor mismatches or tool-choice errors.
Closest to 'simple parameterised fix' (e3), quick_fix shows swapping grep for awk or adding -E is a small command-line adjustment, not a multi-file refactor.
Closest to 'localised tax' (b3), applies_to is CLI only — choice of text tool affects scripts/ops workflows but doesn't shape the application architecture.
Closest to 'notable trap' (t5), the misconception that grep/sed/awk are interchangeable plus POSIX vs ERE regex differences are documented gotchas most sysadmins eventually learn.
Also Known As
TL;DR
Explanation
grep filters lines that match a pattern: grep -E for extended regex (ERE), grep -P for PCRE (where the build supports it), grep -v to invert the match, grep -r to search recursively.
sed is a stream editor: s/pattern/replacement/flags substitutes, d deletes, p prints (usually combined with -n to suppress the default output). sed -i edits the file in place; sed -i.bak does the same but keeps a backup copy.
awk does field-based processing: it splits each input line into fields ($1, $2, ...) and supports conditions, arithmetic, and custom output.
Essential combos: grep | awk for filter plus format, sed | grep for transform plus filter. Typical uses are PHP log analysis, deployment scripts, and server administration.
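As a rough illustration of the three roles described above, the sketch below runs each tool against a small made-up file. The three-column layout (IP, status, seconds) and the variable names are assumptions for the example only, not a real log format.

```shell
#!/bin/sh
# Hypothetical sample data: "IP status seconds" per line (not a real log format).
log=$(mktemp)
printf '%s\n' \
  '10.0.0.1 200 0.12' \
  '10.0.0.2 500 1.80' \
  '10.0.0.1 404 0.05' \
  '10.0.0.3 502 2.40' > "$log"

# grep -E: keep lines whose status is a 5xx code (ERE, so {2} needs no backslashes).
server_errors=$(grep -E ' 5[0-9]{2} ' "$log")

# grep -v: invert the match and count every line that is not a 200.
not_ok=$(grep -v ' 200 ' "$log" | wc -l | tr -d ' ')

# sed: substitute on every line, then delete lines matching a pattern.
masked=$(sed -e 's/^10\.0\.0\./x.x.x./' -e '/ 404 /d' "$log")

# awk: compare a field numerically and print selected fields.
slow=$(awk '$3 > 1.0 {print $1, $3}' "$log")

echo "$server_errors"
echo "$not_ok"
echo "$masked"
echo "$slow"
rm -f "$log"
```

Note that grep and sed treat the line as one string, while awk is the only one of the three that compares the third column as a number.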
Common Misconception
Why It Matters
Common Mistakes
- Using grep when awk is needed for field extraction — grep can't access specific fields.
- Running sed -i without a backup suffix such as .bak: in-place edits cannot be undone, so test the expression without -i first.
- Grepping binary files — add -a or convert to text first.
- Using + ? | in basic regex (BRE): these are ERE operators and are treated as literal characters by plain grep and sed; use grep -E or sed -E (egrep is deprecated).
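Two of the mistakes above (regex flavor and field extraction) can be seen in a minimal sketch like the following. The input lines are invented for the demonstration.

```shell
#!/bin/sh
# Hypothetical input lines, used only to illustrate the regex flavors.
input=$(printf '%s\n' 'error' 'warning' 'a+b' 'info')

# BRE (plain grep): | is a literal character, so no line matches.
# (GNU grep accepts \| in BRE as an extension.)
bre_alt=$(printf '%s\n' "$input" | grep -c 'error|warning')

# ERE (grep -E): | is alternation, so two lines match.
ere_alt=$(printf '%s\n' "$input" | grep -E -c 'error|warning')

# BRE: + is literal, so this matches only the line "a+b".
bre_plus=$(printf '%s\n' "$input" | grep 'a+b')

# Field extraction: grep returns whole lines, awk can print a single field.
second=$(printf '%s\n' 'alice 42 admin' | awk '{print $2}')

echo "$bre_alt $ere_alt $bre_plus $second"
# prints: 0 2 a+b 42
```

The same pattern string gives different results depending on the flavor, which is why a regex copied from another tool can silently match nothing.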
Code Examples
# Log analysis without proper tools:
# Manually reading 500MB nginx log to find slow requests
# grep 'ERROR' app.log | wc -l -- correct but misses details
# sed 's/foo/bar/' app.log > app.log -- empties the file! (the shell truncates app.log before sed reads it)
# grep: show the 100 most recent PHP fatal errors (no time filter is applied):
grep 'PHP Fatal error' /var/log/php/error.log | tail -100
# awk: list client IPs with more than 100 requests in the nginx log, busiest first:
awk '{print $1}' /var/log/nginx/access.log \
| sort | uniq -c | sort -rn | awk '$1 > 100'
# sed: replace DB host in config (with backup); dots are escaped so they match literally:
sed -i.bak 's/db\.old\.internal/db.new.internal/g' /var/www/app/.env
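# Combined: show slow requests (> 1s) with their URLs.
# Assumes a custom log_format that appends $request_time as the last field;
# in the default combined format $NF is part of the user agent string.
awk '$NF > 1.0 {print $7, $NF}' /var/log/nginx/access.log \
| sort -k2,2rn | head -20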
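The truncation problem noted earlier (redirecting sed output onto its own input file) is commonly avoided by writing to a temporary file first. The sketch below is one way to do that, assuming mktemp is available (it is not part of POSIX, though GNU and BSD systems ship it); the file name and contents are hypothetical.

```shell
#!/bin/sh
# Hypothetical file, created here so the sketch is self-contained.
dir=$(mktemp -d)
file="$dir/app.log"
printf '%s\n' 'foo one' 'foo two' > "$file"

# Unsafe: sed 's/foo/bar/' "$file" > "$file"
# The shell would truncate "$file" before sed opens it, leaving it empty.

# Safer: write to a temporary file in the same directory, then replace
# the original only if sed exited successfully.
tmp=$(mktemp "$dir/app.log.XXXXXX")
if sed 's/foo/bar/' "$file" > "$tmp"; then
  mv "$tmp" "$file"
else
  rm -f "$tmp"
fi

result=$(cat "$file")
echo "$result"
rm -rf "$dir"
```

One caveat with this approach: the replaced file takes on the temporary file's permissions and ownership, so a chmod or chown may be needed afterwards. sed -i.bak handles the common case with less ceremony.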