Test Pyramid
debt(d7/e5/b5/t3)
Closest to 'only careful code review or runtime testing' (d7). The detection_hints specify automated: no, and the code_pattern is 'Test suite taking >10 minutes dominated by E2E tests; slow brittle CI pipeline.' Tools like phpunit, pest, playwright, and cypress can surface symptoms (slow runs, flaky results), but no tool automatically flags that the pyramid is inverted — a developer or team lead must interpret the suite's composition and CI feedback. This is squarely a code-review or observation problem, not a linter catch.
Closest to 'touches multiple files / significant refactor in one component' (e5). The quick_fix gives a 70/20/10 ratio target, but achieving it when an inverted pyramid exists means rewriting or replacing many E2E tests with integration and unit tests across multiple test files and potentially multiple components. It is not a one-line swap, but it is not necessarily a full architectural rework either — it is a significant refactor within the test suite spanning multiple files.
Closest to 'persistent productivity tax' (b5). The term applies_to web, cli, and queue-worker contexts, so an inverted pyramid affects every CI run and every development workflow across the whole application. A slow, brittle suite slows down many work streams (every PR, every deployment gate), but it does not redefine the system's architecture — it is a persistent productivity tax rather than a gravitational pull on design decisions.
Closest to 'minor surprise (one edge case)' (t3). The misconception is specific and bounded: developers believe the pyramid prescribes exactly three test types rather than understanding it as a proportionality guideline. This is a common but relatively shallow misreading — competent developers who encounter the concept usually correct quickly once they read further. The common_mistakes (all E2E, no integration, mocking everything) follow from this misunderstanding but are not catastrophic architectural errors.
Also Known As
TL;DR
Explanation
Mike Cohn's Test Pyramid describes a recommended distribution of test types. Unit tests (base) are numerous, fast, and cheap to maintain; they test logic in isolation. Integration tests (middle) are fewer, exercise component interactions (real database, HTTP), and run in seconds to minutes. End-to-end / UI tests (top) are the fewest; they test full user flows through a browser or API client and are slow and brittle. Inverting the pyramid (many E2E tests, few unit tests) creates slow, fragile CI pipelines. Contract tests and mutation testing complement the pyramid: contract tests check the interface agreed between services, and mutation testing measures how well the existing tests detect injected faults.
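The difference between the two lower layers can be sketched in plain PHP. This is a minimal illustration, not a prescribed setup: `applyDiscount` and `OrderRepository` are hypothetical names invented here, the checks are plain boolean comparisons rather than PHPUnit test methods, and an in-memory SQLite database (which assumes the `pdo_sqlite` extension and PHP 8.0+) stands in for the "real database".

```php
<?php
declare(strict_types=1);

// Hypothetical pricing rule: pure logic with no I/O, so it sits at the unit layer.
function applyDiscount(int $cents, int $percent): int
{
    if ($percent < 0 || $percent > 100) {
        throw new InvalidArgumentException('percent must be between 0 and 100');
    }
    return intdiv($cents * (100 - $percent), 100);
}

// Hypothetical repository: issues real SQL, so it sits at the integration layer.
final class OrderRepository
{
    public function __construct(private PDO $pdo) {}

    public function save(int $totalCents): int
    {
        $stmt = $this->pdo->prepare('INSERT INTO orders (total_cents) VALUES (:total)');
        $stmt->execute(['total' => $totalCents]);
        return (int) $this->pdo->lastInsertId();
    }

    public function totalFor(int $id): ?int
    {
        $stmt = $this->pdo->prepare('SELECT total_cents FROM orders WHERE id = :id');
        $stmt->execute(['id' => $id]);
        $value = $stmt->fetchColumn();
        return $value === false ? null : (int) $value;
    }
}

// Unit-level check: touches nothing outside the function.
$unitOk = applyDiscount(1000, 10) === 900;

// Integration-level check: runs the SQL against an in-memory SQLite database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE orders (id INTEGER PRIMARY KEY, total_cents INTEGER NOT NULL)');
$repo = new OrderRepository($pdo);
$id = $repo->save(applyDiscount(1000, 10));
$integrationOk = $repo->totalFor($id) === 900;
```

The unit check needs no setup at all, while the integration check has to create a schema first. That setup cost is one reason the middle layer is kept smaller than the base.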
Diagram
flowchart BT
E2E[E2E Tests<br/>Few - slow, costly<br/>Playwright / Cypress]
INT[Integration Tests<br/>Some - medium speed<br/>Database, HTTP]
UNIT[Unit Tests<br/>Many - fast, cheap<br/>PHPUnit / Jest]
UNIT -->|foundation| INT -->|builds on| E2E
style E2E fill:#f85149,color:#fff
style INT fill:#d29922,color:#fff
style UNIT fill:#238636,color:#fff
Common Misconception
Why It Matters
Common Mistakes
- Relying entirely on E2E tests because they "test everything" — they are slow, flaky, and expensive to maintain.
- Having no integration tests at all — unit tests alone cannot catch interface mismatches between components.
- Writing unit tests that mock everything including the code under test — you are testing the mocks, not the logic.
- Measuring test quality by coverage percentage alone — 100% coverage with bad assertions is worthless.
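The "mocking the code under test" mistake can be shown with a small sketch. `PriceCalculator` is a hypothetical class with a deliberate bug, and the anonymous subclass is a hand-rolled stand-in for what a mocking library would generate; no specific library API is assumed.

```php
<?php
declare(strict_types=1);

// Hypothetical class under test, with a deliberate bug: tax is added twice.
class PriceCalculator
{
    public function withTax(int $cents): int
    {
        return $cents + $this->tax($cents) + $this->tax($cents); // bug
    }

    protected function tax(int $cents): int
    {
        return intdiv($cents * 20, 100);
    }
}

// Mistake: the "mock" overrides the very method being tested.
$mockedSubject = new class extends PriceCalculator {
    public function withTax(int $cents): int
    {
        return 1200; // canned answer; the real logic never runs
    }
};

// Passes, but only verifies the canned answer.
$mockedTestPasses = $mockedSubject->withTax(1000) === 1200;

// Better: run the real object and assert on its actual output.
// This check fails (1400 !== 1200), which is how the bug gets noticed.
$realTestPasses = (new PriceCalculator())->withTax(1000) === 1200;
```

Both checks execute `withTax`, so a coverage tool could count the method as exercised either way. Only the second check says anything about the logic.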
Avoid When
- An inverted pyramid with mostly E2E tests — slow, flaky, and expensive tests that break on unrelated changes.
- Skipping unit tests in favour of only integration tests, which are slower and make failures harder to pinpoint.
- Writing tests purely for coverage metrics without asserting meaningful behaviour.
- 100% unit test coverage with no integration tests — units can all pass while their interactions are broken.
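The last point, units passing while their interaction is broken, can be sketched as follows. Both functions are hypothetical and the mismatch (whole euros versus cents) is invented for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical producer: returns the price in whole euros.
function fetchPriceInEuros(): int
{
    return 12;
}

// Hypothetical consumer: expects the price in cents.
function formatCents(int $cents): string
{
    return sprintf('%d.%02d EUR', intdiv($cents, 100), $cents % 100);
}

// Each unit check passes in isolation.
$producerUnitOk = fetchPriceInEuros() === 12;
$consumerUnitOk = formatCents(1250) === '12.50 EUR';

// Only a check that wires the two together exposes the euros/cents mismatch:
// the combined result is '0.12 EUR', not the intended '12.00 EUR'.
$combined = formatCents(fetchPriceInEuros());
$integrationOk = $combined === '12.00 EUR';
```

The type system cannot help here because both sides use `int`; the disagreement is about meaning, which is the kind of defect the integration layer exists to catch.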
When To Use
- Structuring a new test suite — start with many fast unit tests, fewer integration tests, minimal E2E tests.
- Diagnosing a flaky test suite — a pyramid shape predicts a fast, reliable suite; an inverted pyramid predicts flakiness.
- Communicating testing strategy to a team — the pyramid is a shared mental model for where to invest test effort.
- Balancing confidence with speed: unit tests run in milliseconds and E2E tests in minutes, so the pyramid puts most checks in the fast layer and reserves E2E tests for critical paths.
Code Examples
// Inverted pyramid: mostly E2E, hardly any unit tests
class TestSuite {
// 5 unit tests
// 10 integration tests
// 200 Selenium E2E tests ← inverted
}
// CI takes 45 minutes — team stops running tests locally
// E2E failures are flaky — team ignores failures
// Correct pyramid:
// 300 unit tests (fast, stable)
// 50 integration tests
// 10 E2E tests (critical paths only)
# Rough ratios for a typical PHP web app:
# Unit tests: ~70% — fast, isolated, no I/O
# Integration tests: ~20% — hit DB/cache, use transactions
# E2E/acceptance: ~10% — Playwright/Panther, slowest
<!-- phpunit.xml: separate test suites for speed control -->
<testsuites>
    <testsuite name="unit">
        <directory>tests/Unit</directory>
    </testsuite>
    <testsuite name="integration">
        <directory>tests/Integration</directory>
    </testsuite>
    <testsuite name="e2e">
        <directory>tests/E2E</directory>
    </testsuite>
</testsuites>
# Run only fast tests in pre-commit hook:
$ phpunit --testsuite unit
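Since no tool flags an inverted pyramid automatically, a small script can at least make the suite's composition visible. The sketch below is one possible approach: `pyramidShares` and `isInverted` are hypothetical helpers, the counts are the ones from the examples above, and "inverted" is assumed to mean simply that E2E tests outnumber unit tests.

```php
<?php
declare(strict_types=1);

/**
 * Percentage share of each layer, rounded to one decimal place.
 *
 * @param array<string, int> $counts test counts keyed by layer name
 * @return array<string, float>
 */
function pyramidShares(array $counts): array
{
    $total = array_sum($counts);
    if ($total === 0) {
        return array_map(static fn (int $n): float => 0.0, $counts);
    }
    return array_map(
        static fn (int $n): float => round($n / $total * 100, 1),
        $counts
    );
}

/** Heuristic assumed here: inverted when E2E tests outnumber unit tests. */
function isInverted(array $counts): bool
{
    return ($counts['e2e'] ?? 0) > ($counts['unit'] ?? 0);
}

$inverted = ['unit' => 5, 'integration' => 10, 'e2e' => 200];
$healthy  = ['unit' => 300, 'integration' => 50, 'e2e' => 10];

foreach (['inverted' => $inverted, 'healthy' => $healthy] as $label => $counts) {
    $shares = pyramidShares($counts);
    printf(
        "%s: unit %.1f%%, integration %.1f%%, e2e %.1f%% (inverted: %s)\n",
        $label,
        $shares['unit'],
        $shares['integration'],
        $shares['e2e'],
        isInverted($counts) ? 'yes' : 'no'
    );
}
```

For the first suite the E2E share comes out at about 93%, against roughly 3% for the second. In practice the counts would have to come from the project itself, for example by counting test methods per suite directory; how to collect them is left open here.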