
End-to-End Testing

Testing Intermediate
debt(d7/e7/b7/t7)
d7 Detectability Operational debt — how invisible misuse is to your safety net

Closest to 'only careful code review or runtime testing' (d7). The detection_hints note automated=no, and the code_pattern is 'test suite dominated by E2E tests; slow flaky CI pipeline >30 minutes' — this is visible only through CI metrics, code review of the test pyramid shape, or when the team notices chronic flakiness and slowness. Tools like Playwright/Cypress run the tests but don't flag over-reliance on E2E; no static analysis catches a bad test pyramid ratio automatically.

e7 Effort Remediation debt — work required to fix once spotted

Closest to 'cross-cutting refactor across the codebase' (e7). The quick_fix says to move logic to unit tests and restrict E2E to 5 critical journeys. Reversing a test suite dominated by E2E tests means rewriting many existing E2E scenarios as unit/integration tests, restructuring CI pipelines, and establishing new testing conventions across all teams — this is a cross-cutting effort touching many files, pipelines, and developer habits rather than a simple parameterised fix.

b7 Burden Structural debt — long-term weight of choosing wrong

Closest to 'strong gravitational pull' (b7). A test strategy dominated by E2E tests imposes a persistent productivity tax on every team member: slow CI blocks every PR, flaky tests require triage, and every new feature is expected to follow the same pattern. The applies_to scope is web broadly, and the common_mistakes confirm a high maintenance burden and distrust cascading across the whole suite. Not quite b9 (it doesn't rewrite the architecture), but every change is shaped by the slow feedback loop.

t7 Trap Cognitive debt — how counter-intuitive correct behaviour is

Closest to 'serious trap' (t7). The canonical misconception is explicitly stated: 'More E2E tests means better coverage.' It contradicts the test pyramid principle, yet it follows naturally from an intuition that is usually correct in other testing contexts: more tests are better. The trap is that E2E tests feel thorough but introduce fragility and slowness. This is a well-documented gotcha that developers repeatedly fall into, especially those coming from manual QA backgrounds or greenfield projects.


Also Known As

E2E testing, Playwright, Cypress, browser testing

TL;DR

Testing complete user flows through a real browser against a running application — verifying that all layers work together from UI to database.

Explanation

E2E tests (Playwright, Cypress) drive a real browser: they click buttons, fill forms, and assert page content. They test the full stack (frontend, API, backend, database) in a realistic environment. The trade-off is that they are slow (seconds per test), flaky (timing issues, external dependencies), and expensive to maintain. The test pyramid says E2E tests should be sparse: cover critical user journeys (checkout, signup, core workflow) and leave edge cases to unit and integration tests.

Diagram

flowchart LR
    PLAYWRIGHT[Playwright or Cypress] --> BROWSER2[Real browser<br/>Chromium Firefox]
    BROWSER2 --> CLICK[Click buttons fill forms<br/>assert page content]
    CLICK --> STACK[Tests full stack<br/>UI API DB]
    subgraph Best_Practices
        WAIT[Wait for elements<br/>not arbitrary timeouts]
        SELECTORS[Use data-testid selectors<br/>resilient to style changes]
        SPARSE[Few E2E tests<br/>critical paths only]
        PARALLEL[Run in parallel<br/>across browsers]
    end
    subgraph Failures
        SCREENSHOT[Capture screenshot on fail]
        VIDEO[Record video on fail]
    end
style STACK fill:#6e40c9,color:#fff
style WAIT fill:#238636,color:#fff
style SCREENSHOT fill:#1f6feb,color:#fff

Common Misconception

'More E2E tests means better coverage.' In practice, E2E tests are slow and fragile; a test pyramid with many unit tests, some integration tests, and few E2E tests is more reliable and faster.

Why It Matters

E2E tests catch integration failures that unit tests cannot: when the frontend calls a correctly implemented API with the wrong parameters, each side passes its own unit tests and the failure surfaces only when the layers run together.
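A minimal sketch of that failure mode, with the browser left out to keep it self-contained. The handler, the payload builder, and the field names are all hypothetical; the point is only that each side looks correct in isolation and the mismatch appears when they are wired together.

```typescript
// Hypothetical request shape expected by the backend.
interface CreateOrderRequest {
  productId: string;
  quantity: number;
}

// "Backend": correctly implemented, validates its input.
function createOrderHandler(body: unknown): { status: number; error?: string } {
  const req = body as Partial<CreateOrderRequest>;
  if (typeof req.productId !== "string" || typeof req.quantity !== "number") {
    return { status: 400, error: "productId and quantity are required" };
  }
  return { status: 201 };
}

// "Frontend": builds the payload with the wrong field names (assumed bug).
function buildOrderPayload(productId: string, quantity: number): Record<string, unknown> {
  return { product_id: productId, qty: quantity };
}

// Unit-level view: each side passes against its own expectations.
const backendAlone = createOrderHandler({ productId: "sku-1", quantity: 2 });
const frontendAlone = buildOrderPayload("sku-1", 2);

// Cross-layer view: only running the two together exposes the mismatch.
const endToEnd = createOrderHandler(buildOrderPayload("sku-1", 2));

console.log(backendAlone.status);               // 201
console.log(Object.keys(frontendAlone).length); // 2
console.log(endToEnd.status);                   // 400
```

In a real suite the cross-layer check would run through the browser against a running application, which is what an E2E test adds over the two isolated checks.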

Common Mistakes

  • Too many E2E tests — slow CI, high maintenance burden, flaky failures breed distrust of the whole suite.
  • No wait strategies — hard-coded sleeps instead of waiting for elements; flaky tests.
  • Running E2E against production — tests create real data, send real emails, charge real cards.
  • Not capturing screenshots/videos on failure — debugging E2E failures without visual context is very slow.
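One way to guard against the third mistake is to check the target URL before any test runs. The sketch below is an assumption about how such a guard could look, not a feature of Playwright or Cypress; `assertSafeBaseUrl` and the hostnames in `PRODUCTION_HOSTS` are hypothetical.

```typescript
// Hypothetical list of hosts that E2E tests must never touch.
const PRODUCTION_HOSTS = ["example.com", "www.example.com"];

// Returns the URL unchanged when it is safe, throws when it points at production.
function assertSafeBaseUrl(baseUrl: string, blockedHosts: string[] = PRODUCTION_HOSTS): string {
  const host = new URL(baseUrl).hostname.toLowerCase();
  if (blockedHosts.some((blocked) => blocked === host)) {
    throw new Error("Refusing to run E2E tests against production host: " + host);
  }
  return baseUrl;
}

// A local or staging target passes through untouched.
const safeTarget = assertSafeBaseUrl("http://localhost:3000");
console.log(safeTarget);

// assertSafeBaseUrl("https://example.com/checkout") would throw before any test creates data.
```

Calling the guard once in the suite's global setup, before the first test, keeps the check out of individual tests.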

Code Examples

✗ Vulnerable
// Flaky E2E — hardcoded sleep instead of waiting:
await page.click('#submit-btn');
await page.waitForTimeout(3000); // Hope 3s is enough
await expect(page.locator('#success-msg')).toBeVisible();
// Fails on slow CI, passes locally — nightmare to debug
✓ Fixed
// Playwright — wait for element, not arbitrary time:
await page.click('#submit-btn');
await expect(page.locator('#success-msg')).toBeVisible({ timeout: 10_000 });
// Waits up to 10s, passes as soon as element appears

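// Structure: test critical paths only
import { test, expect } from '@playwright/test';

test('user can complete checkout', async ({ page }) => {
    await page.goto('/products'); // relative path, resolved against the configured baseURL
    await page.getByTestId('add-to-cart').click(); // matches data-testid by default
    await page.getByTestId('checkout').click();
    await page.fill('#card-number', '4242424242424242'); // a test card number, never a real card
    await page.locator('#pay-now').click();
    // Retries until the confirmation is visible or the assertion timeout expires
    await expect(page.locator('.order-confirmation')).toBeVisible();
});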
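
The practices in the diagram (parallel runs across browsers, screenshots and video on failure) are usually set once in the Playwright config rather than per test. A possible `playwright.config.ts` is sketched below; the `./e2e` directory, the `E2E_BASE_URL` variable, and the localhost fallback are assumptions, not values from this entry.

```typescript
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './e2e',                 // assumed location of the E2E specs
  fullyParallel: true,              // run tests in parallel
  retries: process.env.CI ? 1 : 0,  // one retry on CI only (assumed policy)
  use: {
    // Assumed environment variable; point it at a local or staging stack.
    baseURL: process.env.E2E_BASE_URL ?? 'http://localhost:3000',
    screenshot: 'only-on-failure',  // capture a screenshot when a test fails
    video: 'retain-on-failure',     // keep the recording only for failed tests
    trace: 'on-first-retry',        // record a trace when a test is retried
  },
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    { name: 'firefox', use: { ...devices['Desktop Firefox'] } },
  ],
});
```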

Added 15 Mar 2026
Edited 22 Mar 2026
DEV INTEL Tools & Severity
🟡 Medium ⚙ Fix effort: High
⚡ Quick Fix
Use Playwright or Cypress for E2E tests on your 5 most critical user journeys only. E2E tests are slow and brittle, so keep them minimal and fix the pyramid by moving logic to unit tests.
📦 Applies To
any web
🔗 Prerequisites
🔍 Detection Hints
Test suite dominated by E2E tests; full stack spun up for every test; slow flaky CI pipeline >30 minutes
Auto-detectable: ✗ No
Tools: playwright, cypress, panther, codeception
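
Although no tool flags this automatically, the hint can be approximated by a small script fed with test counts and CI duration gathered by hand or from CI metrics. The sketch below is a rough heuristic under stated assumptions: `checkPyramid` and its types are hypothetical, the 10% E2E share is an illustrative threshold, and only the 30-minute limit comes from the hint above.

```typescript
interface SuiteStats {
  unit: number;
  integration: number;
  e2e: number;
  ciMinutes: number; // wall-clock duration of a typical CI run
}

interface ShapeReport {
  e2eShare: number;   // fraction of all tests that are E2E
  warnings: string[];
}

// maxE2eShare is an assumed threshold; maxCiMinutes mirrors the ">30 minutes" hint.
function checkPyramid(stats: SuiteStats, maxE2eShare = 0.1, maxCiMinutes = 30): ShapeReport {
  const total = stats.unit + stats.integration + stats.e2e;
  const e2eShare = total === 0 ? 0 : stats.e2e / total;
  const warnings: string[] = [];

  if (e2eShare > maxE2eShare) {
    warnings.push("E2E tests make up " + Math.round(e2eShare * 100) + "% of the suite");
  }
  if (stats.e2e > stats.unit) {
    warnings.push("Inverted pyramid: more E2E tests than unit tests");
  }
  if (stats.ciMinutes > maxCiMinutes) {
    warnings.push("CI run exceeds " + maxCiMinutes + " minutes");
  }
  return { e2eShare, warnings };
}

// Made-up numbers for a top-heavy suite.
const topHeavy = checkPyramid({ unit: 40, integration: 10, e2e: 150, ciMinutes: 45 });
console.log(topHeavy.warnings);
```

A non-empty `warnings` list is a prompt for a manual review of the suite, not a verdict; flakiness in particular still has to be read from CI history.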
⚠ Related Problems
🤖 AI Agent
Confidence: Low False Positives: Medium ✗ Manual fix Fix: High Context: File Tests: Update

