
API Rate Limiting

API Design · Intermediate
debt(d5/e5/b5/t5)
d5 Detectability Operational debt — how invisible misuse is to your safety net

Closest to 'specialist tool catches it' (d5). The term's detection_hints list owasp-zap and semgrep as tools that can detect missing rate limiting headers, absence of 429 responses, and lack of per-tier differentiation. These are specialist security/SAST tools, not default linters; without them, missing rate limiting is caught only in runtime testing or code review. Because such tools exist and are listed, d5 is appropriate.

e5 Effort Remediation debt — work required to fix once spotted

Closest to 'touches multiple files / significant refactor in one component' (e5). The quick_fix describes implementing per-API-key rate limits in Redis with proper headers and 429 responses. While conceptually straightforward, this involves adding Redis infrastructure, middleware/gateway configuration, header injection across responses, and potentially differentiating limits per endpoint. This is more than a one-line fix (not e1/e3) but typically stays within the API layer rather than requiring full architectural rework (not e7+).

b5 Burden Structural debt — long-term weight of choosing wrong

Closest to 'persistent productivity tax' (b5). Rate limiting applies to web and API contexts and touches multiple concerns: gateway configuration, Redis infrastructure, per-endpoint limit tuning, header management, and monitoring. Every new endpoint needs rate limit consideration, and the strategy (sliding window, token bucket, per-key vs per-IP) shapes how the API is consumed. It's not quite b7 (it doesn't reshape every change in the system), but it is a persistent tax across API development work streams.

t5 Trap Cognitive debt — how counter-intuitive correct behaviour is

Closest to 'notable trap' (t5). The misconception field explicitly states that developers assume IP-based rate limiting is sufficient, when in reality shared NATs and proxies make IP a poor primary key. Additionally, common_mistakes include non-obvious gotchas: limiting at the application layer instead of the gateway (still consuming resources), using fixed windows that allow boundary bursts, returning wrong status codes (503/200 instead of 429), and applying uniform limits across endpoints with different costs. These are documented gotchas that most developers eventually learn but frequently get wrong initially.


Also Known As

rate limit, throttling, token bucket, sliding window

TL;DR

Controlling how many requests a client can make in a time window — protecting against abuse, ensuring fair usage, and preventing accidental DoS from misbehaving clients.

Explanation

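Common rate limiting algorithms:

  • Fixed Window: a simple counter that resets at each interval boundary; allows bursts around the boundary.
  • Sliding Window: counts requests over a rolling interval, so there is no burst at reset.
  • Token Bucket: allows short bursts up to the bucket capacity and refills at a constant rate.
  • Leaky Bucket: smooths bursts by draining requests at a constant output rate.

Responses should include Retry-After and X-RateLimit-* headers. Key rate limits by API key, user ID, or IP. IP alone is a weak key: NAT and proxies put many clients behind one address, and an abusive client can rotate addresses. Differentiate limits by endpoint cost: a search query is heavier than a single-record GET.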
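To make the token bucket behaviour concrete, here is a minimal in-memory sketch. The class name `TokenBucket`, its parameters, and the injectable `clock` are illustrative assumptions, not part of any particular library; a production limiter would normally keep this state in a shared store.

```python
import time
from typing import Callable


class TokenBucket:
    """Illustrative token bucket: bursts up to `capacity`, refills at `refill_rate`/s."""

    def __init__(self, capacity: float, refill_rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity        # maximum burst size, in tokens
        self.refill_rate = refill_rate  # tokens added per second
        self.clock = clock              # injectable so the sketch is testable
        self.tokens = capacity          # start with a full bucket
        self.updated_at = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then try to spend `cost` tokens."""
        now = self.clock()
        elapsed = now - self.updated_at
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `capacity=2` and `refill_rate=1.0`, two back-to-back requests pass, a third is rejected, and one more request is admitted after a second has elapsed.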

Diagram

flowchart TD
    REQ[API Request] --> CHECK{Rate limit<br/>check}
    CHECK -->|under limit| PROC[Process request]
    CHECK -->|exceeded| BLOCK[429 Too Many Requests<br/>Retry-After header]
    subgraph Algorithms
        FIXED[Fixed Window<br/>100 req per minute]
        SLIDE[Sliding Window<br/>smoother]
        TOKEN[Token Bucket<br/>allows bursts]
        LEAK[Leaky Bucket<br/>constant rate]
    end
    subgraph Limit By
        IP[Per IP]
        USER[Per user/API key]
        ENDPOINT[Per endpoint]
        GLOBAL[Global]
    end
style PROC fill:#238636,color:#fff
style BLOCK fill:#f85149,color:#fff

Watch Out

A fixed-window rate limiter can allow up to double the intended request rate at window boundaries: a client that spends its full quota at the end of one window and again at the start of the next gets 2× quota in a short burst.
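The boundary burst can be reproduced with a small simulation. The sketch below compares a fixed-window counter with a sliding-window log under the same burst; both classes and the helper names are hypothetical and exist only for illustration.

```python
from collections import deque
from typing import Deque, List, Optional


class FixedWindowLimiter:
    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.window_index: Optional[int] = None
        self.count = 0

    def allow(self, now: float) -> bool:
        index = int(now // self.window_seconds)
        if index != self.window_index:  # new window: counter resets
            self.window_index = index
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


class SlidingWindowLogLimiter:
    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.accepted: Deque[float] = deque()  # timestamps of accepted requests

    def allow(self, now: float) -> bool:
        cutoff = now - self.window_seconds
        while self.accepted and self.accepted[0] <= cutoff:
            self.accepted.popleft()  # drop requests older than the window
        if len(self.accepted) < self.limit:
            self.accepted.append(now)
            return True
        return False


def boundary_burst() -> List[float]:
    """100 requests in the last second of minute 0, 100 in the first second of minute 1."""
    return [59.0 + i / 100 for i in range(100)] + [60.0 + i / 100 for i in range(100)]


def count_allowed(limiter, timestamps: List[float]) -> int:
    return sum(1 for t in timestamps if limiter.allow(t))


print(count_allowed(FixedWindowLimiter(100, 60), boundary_burst()))       # 200
print(count_allowed(SlidingWindowLogLimiter(100, 60), boundary_burst()))  # 100
```

With a nominal limit of 100 requests per minute, the fixed window admits all 200 requests within about two seconds, while the sliding log holds the client to 100.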

Common Misconception

Misconception: IP-based rate limiting is sufficient. Behind shared NAT or office proxies, thousands of users can share one IP, so use API key or user ID as the primary rate limit key.
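One way to express that preference order is a small key-selection helper. The function name, its parameters, and the `key:`/`user:`/`ip:` prefixes below are assumptions made for illustration.

```python
from typing import Optional


def rate_limit_key(api_key: Optional[str], user_id: Optional[str], client_ip: str) -> str:
    """Pick the most specific identity available; fall back to IP only when anonymous."""
    if api_key:
        return f"key:{api_key}"
    if user_id:
        return f"user:{user_id}"
    # Unauthenticated traffic: IP is the only identifier left, so limits here
    # may need to be looser to avoid punishing users behind a shared NAT.
    return f"ip:{client_ip}"
```

The prefixes keep the three namespaces separate, so a user ID can never collide with an API key or an address in the limiter's store.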

Why It Matters

Without rate limiting, a single misbehaving client can exhaust all server resources — rate limiting protects availability for all users and is a first-line defence against credential stuffing.

Common Mistakes

  • Not returning the Retry-After header: without it, clients have to guess a backoff schedule.
  • Rate limiting at the application layer instead of at the gateway/nginx level — late-stage limiting still consumes resources.
  • Same rate limit for all endpoints — expensive operations (search, export) need tighter limits than simple GETs.
  • Not returning 429 Too Many Requests — some APIs return 503 or 200, confusing clients about whether to retry.
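The "same limit for all endpoints" mistake can be avoided by charging each request a cost that reflects how expensive it is. The sketch below is one possible shape; the routes and cost values in `ENDPOINT_COST` are invented for illustration.

```python
from typing import Dict, Tuple

# Hypothetical cost table: expensive operations drain the budget faster.
ENDPOINT_COST: Dict[Tuple[str, str], int] = {
    ("GET", "/items"): 1,
    ("GET", "/search"): 5,
    ("POST", "/export"): 25,
}
DEFAULT_COST = 1


def request_cost(method: str, path: str) -> int:
    return ENDPOINT_COST.get((method.upper(), path), DEFAULT_COST)


def spend(budget_remaining: int, method: str, path: str) -> Tuple[bool, int]:
    """Return (allowed, new_budget); a rejected request leaves the budget unchanged."""
    cost = request_cost(method, path)
    if cost > budget_remaining:
        return False, budget_remaining  # the caller should answer 429, not 503 or 200
    return True, budget_remaining - cost
```

A single budget with weighted costs is only one option; separate per-endpoint limits are equally valid.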

Avoid When

  • Do not rate limit without telling the client what the limits are — silent 429s cause clients to retry aggressively and worsen the problem.
  • Avoid applying identical limits to all endpoints — a read endpoint and a payment endpoint have very different abuse profiles.
  • Do not rely solely on application-layer rate limiting for DoS protection — volumetric attacks must be absorbed at the gateway or CDN layer.

When To Use

  • Always return Retry-After and X-RateLimit-* headers so well-behaved clients can implement automatic backoff.
  • Apply rate limits at multiple granularities: per IP for unauthenticated traffic, per API key for authenticated traffic.
  • Use a sliding window or token bucket algorithm for smooth limiting — fixed windows allow bursts at window boundaries.
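On the client side, automatic backoff can start from the Retry-After value. The helper below is a sketch under stated assumptions: the function name and the `default` fallback are hypothetical, and the header lookup assumes the key is spelled exactly `Retry-After`. It handles both forms HTTP allows for this header, a number of seconds or an HTTP date.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping


def retry_delay_seconds(headers: Mapping[str, str], now: datetime,
                        default: float = 1.0) -> float:
    """Seconds to wait before retrying, derived from Retry-After when present."""
    value = headers.get("Retry-After")
    if value is None:
        return default  # no guidance: the caller falls back to its own backoff
    value = value.strip()
    if value.isdigit():
        return float(value)  # delta-seconds form, e.g. "47"
    try:
        retry_at = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return default  # unparseable value: treat as missing
    if retry_at.tzinfo is None:
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    return max(0.0, (retry_at - now).total_seconds())
```

`now` is expected to be a timezone-aware datetime; a real client would typically also add jitter and cap the delay.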

Code Examples

💡 Note
The bad 429 response gives no retry guidance; the client retries immediately and amplifies the load. The fix includes Retry-After and remaining-quota headers so clients back off correctly.
✗ Vulnerable
// No rate limit headers: the client cannot tell when to retry
HTTP/1.1 429 Too Many Requests
Content-Type: application/json

{"error": "Rate limited"}
// The client has to guess a backoff schedule with no server guidance
✓ Fixed
// Rate limit with helpful headers:
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1711270800
Retry-After: 47

{"type": "rate_limit_exceeded", "retry_after": 47}

Added 15 Mar 2026
Edited 31 Mar 2026
DEV INTEL Tools & Severity
🟠 High ⚙ Fix effort: Medium
⚡ Quick Fix
Implement per-API-key rate limits in Redis; return X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset headers; respond 429 with Retry-After on exhaustion
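A rough sketch of the per-API-key counter follows. It assumes a store exposing `incr(name)` and `expire(name, seconds)` called positionally, which matches the shape of the redis-py client; it is exercised here only against an in-memory stand-in, and the key format and function names are hypothetical. For brevity it uses a fixed-window counter, so it inherits the boundary-burst behaviour described under Watch Out.

```python
import time
from typing import Dict, Optional, Protocol, Tuple


class CounterStore(Protocol):
    def incr(self, name: str) -> int: ...
    def expire(self, name: str, seconds: int) -> object: ...


def check_api_key_limit(store: CounterStore, api_key: str, limit: int,
                        window_seconds: int,
                        now: Optional[float] = None) -> Tuple[bool, int, int]:
    """Return (allowed, remaining, reset_epoch) for one request by `api_key`."""
    now = time.time() if now is None else now
    window_index = int(now // window_seconds)
    reset_epoch = (window_index + 1) * window_seconds
    name = f"ratelimit:{api_key}:{window_index}"  # hypothetical key format
    count = store.incr(name)
    if count == 1:
        # First hit in this window: let the counter clean itself up.
        # Note: incr + expire is not atomic; a crash in between leaves a key
        # without a TTL, so a real deployment may prefer a script or pipeline.
        store.expire(name, window_seconds)
    return count <= limit, max(0, limit - count), reset_epoch


class InMemoryStore:
    """Stand-in for a Redis client, for local experiments only."""

    def __init__(self) -> None:
        self.counts: Dict[str, int] = {}

    def incr(self, name: str) -> int:
        self.counts[name] = self.counts.get(name, 0) + 1
        return self.counts[name]

    def expire(self, name: str, seconds: int) -> bool:
        return True  # expiry is not simulated in this stand-in
```

The returned tuple maps directly onto X-RateLimit-Remaining and X-RateLimit-Reset, and a `False` first element is the cue to respond 429 with Retry-After.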
📦 Applies To
any, web, api, laravel, symfony
🔗 Prerequisites
🔍 Detection Hints
API without rate limiting headers; no differentiation between rate limits for different API key tiers; no 429 response
Auto-detectable: ✓ Yes (owasp-zap, semgrep)
⚠ Related Problems
🤖 AI Agent
Confidence: Medium False Positives: Medium ✗ Manual fix Fix: Medium Context: File Tests: Update
CWE-770

