
Using AI APIs in PHP

ai_ml PHP 8.0+ Intermediate
debt(d5/e7/b7/t7)
d5 Detectability Operational debt — how invisible misuse is to your safety net

Closest to 'specialist tool catches it' (d5). Semgrep can detect hardcoded API keys and missing retry logic via code patterns; TruffleHog catches leaked secrets. However, architectural issues like synchronous LLM calls in web handlers or missing token budgets require more nuanced detection — some patterns are caught by these tools, others need code review. Splitting the difference at d5.

e7 Effort Remediation debt — work required to fix once spotted

Closest to 'cross-cutting refactor across the codebase' (e7). While the quick_fix mentions using an SDK with env vars (a simple change), the real misuses — synchronous calls blocking web requests, missing queue infrastructure, no caching layer, no fallback/degradation strategy, no token budget tracking — require introducing queue workers, caching layers, circuit breakers, and cost monitoring across the application. Moving from synchronous inline AI calls to a proper async queue-based architecture touches controllers, job classes, response handlers, and frontend polling/webhook infrastructure.

b7 Burden Structural debt — long-term weight of choosing wrong

Closest to 'strong gravitational pull' (b7). AI integration architecture choices (sync vs async, caching strategy, cost management, fallback behavior) shape how features are built across web, CLI, and queue-worker contexts. Once you pick an approach to AI integration, every AI-powered feature in the system follows that pattern. The choice of queue infrastructure, caching strategy, and error handling for AI calls becomes load-bearing across the system. Not quite b9 (it doesn't define the entire system's shape), but it strongly influences how AI features are developed going forward.

t7 Trap Cognitive debt — how counter-intuitive correct behaviour is

Closest to 'serious trap — contradicts how a similar concept works elsewhere' (t7). The misconception is explicit: developers assume AI API calls work like any other REST API call — make a synchronous request in the web handler, get a response, render it. This mental model works for most APIs (sub-200ms responses) but fails catastrophically for LLM calls (2-30 seconds). A competent PHP developer with no AI integration experience will almost certainly build it synchronously first, causing timeouts, blocked workers, and poor UX. The 'obvious' approach (treat it like any other HTTP API call) leads directly to production problems.

About DEBT scoring →

Also Known As

OpenAI PHP Anthropic PHP LLM integration AI API

TL;DR

Integrating LLM APIs (OpenAI, Anthropic, Gemini) into PHP applications — for text generation, classification, extraction, and embedding-based search.

Explanation

PHP applications call AI APIs over HTTP — the openai-php/client and anthropic-sdk-php packages provide type-safe wrappers. Key patterns: text generation (completion), structured extraction (function calling / JSON mode), classification, embedding generation, and streaming responses. Key operational concerns: rate limiting, retry logic, cost management (token counting), timeout handling, and keeping slow AI calls out of the web request. Because PHP's execution model is synchronous, long-running AI calls should be offloaded to job queues.
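The retry logic mentioned above can be sketched as a small wrapper. This is a hedged sketch, not tied to any particular SDK: `RateLimitException`, `retryWithBackoff`, and the backoff constants are illustrative assumptions standing in for whatever rate-limit error your client surfaces (typically an HTTP 429).

```php
<?php
// Illustrative exception type — real SDKs throw their own rate-limit
// errors (usually an HTTP 429 surfaced as an exception).
class RateLimitException extends RuntimeException {}

/**
 * Run $call, retrying on rate-limit errors with exponential backoff.
 * Returns the call's result, or rethrows once $maxAttempts is exhausted.
 */
function retryWithBackoff(callable $call, int $maxAttempts = 3, int $baseDelayMs = 500)
{
    for ($attempt = 1; ; $attempt++) {
        try {
            return $call();
        } catch (RateLimitException $e) {
            if ($attempt >= $maxAttempts) {
                throw $e; // budget exhausted: surface the error
            }
            // 500ms, 1000ms, 2000ms, ... doubling the wait each attempt
            usleep($baseDelayMs * (2 ** ($attempt - 1)) * 1000);
        }
    }
}
```

Wrapping an SDK call would then look like `retryWithBackoff(fn () => $client->chat()->create($params));`.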

Diagram

flowchart LR
    PHP[PHP Application] --> SDK[Anthropic OpenAI SDK<br/>or HTTP client]
    SDK -->|API call| LLM2[LLM API]
    LLM2 -->|response| PARSE[Parse response]
    subgraph Patterns
        SYNC2[Synchronous<br/>await full response]
        STREAM[Streaming<br/>chunk by chunk output]
        TOOL[Tool use<br/>function calling]
        RAG3[RAG<br/>inject context from DB]
    end
    subgraph Caching
        PROMPT_CACHE[Cache identical prompts<br/>save cost and latency]
        SEMANTIC[Semantic cache<br/>similar prompts hit cache]
    end
style SDK fill:#1f6feb,color:#fff
style STREAM fill:#238636,color:#fff
style PROMPT_CACHE fill:#d29922,color:#fff

Watch Out

PHP-FPM has a fixed worker pool — a single LLM call blocking for 10 seconds under moderate traffic can exhaust all workers and take down the entire application.

Common Misconception

Developers assume AI API calls work like any other REST call and can run synchronously in the web request. In reality LLM calls take 1-30 seconds; the web request should queue a job and return immediately, with the client polling or receiving a webhook when the result is ready.

Why It Matters

PHP developers need practical patterns for AI integration — the architecture choices (sync vs async, caching, cost management) determine whether an AI feature is production-viable.

Common Mistakes

  • Synchronous LLM calls in web request handlers — 10-second AI calls time out or block workers.
  • No token budget — uncapped prompts with large context windows generate unexpected API costs.
  • Not caching identical prompts — same question asked repeatedly incurs cost each time.
  • No fallback when API is unavailable — AI features should degrade gracefully, not cause 500 errors.
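The caching mistake above has a direct mitigation: key responses by a hash of the full request. A minimal in-memory sketch follows; `PromptCache` is an illustrative name, and production code would back the store with Redis or APCu plus a TTL rather than a plain array.

```php
<?php
// Minimal prompt cache: identical (model, messages) pairs hit the cache.
// A plain array is used for illustration only; back this with Redis or
// APCu and an expiry in production.
class PromptCache
{
    private array $store = [];

    public function key(string $model, array $messages): string
    {
        // Hash the full request so any change in prompt or model misses.
        return hash('sha256', $model . json_encode($messages));
    }

    public function remember(string $model, array $messages, callable $llmCall): string
    {
        $key = $this->key($model, $messages);
        if (!isset($this->store[$key])) {
            $this->store[$key] = $llmCall(); // only pay for the first call
        }
        return $this->store[$key];
    }
}
```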

Avoid When

  • Do not call LLM APIs synchronously inside a web request handler — it ties up PHP-FPM workers for the full response duration.
  • Avoid sending sensitive PII or credentials in prompts — treat every API call as potentially logged by the provider.
  • Do not trust LLM-generated code or SQL without review — use it as a draft, not as production-ready output.

When To Use

  • Dispatch LLM API calls to a queue worker — responses take 2–30 seconds and must not block a synchronous web request.
  • Cache LLM responses for identical or near-identical prompts to reduce cost and latency on repeated queries.
  • Validate and sanitise LLM output before using it in SQL queries, HTML output, or file operations.
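The validation point above can be sketched for the two common cases: escaping model text before it reaches HTML, and parsing structured output instead of trusting it. The function names here are illustrative.

```php
<?php
// LLM output is untrusted text: escape it before it reaches HTML, and
// decode/validate structured output rather than passing it through.
function renderLlmText(string $llmOutput): string
{
    return htmlspecialchars($llmOutput, ENT_QUOTES, 'UTF-8');
}

function parseLlmJson(string $llmOutput): array
{
    $data = json_decode($llmOutput, true);
    if (!is_array($data)) {
        // Models sometimes wrap JSON in prose; fail loudly instead of
        // passing garbage downstream.
        throw new InvalidArgumentException('LLM did not return valid JSON');
    }
    return $data;
}
```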

Code Examples

💡 Note
The bad example calls the LLM API synchronously inside a controller, blocking the worker; the fix dispatches to a queue job and returns immediately with a job ID for polling.
✗ Vulnerable
// Synchronous in web request — blocks worker for 10+ seconds:
class ArticleController {
    public function summarise(Request $req): Response {
        $summary = $this->openai->chat()->create([
            'model' => 'gpt-4',
            'messages' => [['role' => 'user', 'content' => $req->article]],
        ]); // Blocks for 5-15 seconds — worker unavailable for other requests
        return response()->json(['summary' => $summary->choices[0]->message->content]);
    }
}
✓ Fixed
// Queue-based — returns immediately, processes async:
class ArticleController {
    public function summarise(Request $req): Response {
        $jobId = Str::uuid();
        SummariseArticleJob::dispatch($req->article, $jobId);
        return response()->json(['job_id' => $jobId, 'status' => 'processing']);
    }
}
// Client polls: GET /api/jobs/{jobId} or uses websocket for push notification
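For completeness, the dispatched job might look like the sketch below. It is deliberately framework-free: in Laravel the class would implement `ShouldQueue`, and the `$results` store passed to `handle()` stands in for Redis or a database table keyed by job ID.

```php
<?php
// Framework-free sketch of the queued job: run the slow LLM call off the
// web request, then persist the result under the job ID for polling.
class SummariseArticleJob
{
    public function __construct(
        private string $article,
        private string $jobId,
    ) {}

    /**
     * @param callable    $llmCall takes article text, returns a summary
     * @param ArrayAccess $results keyed by job ID (Redis/DB in production)
     */
    public function handle(callable $llmCall, ArrayAccess $results): void
    {
        try {
            $results[$this->jobId] = [
                'status'  => 'done',
                'summary' => $llmCall($this->article),
            ];
        } catch (Throwable $e) {
            // Degrade gracefully: record the failure so the poller sees
            // a clean error instead of a 500.
            $results[$this->jobId] = ['status' => 'failed', 'error' => $e->getMessage()];
        }
    }
}
```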

Tags


Added 15 Mar 2026
Edited 31 Mar 2026
DEV INTEL Tools & Severity
🟡 Medium ⚙ Fix effort: Medium
⚡ Quick Fix
Use the official SDK (anthropic-sdk-php or openai-php) with your API key in environment variables — never hardcode keys, and implement token usage logging from day one
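The environment-variable half of the quick fix, sketched. The variable name follows common convention and `apiKeyFromEnv` is an illustrative helper; the commented factory call reflects openai-php/client's documented entry point.

```php
<?php
// Read the key from the environment — never from source control.
function apiKeyFromEnv(string $var = 'OPENAI_API_KEY'): string
{
    $key = getenv($var);
    if ($key === false || $key === '') {
        // Fail fast at boot rather than on the first API call.
        throw new RuntimeException("Missing required env var: {$var}");
    }
    return $key;
}

// With openai-php/client the key would feed the client factory:
//   $client = OpenAI::client(apiKeyFromEnv());
```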
📦 Applies To
PHP 8.0+ web cli queue-worker laravel symfony
🔗 Prerequisites
🔍 Detection Hints
Hardcoded API key in PHP source; no token usage tracking; no retry logic on API rate limits; synchronous AI calls blocking web requests
Auto-detectable: ✓ Yes semgrep trufflehog
⚠ Related Problems
🤖 AI Agent
Confidence: Low · False Positives: Medium · ✗ Manual fix · Fix: High · Context: File · Tests: Update
