Using AI APIs in PHP
Also Known As
TL;DR
Explanation
PHP applications call AI APIs over HTTP; the openai-php/client and anthropic-sdk-php packages provide type-safe wrappers. Key patterns: text generation (completion), structured extraction (function calling/JSON mode), classification, embedding generation, and streaming responses. Key operational concerns: rate limiting, retry logic, cost management (token counting), and timeout handling. Because PHP's request model is synchronous, slow AI calls (often 2-30 seconds) must not block the web request; offload them to job queues.
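A minimal sketch of a single call via openai-php/client, assuming the package is installed alongside Guzzle. The key comes from an environment variable, the 30-second timeout and the model name are illustrative choices, not recommendations:

// Minimal sketch using openai-php/client: key from env, explicit HTTP timeout.
// Assumes `composer require openai-php/client guzzlehttp/guzzle`.
$client = OpenAI::factory()
    ->withApiKey(getenv('OPENAI_API_KEY'))  // never hardcode the key
    ->withHttpClient(new \GuzzleHttp\Client(['timeout' => 30]))  // cap the wait; LLM calls are slow
    ->make();

$result = $client->chat()->create([
    'model' => 'gpt-4o-mini',  // illustrative model name
    'messages' => [['role' => 'user', 'content' => 'Summarise this article in two sentences.']],
]);

echo $result->choices[0]->message->content;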
Diagram
flowchart LR
  PHP[PHP Application] --> SDK[Anthropic/OpenAI SDK<br/>or HTTP client]
  SDK -->|API call| LLM[LLM API]
  LLM -->|response| PARSE[Parse response]
  subgraph Patterns
    SYNC[Synchronous<br/>await full response]
    STREAM[Streaming<br/>chunk-by-chunk output]
    TOOL[Tool use<br/>function calling]
    RAG[RAG<br/>inject context from DB]
  end
  subgraph Caching
    PROMPT_CACHE[Cache identical prompts<br/>save cost and latency]
    SEMANTIC[Semantic cache<br/>similar prompts hit cache]
  end
  style SDK fill:#1f6feb,color:#fff
  style STREAM fill:#238636,color:#fff
  style PROMPT_CACHE fill:#d29922,color:#fff
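Streaming (highlighted in the diagram) trades one long wait for incremental output. A minimal sketch, assuming openai-php/client's streamed chat interface and reusing the $client built earlier; $prompt is a placeholder, and whether chunks actually reach the browser immediately depends on your server's output buffering:

// Streaming sketch: emit tokens as they arrive instead of waiting for the full response.
$stream = $client->chat()->createStreamed([
    'model' => 'gpt-4o-mini',
    'messages' => [['role' => 'user', 'content' => $prompt]],
]);

foreach ($stream as $response) {
    echo $response->choices[0]->delta->content ?? '';  // partial token(s); may be null
    flush();  // push the chunk to the client (server config permitting)
}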
Watch Out
Common Misconception
That an LLM API call behaves like any other REST API call: make a synchronous request in the web handler, get a response, render it. That mental model works for typical sub-200ms APIs and fails for LLM calls that take 2-30 seconds.
Why It Matters
A competent PHP developer with no AI integration experience will almost certainly build the first version synchronously, which leads directly to production problems: request timeouts, blocked PHP-FPM workers, poor UX, and surprise API bills once usage grows.
Common Mistakes
- Synchronous LLM calls in web request handlers — 10-second AI calls time out or block workers.
- No token budget — uncapped prompts with large context windows generate unexpected API costs.
- Not caching identical prompts: the same question asked repeatedly incurs full cost each time (see the caching sketch after this list).
- No fallback when API is unavailable — AI features should degrade gracefully, not cause 500 errors.
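A minimal caching sketch, assuming Laravel's Cache facade: hash the model name plus the full prompt into a cache key and reuse the stored completion for repeat questions. The key scheme and one-day TTL are assumptions to tune:

// Cache identical prompts: a hash of model + prompt is the cache key.
use Illuminate\Support\Facades\Cache;

function cachedCompletion($client, string $prompt): string {
    $key = 'llm:' . hash('sha256', 'gpt-4o-mini|' . $prompt);

    return Cache::remember($key, now()->addDay(), function () use ($client, $prompt) {
        $result = $client->chat()->create([
            'model' => 'gpt-4o-mini',
            'messages' => [['role' => 'user', 'content' => $prompt]],
        ]);
        return $result->choices[0]->message->content;
    });
}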
Avoid When
- Do not call LLM APIs synchronously inside a web request handler — it ties up PHP-FPM workers for the full response duration.
- Avoid sending sensitive PII or credentials in prompts; treat every API call as potentially logged by the provider (see the redaction sketch after this list).
- Do not trust LLM-generated code or SQL without review — use it as a draft, not as production-ready output.
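A crude redaction pass before the prompt leaves your infrastructure, as referenced above. These regexes are illustrative, not exhaustive; real PII detection needs a dedicated tool:

// Illustrative redaction sketch: strip obvious emails and card-like numbers
// before the text is sent to a third-party API. Not exhaustive.
function redactForPrompt(string $text): string {
    $patterns = [
        '/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/' => '[EMAIL]',
        '/\b(?:\d[ -]?){13,16}\b/'                          => '[CARD]',
    ];
    return preg_replace(array_keys($patterns), array_values($patterns), $text);
}

$safe = redactForPrompt($userText);  // $userText: hypothetical untrusted input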
When To Use
- Dispatch LLM API calls to a queue worker — responses take 2–30 seconds and must not block a synchronous web request.
- Cache LLM responses for identical or near-identical prompts to reduce cost and latency on repeated queries.
- Validate and sanitise LLM output before using it in SQL queries, HTML output, or file operations (see the sketch after this list).
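A sketch of that validation step, assuming the model was instructed to return JSON; the expected 'summary' field is a hypothetical shape for illustration. Decode, check the structure, and refuse free-form output rather than trusting it:

// Validate LLM output before using it: expect strict JSON, reject anything else.
function parseLlmJson(string $raw): array {
    $data = json_decode($raw, true);
    if (!is_array($data) || !isset($data['summary']) || !is_string($data['summary'])) {
        throw new \RuntimeException('LLM returned unexpected output; refusing to use it');
    }
    // Escape before rendering; never interpolate LLM text into SQL or HTML raw.
    $data['summary'] = htmlspecialchars($data['summary'], ENT_QUOTES);
    return $data;
}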
Code Examples
// Synchronous in web request: blocks a PHP-FPM worker for the full call.
class ArticleController {
    public function summarise(Request $req): JsonResponse {
        // openai-php/client exposes resources as methods: chat()->create(...)
        $result = $this->openai->chat()->create([
            'model' => 'gpt-4',
            'messages' => [['role' => 'user', 'content' => $req->input('article')]],
        ]); // Blocks for 5-15 seconds; the worker is unavailable for other requests
        return response()->json(['summary' => $result->choices[0]->message->content]);
    }
}
// Queue-based: returns immediately, the job processes the call asynchronously.
class ArticleController {
    public function summarise(Request $req): JsonResponse {
        $jobId = (string) Str::uuid();
        SummariseArticleJob::dispatch($req->input('article'), $jobId);
        return response()->json(['job_id' => $jobId, 'status' => 'processing']);
    }
}
// Client polls GET /api/jobs/{jobId} or receives a push over a websocket
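A sketch of what SummariseArticleJob might look like, assuming Laravel's queue system: bounded retries with backoff, the result stored in the cache under the job id for the polling endpoint, and a recorded failure instead of a 500. The retry counts, TTLs, and model name are assumptions:

// Hypothetical job sketch: retries with backoff, result stored for polling.
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Support\Facades\Cache;

class SummariseArticleJob implements ShouldQueue {
    use Dispatchable, InteractsWithQueue, Queueable;

    public int $tries = 3;                 // bounded retries on transient API errors
    public array $backoff = [10, 30, 60];  // seconds to wait between attempts

    public function __construct(
        private string $article,
        private string $jobId,
    ) {}

    public function handle(): void {
        $client = OpenAI::client(getenv('OPENAI_API_KEY'));
        $result = $client->chat()->create([
            'model' => 'gpt-4o-mini',
            'messages' => [['role' => 'user', 'content' => 'Summarise: ' . $this->article]],
        ]);
        Cache::put('job:' . $this->jobId, [
            'status'  => 'done',
            'summary' => $result->choices[0]->message->content,
        ], now()->addHour());
    }

    public function failed(\Throwable $e): void {
        // Degrade gracefully: record the failure so the poller can show a fallback.
        Cache::put('job:' . $this->jobId, ['status' => 'failed'], now()->addHour());
    }
}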