AI Gateway
Route & Guard
Why Not Just Call the LLM Directly?
In a prototype, you send the user's question straight to an LLM with your retrieved context. In production, that's a recipe for runaway costs, leaked PII, prompt-injection attacks, and repeated work on questions the system has already answered.
The AI Gateway is the control plane between your users and the LLM. It classifies, routes, guards, caches, and tracks every request.
Key Concepts
Why LangGraph.js?
Traditional if/else chains work for simple routing, but AI gateways have conditional, branching logic that's hard to express linearly:
LangGraph.js models this as a state graph: each step is a node, and conditions are edges between nodes, so the branching logic stays explicit and inspectable.
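To make the idea concrete, here is a hand-rolled sketch of a state graph (deliberately self-contained, not the LangGraph.js API itself; the node names and routing rule are illustrative assumptions):

```typescript
// Hand-rolled state graph: nodes transform state, conditional edges
// pick the next node. Node names and routing logic are illustrative.
type State = { query: string; route?: string; response?: string };
type NodeFn = (s: State) => State;

const nodes: Record<string, NodeFn> = {
  classify: (s) => ({
    ...s,
    route: /^(hi|hello)\b/i.test(s.query) ? "greeting" : "search",
  }),
  greeting: (s) => ({ ...s, response: "Hello! How can I help?" }),
  search: (s) => ({ ...s, response: `Searching for: ${s.query}` }),
};

// Conditional edges: choose the next node from the accumulated state.
const edges: Record<string, (s: State) => string | null> = {
  classify: (s) => s.route ?? null,
  greeting: () => null, // terminal node
  search: () => null,   // terminal node
};

function run(entry: string, state: State): State {
  let current: string | null = entry;
  while (current) {
    state = nodes[current](state);
    current = edges[current](state);
  }
  return state;
}
```

Running `run("classify", { query: "hello there" })` takes the greeting branch; anything else falls through to search. LangGraph.js expresses the same shape with `addNode` and conditional edges on a `StateGraph`, and adds persistence and streaming on top.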
State Graph Architecture
```
            ┌──→ Cache Hit ──→ Return Cached ──→ Track
            │
Query ──→ Classify ──→ Route ──→ Guardrails ──→ LLM ──→ Format ──→ Track
            │             │            │
            │      simple/complex   PII check
            │      greeting/search  input validation
            │                       output check
            └──→ Greeting ──→ Direct Response ──→ Track
```

GatewayState
The state object flows through every node, accumulating data:
```typescript
interface GatewayState {
  query: string;
  queryType: "greeting" | "simple" | "complex" | "sensitive";
  context: RetrievedChunk[];
  response: string;
  model: string;
  cached: boolean;
  cost: { inputTokens: number; outputTokens: number; usd: number };
  userId: string;
  guardrailFlags: string[];
}
```

Each node reads what it needs, writes what it produces. The graph framework handles the plumbing.
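As an example of that plumbing, here is a sketch of a format node (hypothetical logic; `context` simplified to strings) that returns only the keys it changes, plus the merge step a graph framework performs for you:

```typescript
// State slice the format node cares about; field names follow the
// GatewayState interface above, with context simplified to strings.
interface FormatState {
  query: string;
  context: string[];
  response: string;
}

// A node returns a partial update rather than mutating the whole state.
function formatNode(state: FormatState): Partial<FormatState> {
  const sources = state.context.length
    ? `\n\nSources: ${state.context.join(", ")}`
    : "";
  return { response: state.response + sources };
}

// The framework's plumbing: merge the partial update into the state.
function merge(state: FormatState, update: Partial<FormatState>): FormatState {
  return { ...state, ...update };
}
```

Because nodes only declare what they produce, they stay small and independently testable.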
Query Classification
The classify node analyzes the incoming query and determines its type (greeting, simple, complex, or sensitive) and which route and model should handle it. This is often done with a small, fast LLM call or even a rule-based classifier for common patterns.
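A rule-based version might look like this (the keywords, word-count threshold, and model labels are illustrative assumptions; a real gateway would fall back to a small LLM call when no rule matches):

```typescript
type QueryType = "greeting" | "simple" | "complex" | "sensitive";

// Rule-based classifier for common patterns. Thresholds, keywords,
// and model labels are assumptions for illustration.
function classify(query: string): { queryType: QueryType; model: string } {
  const q = query.trim().toLowerCase();
  if (/^(hi|hello|hey|thanks|thank you)\b/.test(q)) {
    return { queryType: "greeting", model: "none" }; // answered without an LLM
  }
  if (/\b(ssn|social security|salary|medical)\b/.test(q)) {
    return { queryType: "sensitive", model: "large" }; // stricter guardrail path
  }
  const complex =
    q.split(/\s+/).length > 15 || /\b(compare|analyze|why)\b/.test(q);
  return complex
    ? { queryType: "complex", model: "large" }
    : { queryType: "simple", model: "small" };
}
```

Rules handle the cheap, obvious cases instantly; only ambiguous queries need to pay for a classification LLM call.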
Guardrails
Guardrails are graph nodes that inspect inputs and outputs:
Input guardrails:
- PII detection and redaction (SSNs, emails, phone numbers)
- Prompt-injection screening
- Input validation

Output guardrails:
- Checking the response for leaked PII or policy violations before it reaches the user
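A minimal input guardrail node might redact PII and flag what it found (these regexes are simplified assumptions, not production-grade detectors):

```typescript
// Input guardrail sketch: detect and redact common PII patterns.
// The patterns are deliberately simplified for illustration.
const PII_PATTERNS: Record<string, RegExp> = {
  ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
  email: /\b[\w.+-]+@[\w-]+\.[\w.-]+\b/g,
  phone: /\b\d{3}[-.]\d{3}[-.]\d{4}\b/g,
};

function piiGuardrail(query: string): { query: string; guardrailFlags: string[] } {
  const flags: string[] = [];
  let redacted = query;
  for (const [name, pattern] of Object.entries(PII_PATTERNS)) {
    const next = redacted.replace(pattern, `[REDACTED ${name.toUpperCase()}]`);
    if (next !== redacted) flags.push(`pii:${name}`);
    redacted = next;
  }
  return { query: redacted, guardrailFlags: flags };
}
```

The node writes `query` (redacted) and `guardrailFlags` back into the state, so downstream nodes never see the raw PII and the tracking node can log what was caught.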
Semantic Caching
If someone asks "What's Acme's deal status?" and another rep asked the same thing 5 minutes ago, why run the full pipeline again? Semantic caching matches queries by meaning (not exact string) and returns cached responses when similarity exceeds a threshold.
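The cache logic is a nearest-neighbor lookup over embeddings. In this sketch, `embed` is a toy bag-of-words stand-in for a real embedding model so the example stays self-contained; the vocabulary and threshold are assumptions:

```typescript
// Semantic cache sketch: match queries by cosine similarity of vectors.
// `embed` is a toy stand-in for a real embedding model.
type Entry = { vector: number[]; response: string };

const VOCAB = ["acme", "deal", "status", "pipeline", "forecast"];

function embed(text: string): number[] {
  const words = text.toLowerCase().split(/\W+/);
  return VOCAB.map((v) => words.filter((w) => w === v).length);
}

function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const na = Math.sqrt(a.reduce((s, x) => s + x * x, 0));
  const nb = Math.sqrt(b.reduce((s, x) => s + x * x, 0));
  return na && nb ? dot / (na * nb) : 0;
}

class SemanticCache {
  private entries: Entry[] = [];
  constructor(private threshold = 0.9) {}

  get(query: string): string | null {
    const v = embed(query);
    for (const e of this.entries) {
      if (cosine(v, e.vector) >= this.threshold) return e.response;
    }
    return null;
  }

  set(query: string, response: string): void {
    this.entries.push({ vector: embed(query), response });
  }
}
```

The threshold is the key tuning knob: too low and users get stale or wrong answers for different questions; too high and near-identical rephrasings miss the cache.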
Cost Controls
Per-user budgets prevent runaway spending: track each user's token consumption, and throttle or reject requests once their budget for the period is exhausted.
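A minimal budget tracker might look like this (the limit, the in-memory store, and the check-then-record split are illustrative assumptions; production systems would persist usage and reset it per period):

```typescript
// Per-user token budget sketch. The limit and in-memory Map are
// illustrative; real systems persist usage and reset per period.
class BudgetTracker {
  private used = new Map<string, number>();
  constructor(private limitTokens: number) {}

  // Check before running the request: is the user still under budget?
  allow(userId: string): boolean {
    return (this.used.get(userId) ?? 0) < this.limitTokens;
  }

  // Record usage after the LLM call completes.
  record(userId: string, inputTokens: number, outputTokens: number): void {
    const total = (this.used.get(userId) ?? 0) + inputTokens + outputTokens;
    this.used.set(userId, total);
  }
}
```

The gateway calls `allow` before the LLM node runs and `record` in the tracking node, so the budget check is just another edge in the graph.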
What You'll Build
Glossary
| Term | Meaning |
|---|---|
| State graph | A directed graph where data flows through nodes via edges |
| LangGraph.js | Framework for building stateful AI workflows as graphs |
| Node | A processing step in the graph (classify, route, guard, etc.) |
| Conditional edge | An edge that routes to different nodes based on state |
| PII | Personally Identifiable Information (SSN, emails, phone numbers) |
| Prompt injection | Malicious input trying to override system instructions |
| Semantic cache | Cache that matches by meaning similarity, not exact strings |
| Token budget | Maximum tokens a user can consume per time period |
This is chapter 4 of AI Sales Companion.
Get the full hands-on course for $100 and build the complete system. Your projects become your portfolio.
View course details