AI Gateway
Guardrails, Routing & LangGraph.js
Why an AI Gateway?
In Module 3, you built a retrieval system that finds the right HR documents. But between the retrieval system and the employee's screen sits a critical layer: the AI Gateway. This is the control plane that decides:

- which queries get blocked or redirected (guardrails)
- which data each user is allowed to see (role-based access)
- whether sensitive information must be masked (PII detection)
- which model handles the query (routing)
- whether a cached answer can be returned (semantic caching)
For a generic chatbot, you might skip most of this. For an HR assistant, every one of these is mandatory. The gateway is what makes the system trustworthy enough for employees to rely on.
HR-Specific Guardrails
This is where an HR assistant fundamentally differs from other RAG systems. The guardrails aren't optional — they're the reason the system can be deployed at all.
No Salary Disclosure
An employee asks: "How much does Sarah in Engineering make?" The system has access to compensation data (or could infer it from org chart seniority). Without guardrails, it might answer. With them:
```
→ Detect: query references specific employee compensation
→ Block:  "I can help with compensation bands and structures.
           For specific salary information, please contact your
           HR Business Partner."
```

This isn't a bug: it's the system correctly protecting confidential information.
No Legal Advice
An employee asks: "Can they fire me for taking FMLA leave? That's illegal, right?" The system has the FMLA policy and could explain protections. But interpreting whether a specific situation constitutes illegal retaliation is legal advice.
```
→ Detect:   query asks for legal interpretation
→ Redirect: "Our FMLA policy protects eligible employees from
             retaliation for taking qualified leave. For guidance on your
             specific situation, please consult with Legal or contact
             the ethics hotline at 1-800-555-0188."
```

The system shares the policy but draws the line at legal interpretation.
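Unlike the salary guardrail, this one redirects rather than blocks: it returns a canned answer that shares the policy but declines interpretation. A sketch, with trigger phrases that are assumptions for illustration:

```typescript
// Hypothetical phrases that signal a request for legal interpretation.
const LEGAL_PATTERNS = [
  /\b(illegal|sue|lawsuit|retaliation|against the law)\b/i,
  /can they (fire|terminate|demote) me/i,
];

// Returns a redirect response, or null if the query is not a legal question.
function redirectLegalQuery(query: string): string | null {
  if (!LEGAL_PATTERNS.some((p) => p.test(query))) return null;
  return (
    "Our FMLA policy protects eligible employees from retaliation " +
    "for taking qualified leave. For guidance on your specific " +
    "situation, please consult with Legal or contact the ethics " +
    "hotline at 1-800-555-0188."
  );
}
```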
Confidentiality Enforcement
Not all employees should see all data. The gateway enforces role-based access:
| User Role | Access Level |
|---|---|
| Employee | Public + Internal (handbook, policies, benefits, org chart) |
| Manager | Above + team PTO balances, team org data |
| HR Team | All data including confidential (all PTO, all org) |
| HR Admin | All data including restricted (compensation, investigations) |
This is enforced at the gateway level, not the UI level. Even if someone crafts a clever prompt, the gateway filters results before they reach the LLM.
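Gateway-level filtering could look like the sketch below, assuming each indexed document carries an access-level tag. The `Doc` shape and clearance table are illustrative, mirroring the roles above (team-scoped manager access is omitted for brevity).

```typescript
type Role = "employee" | "manager" | "hr" | "hr_admin";
type AccessLevel = "public" | "internal" | "confidential" | "restricted";

// Which access levels each role may see.
const ROLE_CLEARANCE: Record<Role, AccessLevel[]> = {
  employee: ["public", "internal"],
  manager: ["public", "internal"], // plus team-scoped data, omitted here
  hr: ["public", "internal", "confidential"],
  hr_admin: ["public", "internal", "confidential", "restricted"],
};

interface Doc {
  id: string;
  accessLevel: AccessLevel;
}

// Filter BEFORE results reach the LLM, so no clever prompt can leak them.
function filterByRole(docs: Doc[], role: Role): Doc[] {
  const allowed = new Set(ROLE_CLEARANCE[role]);
  return docs.filter((d) => allowed.has(d.accessLevel));
}
```

Because the filter runs on retrieved results rather than on the rendered UI, a restricted document never enters the model's context at all.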
PII Detection
Queries containing Social Security numbers, bank account numbers, or other PII are intercepted. The PII is masked before the query proceeds to the LLM, and the employee is warned about sharing sensitive information in chat.
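A minimal masking sketch covering only the two cases named above (SSNs and account-number-like digit runs); real PII detectors cover far more, but the intercept-and-mask mechanic is the same.

```typescript
function maskPII(query: string): { masked: string; found: boolean } {
  const patterns = [
    /\b\d{3}-\d{2}-\d{4}\b/g, // SSN format, e.g. 123-45-6789
    /\b\d{9,17}\b/g,          // bare digit runs that look like account numbers
  ];
  let masked = query;
  for (const p of patterns) masked = masked.replace(p, "[REDACTED]");
  return { masked, found: masked !== query };
}
```

When `found` is true, the gateway forwards the masked query and warns the employee about sharing sensitive data in chat.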
LangGraph.js Architecture
We implement the gateway as a LangGraph.js state graph — a directed graph where each node is a function and edges are conditional transitions.
Why LangGraph?
Instead of nested if/else chains that become unmaintainable:
```typescript
// BAD: Spaghetti orchestration
if (isCached(query)) return cached;
if (hasPII(query)) query = maskPII(query);
if (isSensitive(query)) { /* special handling */ }
if (isSimple(query)) model = "haiku";
// ... 200 more lines of branching logic
```

LangGraph makes the flow visible and composable:
```
classify → cache_check → [hit]  → return
                       → [miss] → guardrails → route → llm → format → track
```

Each node is a pure function. Edges are conditional. You can trace exactly which path a query took. Adding a new step (say, sentiment analysis) means adding one node and two edges; nothing else changes.
The State Object
Every node reads and updates a typed state:
```typescript
interface GatewayState {
  query: string;
  category: "policy" | "benefits" | "org" | "leave" | "compliance" | "general";
  complexity: "simple" | "moderate" | "complex";
  sensitivity: "normal" | "sensitive" | "restricted";
  userRole: "employee" | "manager" | "hr" | "hr_admin";
  retrievedContext: SearchResult[];
  cachedResponse?: string;
  guardrailFlags: string[];
  selectedModel: string;
  response: string;
  citations: Citation[];
  confidence: "high" | "medium" | "low";
  tokensUsed: number;
  latencyMs: number;
}
```

This state is the single source of truth. The classify node sets `category` and `complexity`. The guardrails node may add to `guardrailFlags`. The route node reads `complexity` and `sensitivity` to set `selectedModel`. Every decision is traceable.
Query Classification
The classify node categorizes each query:
| Category | Example Query | Routing Implication |
|---|---|---|
| Policy | "What's the remote work policy?" | Search policies + handbook |
| Benefits | "What's the 401k match?" | Search benefits guide |
| Leave | "How much PTO do I have?" | Search PTO + leave policies |
| Org | "Who reports to David?" | Search org chart |
| Compliance | "How do I report harassment?" | Search policies + flag as sensitive |
| General | "When is open enrollment?" | Search all sources |
Classification determines which sources to search, which model to use, and which guardrails to apply.
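A keyword-based sketch of the classifier is below; a production system would likely use an LLM or a trained classifier, but the routing logic downstream is the same. The keyword lists are assumptions. Compliance is checked first so queries like "how do I report harassment" are flagged sensitive even when they also mention a policy.

```typescript
type Category =
  | "policy" | "benefits" | "org" | "leave" | "compliance" | "general";

// First matching pattern wins; order encodes priority.
const CATEGORY_KEYWORDS: [Category, RegExp][] = [
  ["compliance", /\b(harassment|report|ethics|discrimination)\b/i],
  ["leave", /\b(pto|vacation|sick|fmla|leave)\b/i],
  ["benefits", /\b(401k|insurance|dental|benefits)\b/i],
  ["org", /\b(reports to|manager of|org chart)\b/i],
  ["policy", /\b(policy|handbook|remote work)\b/i],
];

function classify(query: string): Category {
  for (const [category, pattern] of CATEGORY_KEYWORDS) {
    if (pattern.test(query)) return category;
  }
  return "general"; // fall through: search all sources
}
```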
Model Routing
Not every query needs the same model. Simple lookups are fast and cheap; complex compliance questions need the best available model.
| Complexity | Model | Cost | Use Case |
|---|---|---|---|
| Simple | Haiku | $0.25/M tokens | "Who's my manager?" |
| Moderate | Sonnet | $3/M tokens | "Explain our PTO carryover rules" |
| Complex | Opus | $15/M tokens | "Compare our CA vs TX leave provisions" |
This routing typically saves 60-70% on token costs compared to using the most capable model for every query.
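The routing decision itself is small. In the sketch below, complexity drives model choice but restricted sensitivity always forces the strongest model; that override is an assumption about how the pieces combine, while the prices are the ones quoted in the table above.

```typescript
type Complexity = "simple" | "moderate" | "complex";
type Sensitivity = "normal" | "sensitive" | "restricted";

// Prices in $ per million input tokens, from the table above.
const MODEL_COST: Record<string, number> = { haiku: 0.25, sonnet: 3, opus: 15 };

function routeModel(complexity: Complexity, sensitivity: Sensitivity): string {
  if (sensitivity === "restricted") return "opus"; // never downgrade restricted queries
  return { simple: "haiku", moderate: "sonnet", complex: "opus" }[complexity];
}

// One hypothetical traffic mix: 30% simple, 40% moderate, 30% complex.
const blended =
  0.3 * MODEL_COST.haiku + 0.4 * MODEL_COST.sonnet + 0.3 * MODEL_COST.opus;
// ≈ $5.78/M blended vs. $15/M for opus-everywhere, roughly a 61% saving,
// consistent with the 60-70% range quoted above
```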
Semantic Caching
Policy questions are highly repetitive. "What's the 401k match?" gets asked hundreds of times with slight variations. Semantic caching detects these near-duplicates and serves a stored response instead of calling the LLM again.
Cache hit rates of 30-50% are common for HR systems, which dramatically reduces cost and latency.
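The cache mechanics can be sketched as a nearest-neighbor lookup over a similarity threshold. Here `embed` is a stand-in that hashes words into a small vector so the example is self-contained; a real system would use an embedding model, and only the `SemanticCache` lookup logic is meant literally.

```typescript
// Toy embedding: hash each word into one of `dims` buckets.
// Stand-in for a real embedding model.
function embed(text: string, dims = 64): number[] {
  const v = new Array(dims).fill(0);
  for (const word of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    let h = 0;
    for (const ch of word) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
    v[h % dims] += 1;
  }
  return v;
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na && nb ? dot / Math.sqrt(na * nb) : 0;
}

class SemanticCache {
  private entries: { vector: number[]; response: string }[] = [];
  constructor(private threshold = 0.9) {}

  // Return the stored response for the most similar past query,
  // or null if nothing clears the similarity threshold.
  get(query: string): string | null {
    const qv = embed(query);
    let best = { score: 0, response: null as string | null };
    for (const e of this.entries) {
      const score = cosine(qv, e.vector);
      if (score > best.score) best = { score, response: e.response };
    }
    return best.score >= this.threshold ? best.response : null;
  }

  set(query: string, response: string): void {
    this.entries.push({ vector: embed(query), response });
  }
}
```

The threshold is the key tuning knob: too low and employees get answers to questions they didn't ask; too high and the hit rate collapses.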
What You'll Build
Glossary
| Term | Meaning |
|---|---|
| AI Gateway | Control plane between retrieval and LLM |
| LangGraph | Graph-based AI orchestration framework |
| State graph | Directed graph where nodes are functions and edges are transitions |
| Guardrail | Rule that blocks, redirects, or modifies queries/responses |
| Model routing | Selecting the right LLM based on query characteristics |
| Semantic caching | Caching responses for semantically similar queries |
| RBAC | Role-Based Access Control — permissions tied to user roles |
This is chapter 4 of AI HR Assistant.
Get the full hands-on course for $100 and build the complete system. Your projects become your portfolio.