Retrieval System
Hybrid Search & HR-Specific Reranking
Beyond Naive Vector Search
Pure vector similarity search sounds elegant: embed the query, find the nearest vectors, return the results. But for an HR assistant, this approach fails in predictable ways: exact identifiers like policy IDs and section numbers are weakly represented in embeddings, metadata such as state applicability and effective dates can't be expressed as similarity, and an outdated policy version can land just as close to the query as the current one.
The retrieval system solves these failure modes by combining three search strategies.
Key Concepts
The Three Search Strategies
Semantic Search — Query pgvector using cosine similarity on embeddings. Finds chunks by *meaning*, not keywords. "Maternity leave" finds "parental leave" because their embeddings are close. Best for open-ended questions.
Keyword Search — Full-text matching that catches exact terms semantic search misses. Policy IDs (POL-001), section numbers (Section 4.3), employee names (David Okonkwo), benefit plan names (Anthem Blue Cross). These are proper nouns and identifiers that embedding models often struggle with.
Structured Filters — SQL WHERE clauses on metadata. category = 'leave', applicable_states @> '{"CA"}', effective_date >= '2025-01-01'. These narrow the search space *before* similarity scoring, which is both faster and more precise. All three strategies are sketched in code below.
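To make the three strategies concrete, here is a minimal sketch in Python with psycopg2 and pgvector. The `chunks` table and its columns (`content`, `embedding`, `category`, `applicable_states`) are illustrative assumptions rather than the course's exact schema; `<=>` is pgvector's cosine-distance operator.

```python
import numpy as np
import psycopg2
from pgvector.psycopg2 import register_vector

conn = psycopg2.connect("dbname=hr_assistant")  # hypothetical DSN
register_vector(conn)  # lets psycopg2 pass numpy arrays as pgvector values

def semantic_search(query_embedding: np.ndarray, limit: int = 20):
    """Nearest chunks by cosine distance (pgvector's <=> operator)."""
    with conn.cursor() as cur:
        cur.execute(
            """SELECT id, content, embedding <=> %s AS distance
                 FROM chunks
                ORDER BY distance
                LIMIT %s""",
            (query_embedding, limit),
        )
        return cur.fetchall()

def keyword_search(query: str, limit: int = 20):
    """Full-text match; catches exact terms like 'POL-001' or 'Section 4.3'."""
    with conn.cursor() as cur:
        cur.execute(
            """SELECT id, content,
                      ts_rank(to_tsvector('english', content),
                              plainto_tsquery('english', %s)) AS score
                 FROM chunks
                WHERE to_tsvector('english', content) @@ plainto_tsquery('english', %s)
                ORDER BY score DESC
                LIMIT %s""",
            (query, query, limit),
        )
        return cur.fetchall()

def filtered_semantic_search(query_embedding: np.ndarray, category: str, state: str):
    """Metadata filters narrow the space *before* similarity scoring."""
    with conn.cursor() as cur:
        cur.execute(
            """SELECT id, content, embedding <=> %s AS distance
                 FROM chunks
                WHERE category = %s
                  AND (applicable_states @> ARRAY[%s]
                       OR applicable_states @> ARRAY['all'])
                ORDER BY distance
                LIMIT 20""",
            (query_embedding, category, state),
        )
        return cur.fetchall()
```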
Reciprocal Rank Fusion (RRF)
When you have results from multiple search strategies, how do you combine them? RRF is an elegant solution:
RRF_score = sum( 1 / (k + rank_in_list) ) over each list the result appears in, where k is a constant (typically 60). A result that ranks #1 in both semantic and keyword search scores 1/61 + 1/61 ≈ 0.033, beating the ≈ 0.016 of a result that ranks #1 in only one list.
The beauty of RRF is that it doesn't require calibrating scores across different search methods — it only uses ranks, which are always comparable.
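A direct RRF implementation is only a few lines. The sketch below assumes each input list holds result ids ordered best-first:

```python
def rrf_fuse(result_lists: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Combine ranked id lists with Reciprocal Rank Fusion.

    Only ranks are used, so scores from different search methods
    never need to be calibrated against each other.
    """
    scores: dict[str, float] = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# "pol-001" ranks #1 in both lists: 1/61 + 1/61, ahead of every single-list hit.
fused = rrf_fuse([["pol-001", "handbook-leave"], ["pol-001", "org-chart"]])
```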
HR-Specific Reranking
After initial retrieval and fusion, results go through an HR-specific reranking pass built on four signals (a code sketch follows the list):
Recency weighting — Newer policy versions rank higher. If both v3.0 and v4.0 of the PTO policy match, v4.0 should always win. For HR, citing an outdated policy is worse than returning no result at all.
Source authority — Official policies outrank handbook summaries, which outrank org chart data for policy questions. The authority hierarchy: official policy document → employee handbook summary → org chart data.
State applicability — When a query mentions "California" or "CA", results tagged with applicable_states: ["CA"] or applicable_states: ["all"] get boosted. A California-specific provision should always rank above the generic policy.
Category relevance — If the query mentions "leave" and a result's metadata says category: "leave", boost it. Simple but effective at surfacing the right policy domain.
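One plausible shape for this pass, sketched below: deduplicate to the newest version of each document, then apply multiplicative boosts on top of the fused RRF score. The `Chunk` fields and boost factors are illustrative assumptions, not the course's exact values.

```python
from dataclasses import dataclass, field

# Illustrative boost factors; real weights would be tuned against evaluation data.
AUTHORITY_BOOST = {"policy": 1.2, "handbook": 1.1, "org_chart": 1.0}

@dataclass
class Chunk:
    doc_id: str                       # e.g. "POL-001"
    version: tuple[int, ...]          # e.g. (4, 0) for v4.0
    score: float                      # fused RRF score
    source_type: str = "policy"
    applicable_states: list[str] = field(default_factory=lambda: ["all"])
    category: str = ""

def rerank(chunks: list[Chunk], query_states: set[str], query_category: str) -> list[Chunk]:
    # Recency: among versions of the same document, only the newest survives.
    newest: dict[str, Chunk] = {}
    for c in chunks:
        if c.doc_id not in newest or c.version > newest[c.doc_id].version:
            newest[c.doc_id] = c

    for c in newest.values():
        boost = AUTHORITY_BOOST.get(c.source_type, 1.0)
        if query_states & set(c.applicable_states):
            boost *= 1.3              # state-specific match: strongest boost
        elif "all" in c.applicable_states:
            boost *= 1.1              # generic policy applies, but ranks below
        if c.category == query_category:
            boost *= 1.15             # right policy domain
        c.score *= boost

    return sorted(newest.values(), key=lambda c: c.score, reverse=True)
```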
Context Assembly
The final step before sending results to the LLM. Given a query like "What's our parental leave policy for employees in California?", the context assembly builds:
CONTEXT FOR LLM:
─────────────────
[Source: Employee Handbook — Leave & Time Off, v3.2, effective 2025-01-01]
Primary caregivers receive 16 weeks of fully paid parental leave...
California employees: additional benefits under CA-PFL may apply...
[Source: PTO Policy POL-001, v4.0, effective 2025-01-01]
California employees: PTO does not expire and is paid out upon separation
per CA Labor Code Section 227.3...
─────────────────
Each piece of context carries its source attribution — policy name, section, version, effective date. This attribution flows through to the final response, enabling the employee to verify the answer against the source document.
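The assembly step is mostly string formatting plus a length budget. A sketch, assuming each reranked chunk arrives as a dict with the attribution fields shown in the example above (`source`, `section`, `version`, `effective_date`, `content`); the character budget is a stand-in for real token counting.

```python
def assemble_context(chunks: list[dict], max_chars: int = 6000) -> str:
    """Serialize reranked chunks into an attributed context window."""
    blocks: list[str] = []
    used = 0
    for c in chunks:
        block = (
            f"[Source: {c['source']} — {c['section']}, "
            f"v{c['version']}, effective {c['effective_date']}]\n"
            f"{c['content']}\n"
        )
        if used + len(block) > max_chars:
            break  # respect the budget; lower-ranked chunks are dropped first
        blocks.append(block)
        used += len(block)
    return "\n".join(blocks)
```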
Architecture Pattern
Query ──→ ┌─ Semantic Search ──┐
          ├─ Keyword Search ───┤──→ RRF Fusion ──→ Rerank ──→ Context Window
          └─ SQL Filters ──────┘                      │
                                                      ├─ Recency
                                                      ├─ Authority
                                                      ├─ State match
                                                      └─ Category

Why This Matters for HR Compliance
An HR assistant that cites the wrong policy version or misses a state-specific provision creates real legal risk. The retrieval system's reranking rules are guardrails at the data level: recency weighting keeps superseded policy versions out of answers, source authority keeps summaries from overriding official policy text, and state matching surfaces jurisdiction-specific provisions first.
These aren't nice-to-haves. For a system that employees trust for policy guidance, they're requirements.
What You'll Build
Glossary
| Term | Meaning |
|---|---|
| Hybrid search | Combining semantic, keyword, and structured search |
| RRF | Reciprocal Rank Fusion — combines ranked lists without score calibration |
| Reranking | Adjusting result scores based on domain-specific signals |
| Context assembly | Building a structured context window for LLM consumption |
| Source attribution | Linking each piece of information to its source document |
| Recency weighting | Boosting newer documents over older ones |
This is chapter 3 of AI HR Assistant.
Get the full hands-on course for $100 and build the complete system. Your projects become your portfolio.