Add Citations
Building Trust in AI Answers
Why Citations Matter
An AI answer without sources is just a claim. An AI answer with citations is verifiable. In enterprise settings, this isn't optional — employees need to know whether the answer came from the official handbook or a random Slack message.
Without citations:
"You get 15 days of PTO per year."
→ Is this right? Where did it come from? Can I trust it?
With citations:
"You get 15 days of PTO per year. [handbook.md, PTO Policy]"
→ I can check the source. I trust this answer.
Citations transform your RAG system from a chatbot into a research assistant that shows its work.
Source Attribution
Every chunk already carries metadata from the chunking step — source file, section heading, chunk index. Now you surface this in the answer.
The LLM will usually produce citations on its own when you label each context block with a source number:
Input context:
[Source 1: handbook.md, PTO Policy]
Employees receive 15 days of paid time off per year...
[Source 2: handbook.md, Holidays]
The company observes 10 federal holidays...
LLM output:
"You get 15 days of PTO per year [Source 1], plus 10 federal
holidays [Source 2], for a total of 25 days off."
The key: your prompt says "Always cite which source(s) you used." The numbered source labels make it easy for the LLM to reference specific documents.
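A minimal sketch of how the labeled context and the citation instruction could be assembled. The `SearchResult` shape, the `formatContext` and `buildPrompt` names, and the exact prompt wording are assumptions for illustration; adapt them to whatever your retrieval step actually returns.

```typescript
// Hypothetical chunk shape; field names mirror the ones used later in this chapter.
interface SearchResult {
  source: string;
  heading?: string;
  content: string;
  similarity: number;
}

// Label each retrieved chunk as "[Source N: file, heading]" so the model
// can refer back to it by number.
function formatContext(chunks: SearchResult[]): string {
  return chunks
    .map((chunk, i) => {
      const label = chunk.heading
        ? `[Source ${i + 1}: ${chunk.source}, ${chunk.heading}]`
        : `[Source ${i + 1}: ${chunk.source}]`;
      return `${label}\n${chunk.content}`;
    })
    .join("\n\n");
}

// Hypothetical prompt wrapper; the instruction wording is an assumption.
function buildPrompt(question: string, chunks: SearchResult[]): string {
  return [
    "Answer the question using only the sources below.",
    "Always cite which source(s) you used, e.g. [Source 1].",
    "",
    formatContext(chunks),
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

Because the source numbers are just array positions plus one, the same `chunks` array can later be used to map `[Source N]` back to a file and heading.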
Inline Citations
Parse the LLM's output to turn [Source 1] references into rich citations:
interface Citation {
  sourceNumber: number;
  fileName: string;
  heading?: string;
  snippet: string; // first 100 chars of the chunk
}

function extractCitations(
  answer: string,
  chunks: SearchResult[]
): { text: string; citations: Citation[] } {
  const cited = new Set<number>();
  // Find all [Source N] references in the answer
  const pattern = /\[Source (\d+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(answer)) !== null) {
    const index = parseInt(match[1], 10) - 1; // 0-indexed
    // Skip source numbers the model made up (no matching chunk)
    if (index >= 0 && index < chunks.length) cited.add(index);
  }
  const citations = [...cited].map((i) => ({
    sourceNumber: i + 1,
    fileName: chunks[i].source,
    heading: chunks[i].heading,
    // Only add an ellipsis when the chunk was actually truncated
    snippet:
      chunks[i].content.length > 100
        ? chunks[i].content.slice(0, 100) + "..."
        : chunks[i].content,
  }));
  return { text: answer, citations };
}
In a web UI, you'd render citations as clickable footnotes that expand to show the source passage.
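For a plain-text or terminal client, the footnote idea might look like the sketch below. `renderWithFootnotes` is a hypothetical helper, and the output layout is an assumption; a real web UI would emit links or expandable elements instead of text.

```typescript
// Same shape as the Citation interface above, repeated so this sketch is self-contained.
interface Citation {
  sourceNumber: number;
  fileName: string;
  heading?: string;
  snippet: string;
}

// Turn "[Source N]" markers into compact "[N]" markers and append a footnote list.
function renderWithFootnotes(text: string, citations: Citation[]): string {
  const body = text.replace(/\[Source (\d+)\]/g, "[$1]");
  if (citations.length === 0) return body;

  const footnotes = citations
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => a.sourceNumber - b.sourceNumber)
    .map((c) => {
      const where = c.heading ? `${c.fileName}, ${c.heading}` : c.fileName;
      return `[${c.sourceNumber}] ${where}: "${c.snippet}"`;
    });

  return `${body}\n\n${footnotes.join("\n")}`;
}
```

Keeping the marker rewrite separate from citation extraction means the raw answer text stays available for logging or evaluation.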
Confidence Scores
Not all answers deserve equal confidence. Combine retrieval similarity scores into an overall confidence:
function computeConfidence(chunks: SearchResult[]): {
  score: number;
  level: "high" | "medium" | "low";
} {
  if (chunks.length === 0) return { score: 0, level: "low" };
  // Average similarity of top chunks
  const avgSimilarity =
    chunks.reduce((sum, c) => sum + c.similarity, 0) / chunks.length;
  // Top chunk similarity matters most
  // (assumes chunks are sorted by similarity, best match first)
  const topSimilarity = chunks[0].similarity;
  // Weighted: 60% top chunk, 40% average
  const score = topSimilarity * 0.6 + avgSimilarity * 0.4;
  const level =
    score >= 0.85 ? "high" :
    score >= 0.70 ? "medium" : "low";
  return { score, level };
}
Display the confidence level to users alongside the answer.
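One way to present the result is a short label plus the rounded score. This is only a sketch: `formatConfidence` and the message wording are assumptions, and the percentage is the blended similarity score from above, not a calibrated probability that the answer is correct.

```typescript
type ConfidenceLevel = "high" | "medium" | "low";

// Hypothetical user-facing messages; the wording is an assumption.
const CONFIDENCE_MESSAGES: Record<ConfidenceLevel, string> = {
  high: "High confidence: the sources closely match your question.",
  medium: "Medium confidence: check the cited sources before relying on this.",
  low: "Low confidence: the sources may not cover your question.",
};

// Combine the level's message with the score shown as a whole-number percentage.
function formatConfidence(confidence: {
  score: number;
  level: ConfidenceLevel;
}): string {
  const percent = Math.round(confidence.score * 100);
  return `${CONFIDENCE_MESSAGES[confidence.level]} (${percent}%)`;
}
```

For low-confidence answers, some teams may prefer to show only the message and hide the number, since a raw similarity score is easy to misread.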
Highlighting Source Passages
The final touch: show users the exact passage that supports the answer. This is the "show your work" moment.
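Chunks can be long, so a UI may want to emphasize only the sentence that best supports the answer. The sketch below uses a naive word-overlap heuristic; it is an illustrative assumption, not a method this chapter prescribes, and `findSupportingSentence` and its helpers are hypothetical names. The sentence splitter is deliberately simple and will mis-split text containing decimals or abbreviations.

```typescript
// Split a passage into sentences (naive: a sentence ends at ., !, or ?).
function splitSentences(passage: string): string[] {
  const parts: string[] = passage.match(/[^.!?]+[.!?]*/g) ?? [];
  return parts.map((s) => s.trim()).filter((s) => s.length > 0);
}

// Lowercased unique words, keeping numbers and dropping very short words.
function uniqueWords(text: string): string[] {
  const words: string[] = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  return words.filter(
    (w, i) => (w.length > 2 || /^\d+$/.test(w)) && words.indexOf(w) === i
  );
}

// Return the sentence sharing the most words with the answer,
// or null when no sentence shares any word.
function findSupportingSentence(answer: string, passage: string): string | null {
  const answerWords = uniqueWords(answer);
  let best: string | null = null;
  let bestOverlap = 0;
  for (const sentence of splitSentences(passage)) {
    const overlap = uniqueWords(sentence).filter(
      (w) => answerWords.indexOf(w) !== -1
    ).length;
    if (overlap > bestOverlap) {
      bestOverlap = overlap;
      best = sentence;
    }
  }
  return best;
}
```

The returned sentence can be emphasized inside the full passage when it is displayed, with the rest of the chunk left visible for context.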
interface AnswerWithSources {
  answer: string;
  confidence: { score: number; level: string };
  sources: {
    file: string;
    heading?: string;
    passage: string; // the full chunk text
    similarity: number; // how well it matched the question
  }[];
}
In a UI, render this as:
[1] [2] footnote markers
What You've Built
Congratulations — you now have a complete RAG pipeline:
Documents → Chunk → Embed → Store (pgvector)
                              ↓
Question → Embed → Search → Retrieve top chunks
                              ↓
Prompt template + chunks → LLM → Answer with citations
This is the same retrieve-then-generate pattern behind products like ChatGPT's web browsing and Notion AI, and behind many enterprise Q&A systems. The difference between a demo and production is scale, monitoring, and evaluation, but the core pattern is exactly what you've built here.
This is chapter 6 of RAG in 60 Minutes.