
Add Citations

Building Trust in AI Answers

Why Citations Matter

An AI answer without sources is just a claim. An AI answer with citations is verifiable. In enterprise settings, this isn't optional — employees need to know whether the answer came from the official handbook or a random Slack message.

Without citations:
  "You get 15 days of PTO per year."
  → Is this right? Where did it come from? Can I trust it?

With citations:
  "You get 15 days of PTO per year. [handbook.md, PTO Policy]"
  → I can check the source. I trust this answer.

Citations transform your RAG system from a chatbot into a research assistant that shows its work.

Source Attribution

Every chunk already carries metadata from the chunking step — source file, section heading, chunk index. Now you surface this in the answer.

LLMs tend to cite reliably when you label each context block with a source number and ask for citations in the prompt:

Input context:
  [Source 1: handbook.md, PTO Policy]
  Employees receive 15 days of paid time off per year...

  [Source 2: handbook.md, Holidays]
  The company observes 10 federal holidays...

LLM output:
  "You get 15 days of PTO per year [Source 1], plus 10 federal
   holidays [Source 2], for a total of 25 days off."

The key: your prompt says "Always cite which source(s) you used." The numbered source labels make it easy for the LLM to reference specific documents.
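The labeled context shown above can be generated from the retrieved chunks. This is a minimal sketch, not the course's implementation: the `SearchResult` shape is assumed from the fields used elsewhere in this chapter, `buildContext` and `buildPrompt` are hypothetical helper names, and the prompt wording is illustrative apart from the "Always cite" instruction.

```typescript
// Assumed chunk shape; field names follow the ones used in this chapter.
interface SearchResult {
  source: string;
  heading?: string;
  content: string;
  similarity: number;
}

// Label each chunk "[Source N: file, heading]" so the model can cite it by number.
function buildContext(chunks: SearchResult[]): string {
  return chunks
    .map((chunk, i) => {
      const label = chunk.heading
        ? `[Source ${i + 1}: ${chunk.source}, ${chunk.heading}]`
        : `[Source ${i + 1}: ${chunk.source}]`;
      return `${label}\n${chunk.content}`;
    })
    .join("\n\n");
}

// Hypothetical prompt template; adjust the wording to your own system prompt.
function buildPrompt(question: string, chunks: SearchResult[]): string {
  return [
    "Answer the question using only the sources below.",
    "Always cite which source(s) you used, in the form [Source N].",
    "",
    buildContext(chunks),
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

Numbering sources by their position in the array keeps the label in sync with the chunk index, which is what makes it possible to map `[Source N]` back to a chunk later.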

Inline Citations

Parse the LLM's output to turn [Source 1] references into rich citations:

interface Citation {
  sourceNumber: number;
  fileName: string;
  heading?: string;
  snippet: string;  // first 100 chars of the chunk
}

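function extractCitations(
  answer: string,
  chunks: SearchResult[]
): { text: string; citations: Citation[] } {
  const cited = new Set<number>();

  // Find all [Source N] references in the answer
  const pattern = /\[Source (\d+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(answer)) !== null) {
    const index = parseInt(match[1], 10) - 1; // 0-indexed
    // Skip source numbers that have no matching chunk (the model can invent them)
    if (index >= 0 && index < chunks.length) cited.add(index);
  }

  const citations = [...cited].map((i) => ({
    sourceNumber: i + 1,
    fileName: chunks[i].source,
    heading: chunks[i].heading,
    // First 100 chars, with an ellipsis only if the chunk was truncated
    snippet:
      chunks[i].content.length > 100
        ? chunks[i].content.slice(0, 100) + "..."
        : chunks[i].content,
  }));

  return { text: answer, citations };
}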

In a web UI, you'd render citations as clickable footnotes that expand to show the source passage.
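As a plain-text stand-in for that UI, the sketch below shortens `[Source N]` markers to `[N]` and appends a footnote list. `renderFootnotes` is a hypothetical helper, and the footnote layout is only one possible choice; a real web UI would emit links or expandable elements instead of text.

```typescript
interface Citation {
  sourceNumber: number;
  fileName: string;
  heading?: string;
  snippet: string;
}

// Hypothetical text renderer: "[Source 2]" becomes "[2]", followed by footnotes.
function renderFootnotes(text: string, citations: Citation[]): string {
  const body = text.replace(/\[Source (\d+)\]/g, "[$1]");

  // Copy before sorting so the caller's array is left untouched.
  const notes = citations
    .slice()
    .sort((a, b) => a.sourceNumber - b.sourceNumber)
    .map((c) => {
      const where = c.heading ? `${c.fileName}, ${c.heading}` : c.fileName;
      return `[${c.sourceNumber}] ${where}: "${c.snippet}"`;
    });

  return notes.length > 0 ? `${body}\n\n${notes.join("\n")}` : body;
}
```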

Confidence Scores

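Not all answers deserve equal confidence. Combine retrieval similarity scores into an overall confidence score. Note that this measures how well the retrieved chunks match the question, not whether the generated answer is correct, and the thresholds below are starting points to tune for your embedding model: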

function computeConfidence(chunks: SearchResult[]): {
  score: number;
  level: "high" | "medium" | "low";
} {
  if (chunks.length === 0) return { score: 0, level: "low" };

  // Average similarity of top chunks
  const avgSimilarity =
    chunks.reduce((sum, c) => sum + c.similarity, 0) / chunks.length;

  // Top chunk similarity matters most.
  // Assumes chunks are sorted by similarity, best match first.
  const topSimilarity = chunks[0].similarity;

  // Weighted: 60% top chunk, 40% average
  const score = topSimilarity * 0.6 + avgSimilarity * 0.4;

  const level =
    score >= 0.85 ? "high" :
    score >= 0.70 ? "medium" : "low";

  return { score, level };
}

Display confidence to users:

High (0.85+): Strong match — the answer is well-supported

Medium (0.70-0.85): Decent match — review the sources

Low (below 0.70): Weak match — the system isn't sure
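One way to wire these levels into a UI is a small lookup from level to badge. This is a sketch under assumptions: `confidenceBadge`, `Badge`, and `BADGES` are hypothetical names, and the colors follow the green/yellow/red convention commonly used for such badges.

```typescript
type ConfidenceLevel = "high" | "medium" | "low";

interface Badge {
  color: "green" | "yellow" | "red";
  message: string;
}

// Messages mirror the three levels described above.
const BADGES: Record<ConfidenceLevel, Badge> = {
  high: { color: "green", message: "Strong match: the answer is well-supported" },
  medium: { color: "yellow", message: "Decent match: review the sources" },
  low: { color: "red", message: "Weak match: the system isn't sure" },
};

function confidenceBadge(level: ConfidenceLevel): Badge {
  return BADGES[level];
}
```

Typing the lookup as `Record<ConfidenceLevel, Badge>` means the compiler flags a missing entry if a new level is added later.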

Highlighting Source Passages

The final touch: show users the exact passage that supports the answer. This is the "show your work" moment.

interface AnswerWithSources {
  answer: string;
  confidence: { score: number; level: "high" | "medium" | "low" };
  sources: {
    file: string;
    heading?: string;
    passage: string;     // the full chunk text
    similarity: number;  // how well it matched the question
  }[];
}

In a UI, render this as:

The answer text with inline [1] [2] footnote markers

A confidence badge (green/yellow/red)

An expandable "Sources" section showing each referenced passage
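A response object of this shape could be assembled as sketched below. `buildAnswerWithSources` is a hypothetical helper, the `SearchResult` fields are assumed from earlier in this chapter, and the confidence value is passed in rather than recomputed. Only chunks the answer actually cites are included, so the "Sources" section lists referenced passages and nothing else.

```typescript
// Assumed chunk shape; field names follow the ones used in this chapter.
interface SearchResult {
  source: string;
  heading?: string;
  content: string;
  similarity: number;
}

interface AnswerWithSources {
  answer: string;
  confidence: { score: number; level: string };
  sources: {
    file: string;
    heading?: string;
    passage: string;
    similarity: number;
  }[];
}

function buildAnswerWithSources(
  answer: string,
  chunks: SearchResult[],
  confidence: { score: number; level: string }
): AnswerWithSources {
  // Collect 0-based indexes of the chunks the answer cites, ignoring
  // source numbers that do not correspond to a retrieved chunk.
  const cited = new Set<number>();
  const pattern = /\[Source (\d+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(answer)) !== null) {
    const index = parseInt(match[1], 10) - 1;
    if (index >= 0 && index < chunks.length) cited.add(index);
  }

  const sources = Array.from(cited)
    .sort((a, b) => a - b)
    .map((i) => ({
      file: chunks[i].source,
      heading: chunks[i].heading,
      passage: chunks[i].content, // full chunk text for the expandable view
      similarity: chunks[i].similarity,
    }));

  return { answer, confidence, sources };
}
```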

What You've Built

Congratulations — you now have a complete RAG pipeline:

Documents → Chunk → Embed → Store (pgvector)
                                    ↓
Question → Embed → Search → Retrieve top chunks
                                    ↓
              Prompt template + chunks → LLM → Answer with citations

The same broad retrieve-then-generate pattern underlies many AI search features and enterprise Q&A systems. The difference between a demo and production is scale, monitoring, and evaluation, but the core pattern is what you've built here.

This is chapter 6 of RAG in 60 Minutes.
