What is RAG?
Retrieval-Augmented Generation
Why LLMs Hallucinate
Large language models generate text by predicting the most likely next token. They don't "know" facts; they reproduce statistical patterns learned from training data. Asked about your company's PTO policy or last quarter's revenue, a model can produce a confident-sounding answer that is completely made up. This is hallucination, and it's a major reason enterprises can't just drop an LLM into production.
The core problem: the model's knowledge is frozen at training time. It doesn't know what happened yesterday, can't read your internal docs, and has no reliable way to recognize when it should say "I don't have that information."
The Retrieval + Generation Pattern
RAG solves this with a simple two-step architecture:
┌─────────────────────────────────────────────────────┐
│                    User Question                    │
└──────────────────────┬──────────────────────────────┘
                       │
                       ▼
               ┌────────────────┐
               │  1. RETRIEVE   │  Search your documents
               │    relevant    │  for matching content
               │   documents    │
               └───────┬────────┘
                       │
                       ▼
               ┌────────────────┐
               │  2. GENERATE   │  Feed retrieved docs
               │   an answer    │  to the LLM as context
               │  with context  │
               └───────┬────────┘
                       │
                       ▼
           ┌────────────────────────┐
           │  Grounded Answer with  │
           │    Source Citations    │
           └────────────────────────┘

Instead of asking the LLM to recall facts from memory, you retrieve the relevant documents first, then generate an answer grounded in those documents. The LLM becomes a reasoning engine over your data, not a memorization engine.
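The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: the `DOCS` corpus, the function names, and the word-overlap scoring are all assumptions made for the example (production systems typically retrieve with embeddings), and `llm` is a placeholder for whatever model client you use.

```python
import re
from collections import Counter
from typing import Callable

# Hypothetical in-memory corpus; a real system would index your own documents.
DOCS = {
    "handbook/pto.md": "Full-time employees accrue paid time off (PTO) every month.",
    "handbook/expenses.md": "Submit expense reports within 30 days of purchase.",
    "handbook/remote.md": "Remote work requires manager approval.",
}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(question: str, docs: dict[str, str], k: int = 2) -> list[str]:
    """Step 1: rank documents by how many words they share with the question.

    Word overlap is a toy stand-in for embedding similarity.
    """
    q_counts = Counter(tokenize(question))
    scored = []
    for doc_id, text in docs.items():
        overlap = sum((q_counts & Counter(tokenize(text))).values())
        if overlap > 0:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken by doc id for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]


def build_prompt(question: str, doc_ids: list[str], docs: dict[str, str]) -> str:
    """Step 2 (input side): place the retrieved text in the prompt as context."""
    context = "\n".join(f"[{doc_id}] {docs[doc_id]}" for doc_id in doc_ids)
    return (
        "Answer using only the sources below and cite them by id. "
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )


def answer(question: str, docs: dict[str, str],
           llm: Callable[[str], str], k: int = 2) -> tuple[str, list[str]]:
    """Run retrieve-then-generate; return the reply plus the cited doc ids."""
    doc_ids = retrieve(question, docs, k)
    return llm(build_prompt(question, doc_ids, docs)), doc_ids
```

Calling `answer("How do employees accrue PTO?", DOCS, llm=my_client)` sends the model only the PTO document, and the returned id list is what makes source citations possible. Here `my_client` stands for any function that takes a prompt string and returns the model's text.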
When RAG Beats Fine-tuning
| Scenario | RAG | Fine-tuning |
|---|---|---|
| Data changes frequently | Best choice — just update documents | Must retrain the model |
| Need source citations | Built-in — you know which docs were used | Model can't trace its reasoning |
| Small dataset (< 1000 docs) | Works great | Not enough data to fine-tune well |
| Domain-specific language/tone | Handles factual grounding | Better for style and format |
| Cost | Embedding once + retrieval per query | Training costs + inference costs |
Rule of thumb: Use RAG when the model needs to *know* things. Use fine-tuning when the model needs to *behave* a certain way. Many production systems use both.
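To make the "just update documents" row concrete, here is a small sketch under stated assumptions: `DocStore`, its `upsert` and `search` methods, and the travel-policy text are hypothetical names and data invented for this illustration, and keyword overlap again stands in for real embedding search. The point is only that changing what the system knows is a plain write to the store, with no retraining step.

```python
import re


def _words(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class DocStore:
    """Toy document store: keyword search over the latest version of each doc."""

    def __init__(self):
        self._docs = {}

    def upsert(self, doc_id, text):
        # Updating knowledge is a plain write; no model weights change.
        self._docs[doc_id] = text

    def search(self, query):
        """Return (doc_id, text) for the best word-overlap match, or None."""
        q = _words(query)
        best, best_score = None, 0
        for doc_id, text in self._docs.items():
            score = len(q & _words(text))
            if score > best_score:
                best, best_score = (doc_id, text), score
        return best


store = DocStore()
question = "How far ahead must travel be approved?"

store.upsert("policy/travel.md", "Travel bookings need approval 14 days ahead.")
before = store.search(question)

# The policy changes: overwrite the document and the next query sees it.
store.upsert("policy/travel.md", "Travel bookings need approval 7 days ahead.")
after = store.search(question)
```

Both `before` and `after` carry the document id alongside the text, so whatever answer is generated from them can point back to `policy/travel.md` as its source.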
Real-World Examples
What You'll Build
In this course, you'll build a complete RAG pipeline over company documents.
By Module 6, you'll have a working Q&A system that answers questions about your company handbook with cited sources.
This is chapter 1 of RAG in 60 Minutes.