Chain-of-Thought
Making LLMs Show Their Work
Why "Think Step by Step" Works
When you ask an LLM a complex question directly, it tends to commit to an answer immediately, without writing out any intermediate work. When you ask it to reason through steps, each step it writes becomes context for the next, giving the model more signal to work with.
# Without Chain-of-Thought
Q: A company has 3 departments. Sales has 12 people, Engineering has 28, and Marketing has 8. If the company adds 15% more people to each department, how many total employees will there be?
A: 48 (wrong; an illustrative failure in which the model adds the current headcounts and skips the 15% increase)
# With Chain-of-Thought
Q: A company has 3 departments. Sales has 12 people, Engineering has 28, and Marketing has 8. If the company adds 15% more people to each department, how many total employees will there be? Think through this step by step.
A: Let me work through this:
1. Sales: 12 × 1.15 = 13.8 → 14 people
2. Engineering: 28 × 1.15 = 32.2 → 32 people
3. Marketing: 8 × 1.15 = 9.2 → 9 people
4. Total: 14 + 32 + 9 = 55 people
The magic phrase "think step by step" is the simplest form of chain-of-thought (CoT) prompting. But you can do much more.
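The arithmetic in the worked answer can be checked directly. The following is a minimal Python sketch, not part of the original chapter: the names `with_cot` and `grow_headcount` are illustrative, and rounding each department to the nearest whole person is the same assumption the worked answer makes.

```python
# Illustrative helpers; these names are assumptions, not a library API.
COT_SUFFIX = "Think through this step by step."


def with_cot(question: str) -> str:
    """Append the step-by-step instruction to a question."""
    return f"{question.strip()} {COT_SUFFIX}"


def grow_headcount(departments: dict, rate: float) -> dict:
    """Grow each department by `rate`, rounding to whole people."""
    return {name: round(count * (1 + rate)) for name, count in departments.items()}


departments = {"Sales": 12, "Engineering": 28, "Marketing": 8}
grown = grow_headcount(departments, 0.15)
total = sum(grown.values())

print(grown)  # per-department headcount after the increase
print(total)  # 55
```

Rounding per department gives 14 + 32 + 9 = 55. Applying 15% to the combined 48 gives 55.2, so both routes agree once rounded to whole people.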
Structured Chain-of-Thought
Instead of a vague "think step by step," give the model a specific reasoning framework:
Analyze this customer review and determine the appropriate response action.
Review: "The product itself is great but your shipping took 3 weeks and the box arrived damaged. I had to call support twice before getting a replacement."
Please reason through:
1. SENTIMENT: What is the overall sentiment? What specific aspects are positive vs negative?
2. ISSUES: List each distinct issue mentioned.
3. SEVERITY: Rate each issue (low/medium/high) based on customer impact.
4. ROOT CAUSE: What likely caused each issue?
5. ACTION: What specific response should we take?
This usually produces a more complete and better-organized analysis than simply asking "What should we do about this review?"
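If you reuse the same framework across many inputs, it can help to generate the prompt from data. The sketch below is one possible way to do that; `build_structured_prompt` and its parameters are hypothetical names chosen for illustration, and the step wording is copied from the example above.

```python
def build_structured_prompt(task, input_label, input_text, steps):
    """Render a task, its input, and a numbered reasoning framework.

    `steps` is a list of (label, question) pairs, in the order the
    model should work through them.
    """
    lines = [task, f'{input_label}: "{input_text}"', "Please reason through:"]
    for number, (label, question) in enumerate(steps, start=1):
        lines.append(f"{number}. {label.upper()}: {question}")
    return "\n".join(lines)


steps = [
    ("sentiment", "What is the overall sentiment? What specific aspects are positive vs negative?"),
    ("issues", "List each distinct issue mentioned."),
    ("severity", "Rate each issue (low/medium/high) based on customer impact."),
    ("root cause", "What likely caused each issue?"),
    ("action", "What specific response should we take?"),
]

prompt = build_structured_prompt(
    task="Analyze this customer review and determine the appropriate response action.",
    input_label="Review",
    input_text="The product itself is great but your shipping took 3 weeks.",
    steps=steps,
)
print(prompt)
```

Keeping the steps as data makes it easy to add, remove, or reorder a reasoning step without rewriting the whole prompt.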
When CoT Helps vs. Hurts
CoT Helps
CoT Hurts
Decomposing Complex Tasks
One of the most effective CoT techniques: break a big task into explicit sub-tasks.
You are analyzing a business invoice. Complete these steps in order:
STEP 1 — EXTRACT: Pull out invoice_id, vendor, line_items, subtotal, tax, total
STEP 2 — VALIDATE: Check if the line items sum to the subtotal. Flag any discrepancies.
STEP 3 — CLASSIFY: Categorize the expense (software, services, hardware, travel, other)
STEP 4 — FLAG: Note anything unusual (duplicate invoice number, amount over $10K, vendor not in approved list)
STEP 5 — SUMMARY: One-line summary suitable for an expense report
Invoice data:
[paste invoice here]
Each step builds on the previous one. Because each step's output feeds the next, the model is much less likely to skip ahead.
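The VALIDATE and FLAG steps describe checks that can also be run in ordinary code, which gives you a cross-check on the model's output. The following is a rough sketch under stated assumptions: the dictionary shape (including the per-item `amount` field), the `APPROVED_VENDORS` set, and the sample invoice are all invented for illustration, and only the field names from STEP 1 and the $10K threshold from STEP 4 come from the prompt above.

```python
from decimal import Decimal

APPROVED_VENDORS = {"Acme Corp"}   # hypothetical approved-vendor list
LARGE_AMOUNT = Decimal("10000")    # the $10K threshold from STEP 4


def validate_invoice(invoice):
    """Return a list of human-readable flags for an extracted invoice."""
    flags = []

    # STEP 2: do the line items sum to the stated subtotal?
    line_sum = sum(
        (Decimal(item["amount"]) for item in invoice["line_items"]),
        Decimal("0"),
    )
    subtotal = Decimal(invoice["subtotal"])
    if line_sum != subtotal:
        flags.append(f"line items sum to {line_sum}, but subtotal is {subtotal}")

    # STEP 4: unusual amounts and unknown vendors.
    if Decimal(invoice["total"]) > LARGE_AMOUNT:
        flags.append("amount over $10K")
    if invoice["vendor"] not in APPROVED_VENDORS:
        flags.append("vendor not in approved list")
    return flags


# Made-up example of what STEP 1 (EXTRACT) might return.
extracted = {
    "invoice_id": "INV-001",
    "vendor": "Acme Corp",
    "line_items": [
        {"description": "License", "amount": "3000.00"},
        {"description": "Support", "amount": "1200.00"},
    ],
    "subtotal": "4500.00",
    "tax": "360.00",
    "total": "4860.00",
}
print(validate_invoice(extracted))
```

Amounts are parsed as `Decimal` rather than `float` so that currency values compare exactly.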
Reasoning Traces for Debugging
Chain-of-thought isn't just for accuracy — it's for debuggability. When a model gives a wrong answer with CoT, you can see exactly where the reasoning went wrong:
Step 1: Extracted vendor as "Acme Corp" ✓
Step 2: Calculated subtotal as $4,500 ✗ (actual: $4,200 — model misread a line item)
Step 3: Classified as "software" based on wrong subtotal
Without CoT, you just get a wrong answer with no way to diagnose it.
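When you have known-good values for some fields, you can locate the first bad step automatically. The sketch below assumes the trace has already been parsed into a list of dictionaries; that structure and the `first_divergence` helper are hypothetical, and the values mirror the example trace above.

```python
def first_divergence(trace, expected):
    """Return the earliest trace step whose value disagrees with a known-good value.

    Steps whose field has no entry in `expected` are skipped, since
    there is nothing to compare them against.
    """
    for step in trace:
        field = step["field"]
        if field in expected and step["value"] != expected[field]:
            return step
    return None


# Assumed parsed form of the reasoning trace.
trace = [
    {"step": 1, "field": "vendor", "value": "Acme Corp"},
    {"step": 2, "field": "subtotal", "value": 4500},
    {"step": 3, "field": "category", "value": "software"},
]
expected = {"vendor": "Acme Corp", "subtotal": 4200}

bad = first_divergence(trace, expected)
print(bad)  # the step 2 entry, where the subtotal was misread
```

Later steps that depend on the bad value, such as the classification in step 3, are worth re-checking once the earliest error is fixed.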
Practice Tasks
Using the data in your project:
data/emails.json: extract sentiment, urgency, required action, and suggested response
data/invoices.csv: extract, validate totals, classify expenses, flag anomalies
data/reviews.json
Key Takeaways
This is chapter 3 of Prompt Engineering Essentials.