How LLMs Think
Tokens, Context & Probability
What Happens When You Type a Prompt
Large Language Models don't "understand" language the way you do. They predict a likely next token (a chunk of text, typically about 3-4 characters in English) based on everything that came before it. Every response is a chain of these probability calculations, one per generated token, so a long answer can involve hundreds or thousands of them.
This matters because once you understand the prediction engine, you can steer it.
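As a rough illustration of that chain of predictions, here is a minimal sketch of a generation loop. The probability table is entirely made up, and the toy looks only at the last token, whereas a real LLM conditions on the whole context; the names `NEXT_TOKEN_PROBS` and `generate` are hypothetical.

```python
# Made-up next-token probabilities for a tiny vocabulary (illustration only).
# A real model computes these from the entire preceding context.
NEXT_TOKEN_PROBS = {
    "The": {" cat": 0.6, " dog": 0.4},
    " cat": {" sat": 0.7, " ran": 0.3},
    " dog": {" ran": 0.8, " sat": 0.2},
    " sat": {" down": 0.9, ".": 0.1},
    " ran": {" away": 0.9, ".": 0.1},
    " down": {".": 1.0},
    " away": {".": 1.0},
}


def generate(prompt_token, max_tokens=10):
    """Append the most likely next token until '.' or the token limit."""
    tokens = [prompt_token]
    while len(tokens) < max_tokens:
        candidates = NEXT_TOKEN_PROBS.get(tokens[-1])
        if not candidates:
            break  # the toy table has no entry for this token
        next_token = max(candidates, key=candidates.get)
        tokens.append(next_token)
        if next_token == ".":
            break
    return "".join(tokens)


print(generate("The"))  # prints "The cat sat down."
```

Changing the starting token changes every later prediction, which is the sense in which a prompt steers the output.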
Tokens: The Atoms of Language
LLMs don't see words. They see tokens: fragments of text drawn from a fixed vocabulary, often around 100,000 entries. Exact counts depend on the tokenizer, but as a rough guide, "Artificial intelligence" is two tokens, "AI" is one, and "Pneumonoultramicroscopicsilicovolcanoconiosis" is nine.
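To make the idea of a fixed vocabulary concrete, here is a minimal sketch of a toy tokenizer. It uses greedy longest-match against a tiny, hypothetical vocabulary; production tokenizers typically use learned schemes such as byte-pair encoding, so treat this as an illustration of the concept rather than how any particular model tokenizes.

```python
# Hypothetical mini-vocabulary. Real vocabularies are learned from data
# and hold on the order of 100,000 entries.
VOCAB = {"Sum", "mar", "ize", " this", " quarterly", " report"}


def tokenize(text, vocab):
    """Split text greedily, taking the longest vocabulary entry that fits.

    Characters not covered by the vocabulary fall back to single-character
    tokens, so the pieces always join back into the original text.
    """
    max_len = max(len(entry) for entry in vocab)
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocab or size == 1:
                tokens.append(piece)
                i += size
                break
    return tokens


print(tokenize("Summarize this quarterly report", VOCAB))
print(tokenize("Sumo", VOCAB))
```

Note that the leading space belongs to the token (" this", not "this"), which is why whitespace and formatting in a prompt affect how it is split.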
Why this matters for prompting:
Input: "Summarize this quarterly report"
Tokens: ["Sum", "mar", "ize", " this", " quarterly", " report"]
Temperature: Controlling Randomness
When the model predicts the next token, it generates probabilities across the entire vocabulary. Temperature controls how it picks from those probabilities:
| Temperature | Behavior | Best For |
|---|---|---|
| 0.0 | Always picks the highest-probability token | Factual extraction, classification |
| 0.3–0.7 | Slight variation, mostly predictable | Business writing, analysis |
| 0.8–1.0 | More creative, occasionally surprising | Brainstorming, creative copy |
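The effect of temperature can be sketched in a few lines. The example below assumes the common formulation in which the model's raw scores (logits) are divided by the temperature before a softmax, with temperature 0 treated as "always pick the top token". The logit values and the function names `token_probabilities` and `sample_token` are made up for illustration; specific APIs may implement the details differently.

```python
import math
import random

# Made-up raw scores (logits) for three candidate next tokens.
LOGITS = {" report": 2.0, " summary": 1.0, " poem": 0.0}


def token_probabilities(logits, temperature):
    """Softmax over logits divided by a temperature greater than zero."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    top = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(s - top) for tok, s in scaled.items()}
    total = sum(weights.values())
    return {tok: w / total for tok, w in weights.items()}


def sample_token(logits, temperature, rng=random):
    """Pick one token; temperature 0 means greedy (highest score wins)."""
    if temperature == 0:
        return max(logits, key=logits.get)
    probs = token_probabilities(logits, temperature)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


for t in (0.3, 0.7, 1.0):
    rounded = {tok: round(p, 3) for tok, p in token_probabilities(LOGITS, t).items()}
    print(t, rounded)

print(sample_token(LOGITS, 0.0))  # always the top-scoring token, " report"
```

Lower temperatures concentrate probability on the top-scoring token, while higher temperatures spread it across the alternatives, which matches the pattern in the table above.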
What LLMs Are Good At
What LLMs Are Bad At
The Mental Model
Think of an LLM as an extremely well-read autocomplete engine. It has read billions of documents and learned patterns about how text follows other text. Your job as a prompt engineer is to set up the right context so the autocomplete engine produces exactly what you need.
The next five modules will teach you specific techniques to do this — starting with the simplest approach (just ask) and building up to structured, reusable prompt systems.
Key Takeaways
This is chapter 1 of Prompt Engineering Essentials.