Memory & Context
Conversations That Remember
The Memory Problem
Every Claude API call is stateless. Claude doesn't remember your last conversation, your preferences, or even your name. The messages array you send *is* the memory — and it's your job to manage it.
Conversation History
The simplest memory: keep a growing array of messages and send it with every API call.
Turn 1: messages = [user: "What meetings do I have?"]
Turn 2: messages = [user: "What meetings?", assistant: "You have 3...", user: "Cancel the 2pm"]
Turn 3: messages = [user, assistant, user, assistant, user: "Thanks"]Claude sees the full conversation and responds in context. "Cancel the 2pm" works because Claude can see the earlier list of meetings.
The Context Window Limit
The messages array can't grow forever. Claude's context window is large (200K tokens) but not infinite. A 60-turn conversation with tool calls can easily reach 50K+ tokens.
| Strategy | How It Works | Trade-off |
|---|---|---|
| Truncation | Drop oldest messages | Loses early context |
| Sliding window | Keep last N messages | Predictable but lossy |
| Summarization | Summarize old messages, keep recent | Best balance |
Summarization is the best approach for assistants:
This preserves key facts while staying within token limits.
User Preferences
A preferences file makes the assistant personal:
{
"name": "Alex",
"role": "Engineering Manager",
"communication_style": "concise, direct",
"priority_rules": {
"critical": ["production", "outage", "security"],
"low": ["newsletter", "social", "digest"]
},
"working_hours": "9am-6pm EST"
}Load this into the system prompt: "The user's name is Alex. They prefer concise, direct communication. They work 9am-6pm EST."
Preferences affect everything:
Contact Awareness
Loading contacts.csv into the assistant's context changes how it talks about people:
Without contacts: "You have an email from sarah.chen@globex.com"
With contacts: "You have an email from Sarah Chen, VP of Partnerships at Globex — she's your main contact for the analytics integration"
The trick is loading contacts into the system prompt so Claude always has them available, without the user needing to ask.
Persistence Between Sessions
For a local CLI assistant, you can save conversation summaries and preferences to disk:
| What to Persist | Where | Format |
|---|---|---|
| User preferences | `config/preferences.json` | JSON |
| Conversation summary | `config/history.json` | JSON with summary + key facts |
| Contact context | `data/contacts.csv` | Loaded at startup |
On startup, load the previous session's summary and inject it: "In our last conversation, we discussed the Q3 budget review and you asked me to follow up on the Globex proposal."
Memory Changes Everything
Without memory, every conversation starts cold:
> "Hi, I'm your assistant. How can I help?"
With memory:
> "Good morning, Alex. You have 3 unread emails — one critical alert from the monitoring system. Your first meeting is at 10am with Sarah Chen about the Globex integration. Want me to start with the alert?"
That's the difference between a chatbot and an assistant.
Key Takeaways
This is chapter 5 of Build Your AI Assistant with Claude.
Get the full hands-on course — free during early access. Build the complete system. Your projects become your portfolio.
View course details