Shared Memory

Giving Your Agents a Shared Brain

The Memory Problem

In Module 2, the supervisor passes context between agents — but only what the StateManager explicitly includes. This creates blind spots. The researcher notices the competitor pricing data looks outdated. The analyst needs that observation but never sees it because it wasn't part of the task output. The writer drafts a memo citing stale numbers.

Shared memory solves this by giving agents multiple channels for different types of information. Not everything is a task result. Not everything is a formal deliverable. Some information is informal, cross-cutting, and time-sensitive.

Three Memory Tiers

A multi-agent system needs three distinct memory tiers, each optimized for a different kind of information:

Tier 1: Conversation Store

The full message history — every communication between agents and the supervisor. Think of it as the system's chat log.

// Messages support multi-axis retrieval
store.getByAgent("researcher")     // All messages involving the researcher
store.getByTask("task-3")          // All messages about a specific task
store.getRecent(10)                // Last 10 messages (for context windows)
store.search("NovaTech pricing")   // Keyword search across all messages

The conversation store is bounded at 1000 messages. When the cap is hit, the oldest messages are trimmed from the live store and archived to persistent storage. This is a sliding-window pattern: the most recent context is always at hand, and older history remains recoverable from the archive.
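To make the sliding window concrete, here is a minimal in-memory sketch of such a store. The `Message` fields, the `add` method, and the `archive` array (standing in for persistent storage) are assumptions for illustration; only the four retrieval method names come from the snippet above.

```typescript
// Hypothetical message shape; the field names are assumptions, not the course's schema.
interface Message {
  from: string;
  to: string;
  taskId: string;
  content: string;
}

class ConversationStore {
  private messages: Message[] = [];
  // Trimmed messages land here; a stand-in for real persistent storage.
  readonly archive: Message[] = [];

  constructor(private readonly maxMessages = 1000) {}

  add(message: Message): void {
    this.messages.push(message);
    // Sliding window: once over the cap, move the oldest messages to the archive.
    const overflow = this.messages.length - this.maxMessages;
    if (overflow > 0) {
      this.archive.push(...this.messages.splice(0, overflow));
    }
  }

  // All live messages sent or received by the given agent.
  getByAgent(agent: string): Message[] {
    return this.messages.filter((m) => m.from === agent || m.to === agent);
  }

  getByTask(taskId: string): Message[] {
    return this.messages.filter((m) => m.taskId === taskId);
  }

  // Last n live messages, oldest first.
  getRecent(n: number): Message[] {
    return n > 0 ? this.messages.slice(-n) : [];
  }

  // Case-insensitive substring match on message content.
  search(query: string): Message[] {
    const needle = query.toLowerCase();
    return this.messages.filter((m) => m.content.toLowerCase().includes(needle));
  }
}
```

Every retrieval method reads only the live window, so a prompt built from `getRecent` can never exceed the cap no matter how long the conversation has run.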

Tier 2: Artifact Store

The tangible outputs of agent work — reports, memos, analyses, code, charts. Unlike messages (which are communication), artifacts are deliverables that other agents build upon.

Key features:

Versioning — When the writer revises a memo after analyst feedback, the store tracks version 1 and version 2. The assembler uses the latest; the audit trail has both.

Multi-axis retrieval — By task, by agent, by type. The assembler calls getByType("memo") to collect all memos for the final output. The reviewer calls getByTask("task-3") to see everything produced for one sub-task.

Typed structure — Artifact types are an enum (report, memo, analysis, code, chart, summary). The assembler knows exactly what types to expect and how to handle each.
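The three features above can be sketched together in a small in-memory store. This is an illustration under assumptions: the `Artifact` fields and the `save` and `getHistory` methods are hypothetical, and the artifact types are modeled as a string-literal union rather than an `enum` declaration.

```typescript
// The six artifact types named in the text, modeled as a string-literal union.
type ArtifactType = "report" | "memo" | "analysis" | "code" | "chart" | "summary";

// Hypothetical record shape; the field names are assumptions.
interface Artifact {
  id: string;
  taskId: string;
  agent: string;
  type: ArtifactType;
  content: string;
  version: number;
}

class ArtifactStore {
  // Every version ever saved, in order: the audit trail.
  private history: Artifact[] = [];
  // Latest version of each artifact, keyed by id: what assembly reads.
  private latest = new Map<string, Artifact>();

  // Saving under an existing id creates a new version instead of overwriting.
  save(draft: Omit<Artifact, "version">): Artifact {
    const previous = this.latest.get(draft.id);
    const artifact: Artifact = { ...draft, version: (previous?.version ?? 0) + 1 };
    this.history.push(artifact);
    this.latest.set(draft.id, artifact);
    return artifact;
  }

  // All versions of one artifact, oldest first.
  getHistory(id: string): Artifact[] {
    return this.history.filter((a) => a.id === id);
  }

  // The retrieval axes below return only the latest version of each artifact.
  getByType(type: ArtifactType): Artifact[] {
    return Array.from(this.latest.values()).filter((a) => a.type === type);
  }

  getByTask(taskId: string): Artifact[] {
    return Array.from(this.latest.values()).filter((a) => a.taskId === taskId);
  }

  getByAgent(agent: string): Artifact[] {
    return Array.from(this.latest.values()).filter((a) => a.agent === agent);
  }
}
```

Keeping the audit trail separate from the latest-version index means the assembler never has to reason about versions, while a reviewer can still call `getHistory` to see how a memo changed.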

Tier 3: Scratchpad

Informal notes and intermediate results. The researcher leaves a note: "The Q4 market size data in report MR-012 contradicts MR-023. May be using different definitions of 'addressable market'." The analyst sees this and adjusts its calculations.

// Write a tagged note; it expires after the scratchpad's configured TTL
scratchpad.write("pricing-data-quality", {
  observation: "Competitor pricing in CP-015 may be outdated — listed as Q3 2024",
  recommendation: "Cross-reference with web search before citing"
}, "researcher", ["pricing", "data-quality"]);

// Another agent reads by tag
const pricingNotes = scratchpad.readByTag("pricing");

TTL eviction is the scratchpad's critical feature. Every entry expires after a configured duration (default 30 minutes). This prevents stale observations from poisoning future decisions. An observation about "current sprint priorities" from 2 hours ago is probably wrong.
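A minimal sketch of TTL eviction follows, matching the `write(key, value, author, tags)` and `readByTag(tag)` calls shown above. The constructor parameters, the injectable `now` clock (added so expiry can be tested without waiting), and the choice to evict lazily on each read and write are assumptions, not the course's implementation.

```typescript
interface ScratchpadEntry {
  key: string;
  value: unknown;
  author: string;
  tags: string[];
  expiresAt: number; // epoch milliseconds
}

class Scratchpad {
  private entries = new Map<string, ScratchpadEntry>();

  constructor(
    private readonly ttlMs = 30 * 60 * 1000, // default TTL from the text: 30 minutes
    private readonly maxEntries = 100,
    private readonly now: () => number = () => Date.now(), // injectable clock (assumption)
  ) {}

  write(key: string, value: unknown, author: string, tags: string[] = []): void {
    this.evictExpired();
    // Delete first so that rewriting a key moves it to the end of insertion order.
    this.entries.delete(key);
    this.entries.set(key, { key, value, author, tags, expiresAt: this.now() + this.ttlMs });
    // A Map iterates in insertion order, so its first key is the oldest write.
    while (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest === undefined) break;
      this.entries.delete(oldest);
    }
  }

  // Returns only entries that carry the tag and have not yet expired.
  readByTag(tag: string): ScratchpadEntry[] {
    this.evictExpired();
    return Array.from(this.entries.values()).filter((e) => e.tags.includes(tag));
  }

  private evictExpired(): void {
    const now = this.now();
    for (const [key, entry] of Array.from(this.entries)) {
      if (entry.expiresAt <= now) this.entries.delete(key);
    }
  }
}
```

Lazy eviction avoids a background timer: stale notes are dropped the next time any agent touches the scratchpad, which is the only moment staleness can do harm.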

Context Assembly

The StateManager's most important job is assembling the right context for each agent. Not everything in memory — that would blow the context window. Just what's relevant.

For a writer about to draft a memo:

Original query (always)

Researcher's findings (from artifact store — reports)

Analyst's conclusions (from artifact store — analyses)

Recent scratchpad notes tagged "writing" or "quality" (from scratchpad)

Last 5 messages involving the writer (from conversation store)

For an analyst about to benchmark:

Original query (always)

Researcher's raw data (from artifact store — reports)

Scratchpad notes tagged "data-quality" or "methodology" (from scratchpad)

Last 5 messages involving the analyst (from conversation store)

This is information scoping — each agent sees a tailored view of the shared memory, not the entire thing. This keeps prompts focused and token usage under control.
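The two lists above amount to a per-role recipe for what to pull from each tier. A minimal sketch, assuming a hypothetical `ContextSources` view over the three stores and a hypothetical `assembleContext` function (none of these names come from the course's StateManager):

```typescript
// Minimal read-only view over the three memory tiers; all names here are assumptions.
interface ContextSources {
  originalQuery: string;
  artifactsByType(type: string): string[];
  notesByTag(tag: string): string[];
  recentMessagesFor(agent: string, limit: number): string[];
}

// What one role pulls from shared memory.
interface ContextRecipe {
  artifactTypes: string[];
  noteTags: string[];
  messageLimit: number;
}

// Recipes for the two roles described in the text.
const recipes = new Map<string, ContextRecipe>([
  ["writer", { artifactTypes: ["report", "analysis"], noteTags: ["writing", "quality"], messageLimit: 5 }],
  ["analyst", { artifactTypes: ["report"], noteTags: ["data-quality", "methodology"], messageLimit: 5 }],
]);

function assembleContext(agent: string, sources: ContextSources): string[] {
  const recipe = recipes.get(agent);
  if (!recipe) throw new Error(`No context recipe for agent "${agent}"`);

  const artifacts = recipe.artifactTypes.flatMap((type) => sources.artifactsByType(type));
  // A note carrying two matching tags should appear only once.
  const notes = Array.from(new Set(recipe.noteTags.flatMap((tag) => sources.notesByTag(tag))));

  return [
    `Query: ${sources.originalQuery}`, // the original query is always included
    ...artifacts,
    ...notes,
    ...sources.recentMessagesFor(agent, recipe.messageLimit),
  ];
}
```

Expressing scoping as data rather than branching code makes it cheap to add a role: a new agent needs only a new recipe entry.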

Bounded Memory Is Non-Negotiable

Every memory tier is bounded in some way:

Conversation store: 1000 messages

Scratchpad: 100 entries

Artifact store: no hard cap, but versioned (only latest matters for assembly)

Unbounded memory is a production outage waiting to happen. If each of 4 agents sends at least one message per turn, a conversation that runs for 200 turns generates 800+ messages. If you naively include all of them in every prompt, you'll hit context window limits, degrade output quality (too much noise), and blow your token budget.

The pattern: cap → evict → summarize. Cap the store at a fixed size. Evict the oldest or least-relevant entries. If you may need the evicted information later, summarize it into a compact form at eviction time and store the summary, because once an entry is gone it can no longer be summarized.
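One way to sketch cap, evict, summarize in a single structure is a bounded log that folds each evicted entry into a running summary. `BoundedLog`, its `summarize` callback, and `truncatingSummarizer` are hypothetical names; the placeholder summarizer simply truncates text where a real system might call a model.

```typescript
class BoundedLog {
  private entries: string[] = [];
  private summary = "";

  constructor(
    private readonly cap: number,
    // Hypothetical summarizer hook: receives the previous summary and the evicted entries.
    private readonly summarize: (previousSummary: string, evicted: string[]) => string,
  ) {}

  append(entry: string): void {
    this.entries.push(entry);
    const overflow = this.entries.length - this.cap;
    if (overflow > 0) {
      // Evict the oldest entries, then fold them into the summary before they are lost.
      const evicted = this.entries.splice(0, overflow);
      this.summary = this.summarize(this.summary, evicted);
    }
  }

  // Compact summary of evicted history plus the live window.
  snapshot(): { summary: string; recent: string[] } {
    return { summary: this.summary, recent: [...this.entries] };
  }
}

// Placeholder summarizer (assumption): keeps the first 40 characters of each evicted entry.
const truncatingSummarizer = (previous: string, evicted: string[]): string =>
  [previous, ...evicted.map((e) => e.slice(0, 40))].filter(Boolean).join(" | ");
```

This sketch evicts strictly by age. A least-relevant policy would need a scoring function in place of `splice(0, overflow)`.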

Memory Summarization

As an orchestration grows complex, even 10 recent messages may be too much context. Memory summarization compresses the history:

Before summarization (600 tokens):

[researcher → supervisor] Found 3 reports on NovaTech...
[supervisor → analyst] Assigned "Market Analysis" to analyst...
[analyst → supervisor] Completed analysis, 12.3% market share...
[supervisor → writer] Assigned "Draft Memo" to writer...
(6 more messages)

After summarization (150 tokens):

Research complete: 3 NovaTech reports found, sub-50ms latency claim, no SOC 2.
Analysis complete: 12.3% market share, 23% YoY growth, NovaTech leads at 24.5%.
Writing assigned: executive memo in progress.

The summarization is audience-aware — the writer's summary emphasizes findings and recommendations, while the analyst's summary emphasizes data points and methodology. This keeps each agent focused on what matters to its task.
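As a rough illustration of audience-aware selection, the sketch below ranks pre-labeled facts by what each audience emphasizes. The `Fact` shape, the `FactKind` labels, and `summarizeFor` are assumptions; a real summarizer would more likely prompt a model with audience-specific instructions than filter labeled facts.

```typescript
type Audience = "writer" | "analyst";
type FactKind = "finding" | "recommendation" | "data-point" | "methodology";

// Hypothetical unit of history: one sentence labeled with its kind.
interface Fact {
  kind: FactKind;
  text: string;
}

// Which kinds each audience's summary emphasizes, following the paragraph above.
const emphasis: Record<Audience, FactKind[]> = {
  writer: ["finding", "recommendation"],
  analyst: ["data-point", "methodology"],
};

function summarizeFor(audience: Audience, facts: Fact[], maxFacts = 3): string {
  const preferred = emphasis[audience];
  // Emphasized kinds come first; sort is stable, so original order is kept within each group.
  const ranked = [...facts].sort(
    (a, b) => Number(preferred.includes(b.kind)) - Number(preferred.includes(a.kind)),
  );
  return ranked
    .slice(0, maxFacts)
    .map((f) => f.text)
    .join(" ");
}
```

The same history yields different summaries per agent: facts outside an audience's emphasis are kept only if room remains under `maxFacts`.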

This is chapter 3 of Multi-Agent Orchestration.
