
Companion App

Chat + Citations

From Pipeline to Product

Modules 1-4 built the engine — data ingestion, encoding, retrieval, and the AI gateway. Now we wrap it in a product that sales reps actually want to use. The Companion App is a chat interface with one killer feature: pre-call briefings that pull together everything the system knows about an account.

Key Concepts

Streaming Responses

Users expect AI to start responding immediately, not wait 10 seconds for a complete answer. Streaming sends tokens as they're generated:

User: "Brief me on Acme Corp"

[200ms] "Acme Corp is a "
[250ms] "mid-market SaaS company "
[300ms] "currently in the negotiation stage..."

Technically, this is Server-Sent Events (SSE) — the server pushes chunks to the client over a single HTTP connection. The client renders each chunk as it arrives, creating the typewriter effect.
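A minimal sketch of the wire format (the `toSSE`/`parseSSE` helpers are illustrative, not from the course code): each token is framed as a `data: …` line followed by a blank line, and the client splits the stream on blank lines to recover tokens.

```typescript
// Frame one token as an SSE event: "data: <json>\n\n".
function toSSE(token: string): string {
  return `data: ${JSON.stringify({ token })}\n\n`;
}

// Client side: split a buffered SSE stream back into tokens.
function parseSSE(buffer: string): string[] {
  return buffer
    .split("\n\n")                          // events are separated by a blank line
    .filter((e) => e.startsWith("data: "))
    .map((e) => JSON.parse(e.slice(6)).token);
}

// Simulate streaming a briefing answer in two chunks.
const chunks = ["Acme Corp is a ", "mid-market SaaS company"];
const wire = chunks.map(toSSE).join("");
console.log(parseSSE(wire).join("")); // → "Acme Corp is a mid-market SaaS company"
```

In a real app the server writes each `toSSE` frame to the response as the model emits tokens, and the browser renders each parsed token as it arrives.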

The Briefing Agent (LangGraph.js)

A pre-call briefing needs data from multiple sources — CRM, transcripts, tickets, competitors — and each search is independent. Running them sequentially is slow. LangGraph.js enables a fan-out pattern:

                  ┌──→ Fetch CRM Data ────────┐
                  │                           │
Parse Request ──→ ├──→ Fetch Call History ────┤──→ Synthesize ──→ Format
                  │                           │
                  ├──→ Fetch Support Tickets ─┤
                  │                           │
                  └──→ Fetch Competitor Intel ┘

All four fetches run in parallel. The synthesize node waits for all of them, then combines the results into a coherent briefing. What would take 8 seconds sequentially takes 2 seconds with fan-out.
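LangGraph.js declares the fan-out as graph edges, but the underlying concurrency is equivalent to `Promise.all` over the independent fetches. A minimal sketch with stubbed fetchers (the fetcher names and return values are illustrative):

```typescript
// Stubbed fetchers — in the real graph these hit CRM, transcripts, tickets, intel.
type Fetcher = (account: string) => Promise<string>;

const fetchCRM: Fetcher         = async (a) => `${a}: mid-market SaaS, negotiation stage`;
const fetchCalls: Fetcher       = async (a) => `${a}: Q4 review call on 2024-02-28`;
const fetchTickets: Fetcher     = async (a) => `${a}: open ticket #1847 (API latency)`;
const fetchCompetitors: Fetcher = async (a) => `${a}: evaluating two rival vendors`;

// Fan-out: all four fetches start at once; "synthesize" waits for all of them.
async function brief(account: string): Promise<string> {
  const results = await Promise.all([
    fetchCRM(account),
    fetchCalls(account),
    fetchTickets(account),
    fetchCompetitors(account),
  ]);
  return results.join("\n"); // stand-in for the synthesize step
}
```

The total latency is the slowest fetch, not the sum of all four — which is where the 8-seconds-to-2-seconds improvement comes from.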

Graph Composition

The briefing agent graph composes with the gateway graph from Module 4. When a user asks for a briefing:

  • The gateway classifies it as a "briefing" request
  • Routes to the briefing agent graph (instead of simple retrieval)
  • The briefing graph fans out, retrieves, synthesizes
  • The result flows back through the gateway's guardrails and tracking

This is the power of graph-based orchestration — complex workflows compose without becoming spaghetti code.
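The routing step can be sketched as follows. Everything here is illustrative — the function names, the keyword-based classifier, and the guardrail stub are assumptions, not the course API:

```typescript
type RequestKind = "briefing" | "retrieval";

// Stand-ins for the composed sub-graphs and the gateway's output guardrails.
const runBriefingGraph  = async (m: string) => `[briefing for: ${m}]`;
const runRetrievalGraph = async (m: string) => `[answer for: ${m}]`;
const applyGuardrails   = (s: string) => s; // e.g. PII scrubbing, policy checks

function classify(message: string): RequestKind {
  // A real gateway would use an LLM classifier; a keyword match stands in here.
  return /\bbrief(ing)?\b/i.test(message) ? "briefing" : "retrieval";
}

async function handle(message: string): Promise<string> {
  const kind = classify(message);
  // Each branch is itself a graph; the gateway treats it as a single node.
  const raw = kind === "briefing"
    ? await runBriefingGraph(message)   // fan-out → synthesize → format
    : await runRetrievalGraph(message); // simple retrieve → answer
  return applyGuardrails(raw);          // result flows back through the gateway
}
```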

Source Citations

Enterprise users don't trust AI that says "trust me." Every claim in a response must cite its source:

Acme's main concern is the integration timeline [1].
Their VP of Engineering, Jane Smith, mentioned needing
completion before Q2 board meeting [2]. There's also an
open support ticket about API latency [3].

Sources:
[1] CRM - Acme Corp (updated 2024-03-15)
[2] Call Transcript - Acme Q4 Review (2024-02-28)
[3] Support Ticket #1847 (2024-03-01)

Implementation: the context assembly step (Module 3) tags each chunk with a source reference. The AI prompt instructs the model to cite these references. Post-processing maps citation numbers to full source details.
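The post-processing step might look like this — a sketch under the assumption that sources carry numeric IDs matching the model's `[n]` markers (the `attachSources` name and `Source` shape are hypothetical):

```typescript
interface Source { id: number; label: string }

// Find which [n] markers the model actually used, and append
// a Sources list containing only those entries.
function attachSources(answer: string, sources: Source[]): string {
  const used = new Set<number>();
  for (const m of answer.matchAll(/\[(\d+)\]/g)) used.add(Number(m[1]));
  const list = sources
    .filter((s) => used.has(s.id))
    .map((s) => `[${s.id}] ${s.label}`)
    .join("\n");
  return `${answer}\n\nSources:\n${list}`;
}
```

Filtering to only the cited IDs matters: the retriever may pull a dozen chunks, but listing uncited sources under the answer would imply support the model never used.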

Confidence Scores

Not all AI responses are equally reliable. Confidence scoring signals to the user when to trust and when to verify:

  • High confidence — multiple sources agree, recent data, exact matches
  • Medium confidence — single source, some inference required
  • Low confidence — no direct sources, mostly AI reasoning

Confidence is computed by analyzing how many source chunks were used, how similar they were to the query, how recent the data is, and whether multiple sources corroborate the answer.
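Those four signals can be folded into a tier like so. The thresholds below are illustrative placeholders, not values from the course — a real system would tune them against user feedback:

```typescript
interface Evidence {
  chunksUsed: number;    // how many source chunks fed the answer
  avgSimilarity: number; // 0..1 similarity between chunks and query
  daysOld: number;       // age of the freshest source
  corroborated: boolean; // do multiple sources agree?
}

function confidence(e: Evidence): "high" | "medium" | "low" {
  if (e.chunksUsed >= 2 && e.avgSimilarity >= 0.8 && e.daysOld <= 30 && e.corroborated) {
    return "high";   // multiple recent, closely matching, agreeing sources
  }
  if (e.chunksUsed >= 1 && e.avgSimilarity >= 0.5) {
    return "medium"; // single source, some inference required
  }
  return "low";      // no direct sources: mostly model reasoning
}
```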

Graph Execution Trace

For debugging and transparency, every graph execution produces a trace:

{
  "trace_id": "brief-acme-001",
  "nodes_executed": ["parse", "fetch_crm", "fetch_calls", "fetch_tickets", "fetch_competitors", "synthesize", "format"],
  "duration_ms": 2340,
  "tokens_used": { "input": 3200, "output": 890 },
  "sources_retrieved": 12,
  "sources_cited": 3
}

This trace is invaluable for debugging ("why did it miss that ticket?"), optimization ("which fetch is slowest?"), and auditing ("what data did it access?").
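A minimal sketch of how such a trace might be accumulated — a generic wrapper with illustrative names, not LangGraph.js's own tracing machinery:

```typescript
interface Trace {
  trace_id: string;
  nodes_executed: string[];
  duration_ms: number;
}

// Run each node in turn, recording its name; time the whole execution.
async function runTraced(
  traceId: string,
  nodes: Array<[name: string, fn: () => Promise<void>]>,
): Promise<Trace> {
  const start = Date.now();
  const executed: string[] = [];
  for (const [name, fn] of nodes) {
    await fn();
    executed.push(name);
  }
  return { trace_id: traceId, nodes_executed: executed, duration_ms: Date.now() - start };
}
```

Per-node timings (rather than one total) would answer "which fetch is slowest?" directly; the same wrapper just timestamps around each `fn()` call.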

What You'll Build

  • Next.js chat interface with SSE streaming
  • LangGraph briefing agent with parallel data fetching
  • Inline source citations with confidence scores
  • Graph execution trace for debugging

Glossary

Term                 Meaning
SSE                  Server-Sent Events — server pushes data to client over HTTP
Fan-out              Running multiple independent operations in parallel
Briefing agent       A graph that gathers multi-source data into one summary
Citation             A reference linking an AI claim to its source data
Confidence score     A signal of how well-supported an AI response is
Execution trace      A log of every step the graph executed and its timing
Graph composition    Nesting one graph inside another as a single node

This is chapter 5 of AI Sales Companion.

Get the full hands-on course for $100 and build the complete system. Your projects become your portfolio.

View course details