
Companion App

Chat + Citations

From Pipeline to Product

Modules 1-4 built the engine — data ingestion, encoding, retrieval, and the AI gateway. Now we wrap it in a product that sales reps actually want to use. The Companion App is a chat interface with one killer feature: pre-call briefings that pull together everything the system knows about an account.

Key Concepts

Streaming Responses

Users expect AI to start responding immediately, not wait 10 seconds for a complete answer. Streaming sends tokens as they're generated:

User: "Brief me on Acme Corp"

[200ms] "Acme Corp is a "
[250ms] "mid-market SaaS company "
[300ms] "currently in the negotiation stage..."

Technically, this is Server-Sent Events (SSE) — the server pushes chunks to the client over a single HTTP connection. The client renders each chunk as it arrives, creating the typewriter effect.
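A minimal sketch of the wire format (the `toSSE`/`parseSSE` helpers are illustrative, not from the course code): each token is framed as a `data: …` line followed by a blank line, and the client splits the stream on blank lines to recover tokens.

```typescript
// Frame one token as an SSE event: "data: <json>\n\n".
function toSSE(token: string): string {
  return `data: ${JSON.stringify({ token })}\n\n`;
}

// Client side: split a buffered SSE stream back into tokens.
function parseSSE(buffer: string): string[] {
  return buffer
    .split("\n\n")                          // events are separated by a blank line
    .filter((e) => e.startsWith("data: "))
    .map((e) => JSON.parse(e.slice(6)).token);
}

// Simulate streaming a briefing answer in two chunks.
const chunks = ["Acme Corp is a ", "mid-market SaaS company"];
const wire = chunks.map(toSSE).join("");
console.log(parseSSE(wire).join("")); // → "Acme Corp is a mid-market SaaS company"
```

In a real app the server writes each `toSSE` frame to the response as the model emits tokens, and the browser renders each parsed token as it arrives.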

The Briefing Agent (LangGraph.js)

A pre-call briefing needs data from multiple sources — CRM, transcripts, tickets, competitors — and each search is independent. Running them sequentially is slow. LangGraph.js enables a fan-out pattern:

                  ┌──→ Fetch CRM Data ────────┐
                  │                           │
Parse Request ──→ ├──→ Fetch Call History ────┤──→ Synthesize ──→ Format
                  │                           │
                  ├──→ Fetch Support Tickets ─┤
                  │                           │
                  └──→ Fetch Competitor Intel ┘

All four fetches run in parallel. The synthesize node waits for all of them, then combines the results into a coherent briefing. What would take 8 seconds sequentially takes 2 seconds with fan-out.
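LangGraph.js declares the fan-out as graph edges, but the underlying concurrency is equivalent to `Promise.all` over the independent fetches. A minimal sketch with stubbed fetchers (the fetcher names and return values are illustrative):

```typescript
// Stubbed fetchers — in the real graph these hit CRM, transcripts, tickets, intel.
type Fetcher = (account: string) => Promise<string>;

const fetchCRM: Fetcher         = async (a) => `${a}: mid-market SaaS, negotiation stage`;
const fetchCalls: Fetcher       = async (a) => `${a}: Q4 review call on 2024-02-28`;
const fetchTickets: Fetcher     = async (a) => `${a}: open ticket #1847 (API latency)`;
const fetchCompetitors: Fetcher = async (a) => `${a}: evaluating two rival vendors`;

// Fan-out: all four fetches start at once; "synthesize" waits for all of them.
async function brief(account: string): Promise<string> {
  const results = await Promise.all([
    fetchCRM(account),
    fetchCalls(account),
    fetchTickets(account),
    fetchCompetitors(account),
  ]);
  return results.join("\n"); // stand-in for the synthesize step
}
```

The total latency is the slowest fetch, not the sum of all four — which is where the 8-seconds-to-2-seconds improvement comes from.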

Graph Composition

The briefing agent graph composes with the gateway graph from Module 4. When a user asks for a briefing:

  • The gateway classifies it as a "briefing" request
  • Routes to the briefing agent graph (instead of simple retrieval)
  • The briefing graph fans out, retrieves, synthesizes
  • The result flows back through the gateway's guardrails and tracking

This is the power of graph-based orchestration — complex workflows compose without becoming spaghetti code.
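The routing step can be sketched as follows. Everything here is illustrative — the function names, the keyword-based classifier, and the guardrail stub are assumptions, not the course API:

```typescript
type RequestKind = "briefing" | "retrieval";

// Stand-ins for the composed sub-graphs and the gateway's output guardrails.
const runBriefingGraph  = async (m: string) => `[briefing for: ${m}]`;
const runRetrievalGraph = async (m: string) => `[answer for: ${m}]`;
const applyGuardrails   = (s: string) => s; // e.g. PII scrubbing, policy checks

function classify(message: string): RequestKind {
  // A real gateway would use an LLM classifier; a keyword match stands in here.
  return /\bbrief(ing)?\b/i.test(message) ? "briefing" : "retrieval";
}

async function handle(message: string): Promise<string> {
  const kind = classify(message);
  // Each branch is itself a graph; the gateway treats it as a single node.
  const raw = kind === "briefing"
    ? await runBriefingGraph(message)   // fan-out → synthesize → format
    : await runRetrievalGraph(message); // simple retrieve → answer
  return applyGuardrails(raw);          // result flows back through the gateway
}
```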

Source Citations

Enterprise users don't trust AI that says "trust me." Every claim in a response must cite its source:

Acme's main concern is the integration timeline [1].
Their VP of Engineering, Jane Smith, mentioned needing
completion before Q2 board meeting [2]. There's also an
open support ticket about API latency [3].

Sources:
[1] CRM - Acme Corp (updated 2024-03-15)
[2] Call Transcript - Acme Q4 Review (2024-02-28)
[3] Support Ticket #1847 (2024-03-01)

Implementation: the context assembly step (Module 3) tags each chunk with a source reference. The AI prompt instructs the model to cite these references. Post-processing maps citation numbers to full source details.
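The post-processing step might look like this — a sketch under the assumption that sources carry numeric IDs matching the model's `[n]` markers (the `attachSources` name and `Source` shape are hypothetical):

```typescript
interface Source { id: number; label: string }

// Find which [n] markers the model actually used, and append
// a Sources list containing only those entries.
function attachSources(answer: string, sources: Source[]): string {
  const used = new Set<number>();
  for (const m of answer.matchAll(/\[(\d+)\]/g)) used.add(Number(m[1]));
  const list = sources
    .filter((s) => used.has(s.id))
    .map((s) => `[${s.id}] ${s.label}`)
    .join("\n");
  return `${answer}\n\nSources:\n${list}`;
}
```

Filtering to only the cited IDs matters: the retriever may pull a dozen chunks, but listing uncited sources under the answer would imply support the model never used.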

Confidence Scores

Not all AI responses are equally reliable. Confidence scoring signals to the user when to trust and when to verify:

  • High confidence — multiple sources agree, recent data, exact matches
  • Medium confidence — single source, some inference required
  • Low confidence — no direct sources, mostly AI reasoning

Confidence is computed by analyzing how many source chunks were used, how similar they were to the query, how recent the data is, and whether multiple sources corroborate the answer.
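Those four signals can be folded into a tier like so. The thresholds below are illustrative placeholders, not values from the course — a real system would tune them against user feedback:

```typescript
interface Evidence {
  chunksUsed: number;    // how many source chunks fed the answer
  avgSimilarity: number; // 0..1 similarity between chunks and query
  daysOld: number;       // age of the freshest source
  corroborated: boolean; // do multiple sources agree?
}

function confidence(e: Evidence): "high" | "medium" | "low" {
  if (e.chunksUsed >= 2 && e.avgSimilarity >= 0.8 && e.daysOld <= 30 && e.corroborated) {
    return "high";   // multiple recent, closely matching, agreeing sources
  }
  if (e.chunksUsed >= 1 && e.avgSimilarity >= 0.5) {
    return "medium"; // single source, some inference required
  }
  return "low";      // no direct sources: mostly model reasoning
}
```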

Graph Execution Trace

For debugging and transparency, every graph execution produces a trace:

{
  "trace_id": "brief-acme-001",
  "nodes_executed": ["parse", "fetch_crm", "fetch_calls", "fetch_tickets", "fetch_competitors", "synthesize", "format"],
  "duration_ms": 2340,
  "tokens_used": { "input": 3200, "output": 890 },
  "sources_retrieved": 12,
  "sources_cited": 3
}

This trace is invaluable for debugging ("why did it miss that ticket?"), optimization ("which fetch is slowest?"), and auditing ("what data did it access?").
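A minimal sketch of how such a trace might be accumulated — a generic wrapper with illustrative names, not LangGraph.js's own tracing machinery:

```typescript
interface Trace {
  trace_id: string;
  nodes_executed: string[];
  duration_ms: number;
}

// Run each node in turn, recording its name; time the whole execution.
async function runTraced(
  traceId: string,
  nodes: Array<[name: string, fn: () => Promise<void>]>,
): Promise<Trace> {
  const start = Date.now();
  const executed: string[] = [];
  for (const [name, fn] of nodes) {
    await fn();
    executed.push(name);
  }
  return { trace_id: traceId, nodes_executed: executed, duration_ms: Date.now() - start };
}
```

Per-node timings (rather than one total) would answer "which fetch is slowest?" directly; the same wrapper just timestamps around each `fn()` call.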

What You'll Build

  • Next.js chat interface with SSE streaming
  • LangGraph briefing agent with parallel data fetching
  • Inline source citations with confidence scores
  • Graph execution trace for debugging

Glossary

Term                 Meaning
SSE                  Server-Sent Events — server pushes data to client over HTTP
Fan-out              Running multiple independent operations in parallel
Briefing agent       A graph that gathers multi-source data into one summary
Citation             A reference linking an AI claim to its source data
Confidence score     A signal of how well-supported an AI response is
Execution trace      A log of every step the graph executed and its timing
Graph composition    Nesting one graph inside another as a single node

This is chapter 5 of AI Sales Companion.

Get the full hands-on course for $100 and build the complete system. Your projects become your portfolio.

View course details