
Retrieval System

Hybrid Search with Financial Awareness

Why Financial Retrieval Is Hard

General-purpose RAG retrieval treats every query the same: embed the question, find similar chunks, return the top K. For financial data, this approach fails in three critical ways.

Problem 1: Exact Financial Terms

A query for "NVDA gross margin Q3 2024" requires exact matching on the ticker (NVDA), the metric (gross margin), and the period (Q3 2024). Semantic search might return AMD's margins or NVDA's Q2 data because the embeddings are similar. You need keyword search and structured filters alongside vector similarity.

Problem 2: Multi-Company Comparisons

"Compare margins across our top 5 competitors" requires retrieving data from 5 different companies, all for the same time period. Pure vector search returns the most semantically similar chunks, which might be 5 chunks from the same company. You need diversity-aware retrieval that ensures each company is represented.

Problem 3: Numerical Reasoning

"Which company has the highest operating margin?" requires extracting numbers from retrieved chunks and comparing them. The retrieval system needs to surface chunks with actual numerical data — not just narrative descriptions of margins.

Hybrid Search Architecture

The retrieval system combines three search strategies:

Semantic Search (Vector Similarity)

Query pgvector using cosine similarity on embeddings. This excels at finding thematically relevant content:

  • "concerns about supply chain" finds earnings call discussions about logistics
  • "competitive pressure" finds MD&A sections discussing market competition
  • "growth opportunities" finds analyst notes discussing catalysts
Keyword Search (Full-Text)

PostgreSQL tsvector/tsquery matching catches exact terms that semantic search might miss:

  • Ticker symbols: NVDA, AAPL, MSFT
  • Financial metrics: "gross margin", "EBITDA", "free cash flow"
  • Filing references: "10-K", "Q3 2024"
  • Executive names: "Jensen Huang", "CFO"

Structured Filters (SQL WHERE)

Direct database queries for structured attributes:

  • ticker = 'NVDA' — specific company
  • fiscal_period = 'Q3-2024' — specific time period
  • filing_type = '10-K' — annual filings only
  • source_type = 'transcript' — earnings calls only
  • is_table = true — financial tables only
  • section_name = 'Risk Factors' — specific filing section

Reciprocal Rank Fusion (RRF)

Combine results from all three strategies using RRF, which is robust to different score scales:

RRF_score(doc) = sum(1 / (k + rank_in_list)) for each list containing doc

With k=60 (the standard value), a document ranked #1 in both the semantic and keyword lists gets a combined score of 1/61 + 1/61 ≈ 0.033, while a document ranked #1 in semantic search but absent from the keyword list gets only about 0.016. This naturally boosts documents that match on multiple dimensions.

Financial tuning: weight keyword search higher (1.2x) for queries containing ticker symbols or specific metric names; weight semantic search higher (1.2x) for thematic queries without explicit tickers.
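The fusion rule above can be sketched in a few lines of Python. This is a minimal illustration under assumed inputs (each strategy returns a ranked list of chunk ids), not a reference implementation; the per-strategy weights correspond to the financial tuning described above.

```python
from collections import defaultdict

def rrf_fuse(ranked_lists, weights=None, k=60):
    """Combine ranked result lists with Reciprocal Rank Fusion.

    ranked_lists: dict mapping strategy name -> list of doc ids, best first.
    weights: optional per-strategy multipliers (e.g. 1.2 for keyword search
    when the query contains a ticker symbol).
    """
    weights = weights or {}
    scores = defaultdict(float)
    for strategy, docs in ranked_lists.items():
        w = weights.get(strategy, 1.0)
        for rank, doc in enumerate(docs, start=1):
            scores[doc] += w / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# A chunk ranked #1 in two lists outscores one ranked #1 in a single list.
fused = rrf_fuse({
    "semantic": ["chunk_a", "chunk_b", "chunk_c"],
    "keyword":  ["chunk_a", "chunk_d"],
})
print(fused[0])  # chunk_a
```

Because RRF only looks at ranks, the cosine distances from pgvector and the text-search ranks from PostgreSQL never need to be normalized against each other.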

Numerical-Aware Reranking

After RRF fusion, apply financial-specific reranking:

Recency Boost

Newer data is more relevant for most financial queries. A Q3 2024 filing should rank above a Q1 2024 filing when no specific period is requested. The decay function:

recency_score = max(0, 1 - days_old / 180)

This decays linearly from full weight for today's data down to zero at 180 days (about six months), after which recency contributes nothing.
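The decay function translates directly into code; a minimal sketch with an assumed 180-day horizon:

```python
def recency_score(days_old: int, horizon_days: int = 180) -> float:
    """Linear decay: 1.0 for data from today, 0.0 at the horizon and beyond."""
    return max(0.0, 1.0 - days_old / horizon_days)

print(recency_score(0))    # 1.0
print(recency_score(90))   # 0.5
print(recency_score(365))  # 0.0
```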

Source Authority

Different sources carry different weight depending on the query type:

Query Type                    | Highest Authority    | Lowest Authority
------------------------------|----------------------|-----------------
Factual (revenue, margins)    | SEC Filings          | Analyst Notes
Outlook (guidance, forecasts) | Earnings Transcripts | Internal Reports
Opinion (ratings, targets)    | Analyst Notes        | SEC Filings
Competitive (positioning)     | Internal Reports     | Market Data
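One way to encode the table is a lookup keyed by query type and source type. The numeric weights below are illustrative assumptions, not values from the course; only the relative ordering follows the table.

```python
# Illustrative authority weights per (query_type, source_type); higher = more trusted.
# The orderings mirror the table above; the specific numbers are assumptions.
AUTHORITY = {
    "factual":     {"sec_filing": 1.0, "transcript": 0.8, "internal": 0.6, "analyst_note": 0.4},
    "outlook":     {"transcript": 1.0, "analyst_note": 0.8, "sec_filing": 0.6, "internal": 0.4},
    "opinion":     {"analyst_note": 1.0, "internal": 0.8, "transcript": 0.6, "sec_filing": 0.4},
    "competitive": {"internal": 1.0, "analyst_note": 0.8, "sec_filing": 0.6, "market_data": 0.4},
}

def authority_weight(query_type: str, source_type: str) -> float:
    """Look up a source's authority for a query type; 0.5 when unknown."""
    return AUTHORITY.get(query_type, {}).get(source_type, 0.5)

print(authority_weight("factual", "sec_filing"))  # 1.0
print(authority_weight("opinion", "sec_filing"))  # 0.4
```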

Numerical Density

For quantitative queries ("compare margins", "revenue growth"), boost chunks with higher numerical density. Count occurrences of patterns like $X.XM, XX.X%, $X,XXX and use them as a ranking signal.
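A density signal can be computed with a simple regex count. The pattern below is a sketch covering dollar amounts and percentages, not an exhaustive financial-number grammar:

```python
import re

# Matches dollar amounts ($18.1B, $1,234) and percentages (34.2%).
NUM_PATTERN = re.compile(r"\$\d[\d,]*(?:\.\d+)?[MBK]?|\d+(?:\.\d+)?%")

def numerical_density(text: str) -> float:
    """Numeric tokens per 100 words, usable as a reranking boost."""
    words = max(1, len(text.split()))
    return 100.0 * len(NUM_PATTERN.findall(text)) / words

table = "Revenue was $18.1B, up 34.2%, with gross margin of 74.0%."
prose = "Management discussed broad demand trends across segments."
print(numerical_density(table) > numerical_density(prose))  # True
```

A chunk pulled from an income-statement table will score far above narrative MD&A text, which is exactly the behavior quantitative queries need.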

Period Matching

If the query mentions a specific period ("Q3 2024"), boost chunks whose fiscal_period metadata matches exactly. This prevents returning Q2 data when Q3 was explicitly requested.
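The period check is a normalization plus an exact comparison. A minimal sketch, assuming the fiscal_period metadata uses the "Q3-2024" format shown in the structured-filter examples:

```python
import re

def period_boost(query: str, chunk_period: str, boost: float = 0.3) -> float:
    """Additive boost when the query names the chunk's fiscal period.

    Normalizes "Q3 2024" and "Q3-2024" to the same key before comparing.
    The boost magnitude (0.3) is an illustrative assumption.
    """
    m = re.search(r"\bQ([1-4])[\s-](\d{4})\b", query)
    if m and f"Q{m.group(1)}-{m.group(2)}" == chunk_period:
        return boost
    return 0.0

print(period_boost("NVDA gross margin Q3 2024", "Q3-2024"))  # 0.3
print(period_boost("NVDA gross margin Q3 2024", "Q2-2024"))  # 0.0
```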

Diversity Enforcement

For comparison queries, ensure the top results include data from multiple companies. If the top 5 results are all NVDA, demote duplicates and promote other tickers until at least 3 companies are represented.
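The demote-and-promote loop can be sketched as a post-processing pass over the fused ranking. This is one possible implementation of the rule above, assuming each result carries its ticker:

```python
def enforce_diversity(ranked, top_n=5, min_companies=3):
    """Reorder so the top_n results cover at least min_companies tickers.

    ranked: list of (ticker, chunk_id) tuples, best first.
    """
    head, tail = list(ranked[:top_n]), list(ranked[top_n:])
    seen = {t for t, _ in head}
    while len(seen) < min_companies and tail:
        # Promote the best-ranked chunk from a company not yet in the head.
        idx = next((i for i, (t, _) in enumerate(tail) if t not in seen), None)
        if idx is None:
            break
        promoted = tail.pop(idx)
        # Demote the lowest-ranked duplicate from the head to make room.
        dup = max(
            (i for i, (t, _) in enumerate(head)
             if sum(1 for u, _ in head if u == t) > 1),
            default=len(head) - 1,
        )
        demoted = head.pop(dup)
        head.append(promoted)
        tail.insert(0, demoted)
        seen = {t for t, _ in head}
    return head + tail

results = [("NVDA", 1), ("NVDA", 2), ("NVDA", 3), ("NVDA", 4), ("NVDA", 5),
           ("AMD", 6), ("INTC", 7)]
top5 = enforce_diversity(results)[:5]
print(len({t for t, _ in top5}))  # 3
```

Demoted chunks are pushed to the front of the tail rather than discarded, so they remain available if the context budget allows.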

Context Assembly

The final step before passing data to the LLM: assemble a context window with:

  • Financial tables first — structured data the LLM can reason about
  • Supporting narrative — MD&A commentary that explains the numbers
  • Analyst perspective — external viewpoints on the data
  • Source attribution — every piece tagged with [Ticker | Period | Source | Section]

Budget management matters: with a 128K context window, you could include everything. Don't. More context means more noise and higher cost. Target 4,000-6,000 tokens of focused, relevant financial data.
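The assembly step can be sketched as an ordering plus a budget cut. The chunk schema and the words-times-1.3 token estimate are assumptions for illustration; a real system would use its tokenizer's counts.

```python
def assemble_context(chunks, budget_tokens=5000):
    """Order chunks (tables, then narrative, then analyst notes) and trim
    to a token budget, tagging each chunk with its source attribution.

    chunks: dicts with keys kind, ticker, period, source, section, text.
    """
    order = {"table": 0, "narrative": 1, "analyst": 2}
    ranked = sorted(chunks, key=lambda c: order.get(c["kind"], 3))
    parts, used = [], 0
    for c in ranked:
        tokens = int(len(c["text"].split()) * 1.3)  # rough token estimate
        if used + tokens > budget_tokens:
            continue  # skip chunks that would blow the budget
        tag = f"[{c['ticker']} | {c['period']} | {c['source']} | {c['section']}]"
        parts.append(f"{tag}\n{c['text']}")
        used += tokens
    return "\n\n".join(parts)

ctx = assemble_context([
    {"kind": "narrative", "ticker": "NVDA", "period": "Q3-2024",
     "source": "10-Q", "section": "MD&A", "text": "Margins expanded on mix."},
    {"kind": "table", "ticker": "NVDA", "period": "Q3-2024",
     "source": "10-Q", "section": "Income Statement", "text": "Revenue $18.1B"},
])
print(ctx.startswith("[NVDA | Q3-2024 | 10-Q | Income Statement]"))  # True
```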

Test Queries

Build your retrieval system to handle these representative queries:

  • "Compare Q3 2024 gross margins for NVDA, AMD, and INTC"
  • "What did NVDA's CEO say about AI demand in the latest earnings call?"
  • "Which companies have price targets above current trading price?"
  • "Show me revenue trends for the semiconductor sector over the last 4 quarters"

This is chapter 3 of AI Finance Analyst. Get the full hands-on course for $100 and build the complete system. Your projects become your portfolio.

View course details