Retrieval System
Hybrid Search with Financial Awareness
Why Financial Retrieval Is Hard
General-purpose RAG retrieval treats every query the same: embed the question, find similar chunks, return the top K. For financial data, this approach fails in three critical ways.
Problem 1: Exact Financial Terms
A query for "NVDA gross margin Q3 2024" requires exact matching on the ticker (NVDA), the metric (gross margin), and the period (Q3 2024). Semantic search might return AMD's margins or NVDA's Q2 data because the embeddings are similar. You need keyword search and structured filters alongside vector similarity.
Problem 2: Multi-Company Comparisons
"Compare margins across our top 5 competitors" requires retrieving data from 5 different companies, all for the same time period. Pure vector search returns the most semantically similar chunks, which might be 5 chunks from the same company. You need diversity-aware retrieval that ensures each company is represented.
Problem 3: Numerical Reasoning
"Which company has the highest operating margin?" requires extracting numbers from retrieved chunks and comparing them. The retrieval system needs to surface chunks with actual numerical data — not just narrative descriptions of margins.
Hybrid Search Architecture
The retrieval system combines three search strategies:
Semantic Search (Vector Similarity)
Query pgvector using cosine similarity on embeddings. This excels at finding thematically relevant content:
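A minimal sketch of this leg, assuming a `chunks` table with an `embedding` vector column and `%s`-style query parameters (table and column names are illustrative, not prescribed by the text):

```python
def semantic_search_sql(top_k: int = 20) -> str:
    """Build the pgvector cosine-similarity query for the semantic leg.

    `<=>` is pgvector's cosine-distance operator; `%s` is bound to the
    query embedding at execution time. Names here are illustrative.
    """
    return (
        "SELECT id, content, 1 - (embedding <=> %s::vector) AS similarity "
        "FROM chunks "
        "ORDER BY embedding <=> %s::vector "
        f"LIMIT {top_k}"
    )
```

Ordering by raw distance (ascending) and reporting `1 - distance` as similarity keeps the SQL index-friendly while returning an intuitive score.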
Keyword Search (Full-Text)
PostgreSQL ts_vector/ts_query catches exact terms that semantic search might miss:
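A sketch of the full-text leg, assuming a precomputed `content_tsv` tsvector column on the same `chunks` table (again, illustrative names):

```python
def keyword_search_sql(top_k: int = 20) -> str:
    """Full-text leg: rank chunks by ts_rank against the query.

    Assumes a precomputed `content_tsv` tsvector column (illustrative);
    `%s` is bound to the raw query text, e.g. "NVDA gross margin".
    """
    return (
        "SELECT id, content, ts_rank(content_tsv, q) AS rank "
        "FROM chunks, plainto_tsquery('english', %s) AS q "
        "WHERE content_tsv @@ q "
        f"ORDER BY rank DESC LIMIT {top_k}"
    )
```

Because tickers like NVDA survive tokenization verbatim, this leg catches the exact-term matches that embeddings blur together.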
Structured Filters (SQL WHERE)
Direct database queries for structured attributes:
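One way to sketch this: compile a dict of attribute filters into a parameterized WHERE clause. The column names mirror the metadata fields listed below; the helper itself is an assumption, not from the text.

```python
def build_filter_clause(filters: dict) -> tuple[str, list]:
    """Turn structured filters into a parameterized WHERE clause.

    Column names must come from a trusted allow-list (they are
    interpolated directly); values are always passed as parameters.
    """
    clauses, params = [], []
    for column, value in filters.items():
        clauses.append(f"{column} = %s")
        params.append(value)
    where = " AND ".join(clauses) if clauses else "TRUE"
    return where, params

where, params = build_filter_clause({"ticker": "NVDA", "fiscal_period": "Q3-2024"})
# where  -> "ticker = %s AND fiscal_period = %s"
# params -> ["NVDA", "Q3-2024"]
```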
ticker = 'NVDA' — specific company
fiscal_period = 'Q3-2024' — specific time period
filing_type = '10-K' — annual filings only
source_type = 'transcript' — earnings calls only
is_table = true — financial tables only
section_name = 'Risk Factors' — specific filing section
Reciprocal Rank Fusion (RRF)
Combine results from all three strategies using RRF, which is robust to different score scales:
RRF_score(doc) = sum(1 / (k + rank_in_list)) for each list containing doc
With k=60 (standard), a document ranked #1 in both the semantic and keyword lists gets a combined score of 1/61 + 1/61 ≈ 0.033, while a document ranked #1 in semantic but absent from keyword gets only ≈ 0.016. This naturally boosts documents that match on multiple dimensions.
Financial tuning: Weight keyword search higher (1.2x) for queries containing ticker symbols or specific metric names. Weight semantic higher (1.2x) for thematic queries without explicit tickers.
Numerical-Aware Reranking
After RRF fusion, apply financial-specific reranking:
Recency Boost
Newer data is more relevant for most financial queries. A Q3 2024 filing should rank above a Q1 2024 filing when no specific period is requested. The decay function:
recency_score = max(0, 1 - days_old / 180)
This gives full weight to today's data and decays linearly to zero at 180 days (about six months); anything older contributes no recency boost.
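As a one-line sketch of the decay function:

```python
def recency_score(days_old: float) -> float:
    """Linear decay: 1.0 for today's data, 0.0 at 180 days and beyond."""
    return max(0.0, 1.0 - days_old / 180)
```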
Source Authority
Different sources carry different weight depending on the query type:
| Query Type | Highest Authority | Lowest Authority |
|---|---|---|
| Factual (revenue, margins) | SEC Filings | Analyst Notes |
| Outlook (guidance, forecasts) | Earnings Transcripts | Internal Reports |
| Opinion (ratings, targets) | Analyst Notes | SEC Filings |
| Competitive (positioning) | Internal Reports | Market Data |
Numerical Density
For quantitative queries ("compare margins", "revenue growth"), boost chunks with higher numerical density. Count occurrences of patterns like $X.XM, XX.X%, $X,XXX and use them as a ranking signal.
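A sketch of the density signal; the exact regexes and the per-100-words normalization are assumptions chosen to match the patterns named above:

```python
import re

# Dollar amounts ($4.2M), percentages (56.3%), comma-grouped numbers (1,234,567)
NUMERIC_PATTERNS = [
    r"\$\d+(?:\.\d+)?[MBK]?",
    r"\d+(?:\.\d+)?%",
    r"\$?\d{1,3}(?:,\d{3})+",
]

def numerical_density(text: str) -> float:
    """Numeric matches per 100 words, used as a ranking boost signal."""
    matches = sum(len(re.findall(p, text)) for p in NUMERIC_PATTERNS)
    words = max(1, len(text.split()))
    return 100.0 * matches / words
```

A chunk from a financial table scores high; a narrative paragraph about "margin pressure" scores near zero.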
Period Matching
If the query mentions a specific period ("Q3 2024"), boost chunks whose fiscal_period metadata matches exactly. This prevents returning Q2 data when Q3 was explicitly requested.
Diversity Enforcement
For comparison queries, ensure the top results include data from multiple companies. If the top 5 results are all NVDA, demote duplicates and promote other tickers until at least 3 companies are represented.
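One way to sketch the demote-and-promote pass: admit a duplicate ticker into the head of the list only while enough slots remain to still reach the company quota (function and field names are my own):

```python
def enforce_diversity(ranked: list[dict], top_n: int = 5,
                      min_companies: int = 3) -> list[dict]:
    """Reorder so the top_n slots cover at least min_companies tickers.

    `ranked` is best-first; each chunk is a dict with a "ticker" key.
    Duplicates beyond the quota are demoted below other tickers.
    """
    seen: set[str] = set()
    head: list[dict] = []
    deferred: list[dict] = []
    for chunk in ranked:
        is_new = chunk["ticker"] not in seen
        slots_left = top_n - len(head)
        still_needed = min_companies - len(seen)
        # Admit a repeat ticker only if the quota can still be met afterwards
        if len(head) < top_n and (is_new or slots_left > still_needed):
            head.append(chunk)
            seen.add(chunk["ticker"])
        else:
            deferred.append(chunk)
    return head + deferred
```

If fewer than `min_companies` tickers exist in the candidates, the pass degrades gracefully and simply returns the original order.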
Context Assembly
The final step before passing data to the LLM is assembling the context window from the reranked chunks.
Budget management matters: with a 128K context window, you could include everything. Don't. More context means more noise and higher cost. Target 4,000-6,000 tokens of focused, relevant financial data.
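A sketch of greedy budget packing toward that 4,000-6,000 token target, approximating tokens as words × 1.3 (a rough heuristic; swap in a real tokenizer such as tiktoken in practice):

```python
def assemble_context(chunks: list[dict], budget_tokens: int = 5000) -> str:
    """Greedily pack top-ranked chunks until the token budget is hit.

    Each chunk is a dict with "source" and "content" keys (illustrative);
    token cost is approximated as word count * 1.3, not a real tokenizer.
    """
    parts: list[str] = []
    used = 0
    for chunk in chunks:
        cost = int(len(chunk["content"].split()) * 1.3)
        if used + cost > budget_tokens:
            break  # stop at the first chunk that would bust the budget
        parts.append(f"[{chunk['source']}] {chunk['content']}")
        used += cost
    return "\n\n".join(parts)
```

Labeling each chunk with its source lets the model cite where a number came from, which matters for financial answers.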
Test Queries
Build your retrieval system to handle the representative queries from this chapter:
"NVDA gross margin Q3 2024" — exact ticker, metric, and period matching
"Compare margins across our top 5 competitors" — multi-company, diversity-aware retrieval
"Which company has the highest operating margin?" — numerical extraction and comparison
This is chapter 3 of AI Finance Analyst.