
Semantic Search

Finding Meaning, Not Keywords

The Problem With Keyword Search

You saved a bookmark about "flow states and deep focus techniques." A week later you search for "how to concentrate better." Keyword search returns nothing — none of those exact words appear in the bookmark.

Semantic search solves this. It recognizes that "concentrate better" and "deep focus techniques" mean nearly the same thing, even though they share no words.

How Embeddings Work

An embedding is a list of numbers (a vector) that represents the *meaning* of a piece of text. Two texts with similar meanings produce vectors that point in similar directions.

"how to concentrate better"     → [0.23, 0.87, -0.14, 0.56, ...]
"deep focus techniques"         → [0.21, 0.85, -0.11, 0.59, ...]  ← similar!
"quarterly revenue projections" → [-0.67, 0.12, 0.93, -0.31, ...] ← different

The embedding model (like OpenAI's text-embedding-3-small or Cohere's embed-v3) has learned these meaning-to-number mappings from billions of text examples.

Cosine Similarity

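Once everything is a vector, you need a way to compare them. Cosine similarity is the cosine of the angle between two vectors, so vectors pointing in similar directions score close to 1. As a rough guide, scores read like this: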

Score      Meaning
1.0        Identical meaning
0.8-0.9    Strongly related
0.5-0.7    Somewhat related
0.0-0.4    Unrelated

The formula is straightforward — dot product divided by the product of magnitudes:

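// Cosine similarity: dot(a, b) / (|a| * |b|). Result lies between -1 and 1.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same number of dimensions");
  }
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];   // running dot product
    magA += a[i] * a[i];  // squared magnitude of a
    magB += b[i] * b[i];  // squared magnitude of b
  }
  const denominator = Math.sqrt(magA) * Math.sqrt(magB);
  // A zero vector has no direction; return 0 instead of NaN.
  return denominator === 0 ? 0 : dot / denominator;
}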
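
To make the scores concrete, here is a small sketch that compares the three example vectors from earlier. It assumes only the four components shown (the real vectors are much longer), so the exact numbers are illustrative, and it repeats a compact version of the function so the snippet runs on its own.

```typescript
// Compact cosine similarity so this snippet is self-contained.
function cosine(a: number[], b: number[]): number {
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

// Toy 4-dimensional vectors: only the components shown in the example above.
const concentrate = [0.23, 0.87, -0.14, 0.56]; // "how to concentrate better"
const deepFocus = [0.21, 0.85, -0.11, 0.59];   // "deep focus techniques"
const revenue = [-0.67, 0.12, 0.93, -0.31];    // "quarterly revenue projections"

const relatedScore = cosine(concentrate, deepFocus);
const unrelatedScore = cosine(concentrate, revenue);

console.log(relatedScore.toFixed(3));   // close to 1
console.log(unrelatedScore.toFixed(3)); // below 0 for these toy values
```

The second score comes out negative, which shows that cosine similarity is not limited to the 0 to 1 range: vectors pointing in roughly opposite directions score below zero.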

The Search Pipeline

Semantic search follows a simple flow:

User Query → Embed Query → Compare vs All Chunk Embeddings → Rank by Similarity → Return Top-K

At query time, you embed the query once, then compare it against every chunk embedding. Each comparison is just multiplication and addition, so a full scan stays fast for a collection of a few thousand chunks.
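
The flow above can be sketched as a brute-force scan. This is a minimal illustration, not a definitive implementation: the `Chunk` and `SearchHit` shapes and the `searchTopK` name are assumptions, and the query vector is hard-coded where a real system would call its embedding model.

```typescript
// Hypothetical shape for a stored chunk; the field names are assumptions.
interface Chunk {
  id: string;
  text: string;
  embedding: number[];
}

interface SearchHit {
  chunk: Chunk;
  score: number;
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

// Score every chunk against the query vector, sort descending, keep the best k.
function searchTopK(queryEmbedding: number[], chunks: Chunk[], k: number): SearchHit[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosine(queryEmbedding, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy data; in a real system these vectors come from the embedding model.
const chunks: Chunk[] = [
  { id: "focus", text: "flow states and deep focus techniques", embedding: [0.21, 0.85, -0.11, 0.59] },
  { id: "revenue", text: "quarterly revenue projections", embedding: [-0.67, 0.12, 0.93, -0.31] },
  { id: "habits", text: "building a morning routine", embedding: [0.4, 0.6, 0.1, 0.2] },
];

// Stands in for the embedded query "how to concentrate better".
const queryEmbedding = [0.23, 0.87, -0.14, 0.56];

const hits = searchTopK(queryEmbedding, chunks, 2);
console.log(hits.map((h) => h.chunk.id)); // "focus" first, then "habits"
```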

Filtered Search

Pure semantic search searches everything. But often you want to narrow the scope first:

  • "Find my meeting notes about the API migration" → filter: source=meetings, then semantic search for "API migration"
  • "Recent articles about React" → filter: source=articles AND date > 30 days ago, then semantic search for "React"

Filtering first, then searching semantically over the filtered set, is both faster and more accurate.
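
As a sketch of the "Recent articles about React" case, the function below applies the metadata filter before scoring anything. The `NoteChunk` fields (`source`, `savedAt`), the function name, and the toy vectors are all assumptions made for illustration; the reference date is passed in so the example is deterministic.

```typescript
// Hypothetical chunk shape with metadata; the field names are assumptions.
interface NoteChunk {
  id: string;
  source: "meetings" | "articles" | "bookmarks";
  savedAt: Date;
  embedding: number[];
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Apply the cheap metadata filters first, then rank only the survivors.
function filteredSearch(
  queryEmbedding: number[],
  allChunks: NoteChunk[],
  source: NoteChunk["source"],
  maxAgeDays: number,
  today: Date,
  k: number
): { id: string; score: number }[] {
  const cutoff = today.getTime() - maxAgeDays * DAY_MS;
  return allChunks
    .filter((c) => c.source === source && c.savedAt.getTime() >= cutoff)
    .map((c) => ({ id: c.id, score: cosine(queryEmbedding, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const referenceDate = new Date("2024-06-30T00:00:00Z");
const library: NoteChunk[] = [
  { id: "react-hooks", source: "articles", savedAt: new Date("2024-06-20T00:00:00Z"), embedding: [0.9, 0.1, 0.2] },
  { id: "react-old", source: "articles", savedAt: new Date("2024-01-05T00:00:00Z"), embedding: [0.88, 0.12, 0.2] },
  { id: "standup", source: "meetings", savedAt: new Date("2024-06-25T00:00:00Z"), embedding: [0.85, 0.2, 0.1] },
  { id: "css-grid", source: "articles", savedAt: new Date("2024-06-10T00:00:00Z"), embedding: [0.1, 0.9, 0.3] },
];

const reactQuery = [0.9, 0.1, 0.2]; // stands in for the embedded query "React"
const recentArticles = filteredSearch(reactQuery, library, "articles", 30, referenceDate, 5);
console.log(recentArticles.map((r) => r.id)); // "react-hooks", then "css-grid"
```

The old React article and the meeting note never reach the similarity step, even though their vectors are close to the query. That is where both the speed gain and the precision gain come from.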

Hybrid Search: Best of Both

Sometimes keyword search catches what semantic search misses (exact names, codes, acronyms) and vice versa. Hybrid search combines both:

  • Run keyword search → get matches with BM25 scores
  • Run semantic search → get matches with cosine similarity scores
  • Normalize both score sets to 0-1
  • Combine: final_score = 0.7 * semantic + 0.3 * keyword

The weighting (70/30 toward semantic) works well for personal knowledge bases where you rarely search by exact terms.
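
The combination step can be sketched as follows. This assumes the keyword and semantic searches have already produced raw scores, and it uses min-max scaling as the normalization, which is one common choice rather than the only one. The `Scored` type and both function names are hypothetical.

```typescript
interface Scored {
  id: string;
  score: number;
}

// Min-max scale a result set to the 0-1 range, keyed by document id.
function minMaxNormalize(results: Scored[]): Record<string, number> {
  const out: Record<string, number> = {};
  if (results.length === 0) return out;
  let min = Infinity, max = -Infinity;
  for (const r of results) {
    min = Math.min(min, r.score);
    max = Math.max(max, r.score);
  }
  const range = max - min;
  for (const r of results) {
    // If every score is identical, treat them all as a full match.
    out[r.id] = range === 0 ? 1 : (r.score - min) / range;
  }
  return out;
}

// final_score = semanticWeight * semantic + (1 - semanticWeight) * keyword
function hybridRank(semantic: Scored[], keyword: Scored[], semanticWeight: number = 0.7): Scored[] {
  const sem = minMaxNormalize(semantic);
  const kw = minMaxNormalize(keyword);
  const ids: string[] = [];
  for (const r of semantic.concat(keyword)) {
    if (ids.indexOf(r.id) === -1) ids.push(r.id);
  }
  return ids
    .map((id) => ({
      id,
      // A document missing from one result set contributes 0 for that side.
      score: semanticWeight * (sem[id] ?? 0) + (1 - semanticWeight) * (kw[id] ?? 0),
    }))
    .sort((x, y) => y.score - x.score);
}

// Toy inputs: cosine similarities on one side, BM25 scores on the other.
const semanticHits: Scored[] = [
  { id: "a", score: 0.9 },
  { id: "b", score: 0.6 },
  { id: "c", score: 0.3 },
];
const keywordHits: Scored[] = [
  { id: "b", score: 12 },
  { id: "d", score: 4 },
];

const ranked = hybridRank(semanticHits, keywordHits);
console.log(ranked.map((r) => r.id + ": " + r.score.toFixed(2)));
```

One caveat with min-max scaling: the lowest-scoring hit in each set is pushed to 0, so it contributes nothing from that side even if its raw score was respectable.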

Practical Considerations

  • Embedding dimensions: typically 384 to 1536 numbers per chunk, depending on the model. More dimensions capture more nuance but cost more storage and compute.
  • Batch embedding: Embed all chunks once at ingestion time, then only embed new additions
  • Re-embedding: If you change your embedding model, you need to re-embed everything
  • Storage: For a personal knowledge base (hundreds to low thousands of chunks), in-memory storage is fine. No vector database needed.
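
A quick back-of-the-envelope calculation supports the in-memory claim. The sketch below assumes 2,000 chunks and 4 bytes per number (as with a `Float32Array`); plain `number[]` arrays use at least 8 bytes per value, which would roughly double these figures. The function name is hypothetical.

```typescript
// Rough memory estimate for holding every embedding in RAM.
// bytesPerNumber = 4 assumes the vectors are stored as Float32Array.
function embeddingMemoryBytes(
  chunkCount: number,
  dimensions: number,
  bytesPerNumber: number = 4
): number {
  return chunkCount * dimensions * bytesPerNumber;
}

const MIB = 1024 * 1024;

const smallModelBytes = embeddingMemoryBytes(2000, 384);
const largeModelBytes = embeddingMemoryBytes(2000, 1536);

console.log((smallModelBytes / MIB).toFixed(1) + " MiB"); // 2.9 MiB
console.log((largeModelBytes / MIB).toFixed(1) + " MiB"); // 11.7 MiB
```

Even at the high end of the dimension range, a couple of thousand chunks fit in a few megabytes.
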

Key Takeaways

  • Semantic search finds meaning, not keywords — "concentrate better" matches "deep focus techniques."
  • Embeddings convert text to number vectors. Similar meanings produce similar vectors.
  • Cosine similarity compares vector directions: scores near 1 mean near-identical meaning, and scores near 0 (or below) mean unrelated.
  • Filter first, then search semantically — faster and more precise.

This is chapter 3 of AI-Powered Second Brain.
