ai-mindset · Oct 21, 2025
diff --git a/_posts/2025-10-21-TIL-hybrid-rag.md b/_posts/2025-10-21-TIL-hybrid-rag.md
@@ -0,0 +1,95 @@
+---
+layout: post
+title: "💡 TIL: Hybrid RAG - Combining the Best of Sparse and Dense Retrieval"
+date: 2025-10-21
+tags: [til, rag, llm, retrieval, ai]
+---
+
+**TL;DR:** Retrieval Augmented Generation (RAG) uses three main retrieval strategies: (1) Sparse retrieval (50 years old) relies on keyword matching via TF-IDF/BM25 - excellent for exact matches but poor with synonyms; (2) Dense retrieval (5-10 years old) uses vector embeddings to capture semantic meaning - better for natural language but misses rare terms; (3) Hybrid retrieval (2-3 years old) combines both approaches with fusion algorithms to merge results. Hybrid retrieval is now the gold standard, balancing precision, recall, and processing speed for modern RAG systems.
+<!--more-->
+
+## RAG Retrieval: The Key to Accurate AI Responses
+
+A RAG system's effectiveness depends largely on its retrieval strategy - how it fetches information to feed into an LLM. The process works by:
+1. Processing a user query
+2. Retrieving relevant chunks from a knowledge base
+3. Feeding those chunks to an LLM
+
+The quality of retrieved information directly impacts the factual accuracy of the LLM's responses.
+
+![Visual comparison of Sparse, Dense, and Hybrid RAG approaches](/images/Hybrid%20RAG.png)
+
+Let's explore the three major retrieval strategies:
+
+## Sparse Retrieval: The Classic Approach (50 years old)
+
+**How it works**: Matches query keywords against documents using TF-IDF or BM25, which score a document higher when a query term appears in it often and is rare across the rest of the collection.
+
+**Pros**:
+- Simple and fast implementation
+- Highly scalable
+- Cost-effective (no embeddings required)
+- Effective for domain-specific terminology
+- Can sometimes outperform complex models for specialised terms
+
+**Cons**:
+- Poor with synonyms and related concepts
+- Limited contextual understanding
+- Struggles with conceptual queries
+
+**Best uses**: Scenarios requiring exact wording - short queries, code search, log analysis, legal clauses.
+
+**Implementations**: Elasticsearch, Apache Lucene, Milvus
+
+## Dense Retrieval: The Semantic Workhorse (5-10 years old)
+
+**How it works**: Maps queries and documents into vector space using embeddings (often called "vector search"), finding results based on semantic similarity.
+
+**Pros**:
+- Strong contextual understanding
+- Handles synonyms and paraphrasing well
+- Flexible for natural language queries
+- Captures content meaning effectively
+
+**Cons**:
+- Misses rare terms and jargon
+- Less effective for very short queries
+- More computationally intensive
+- Requires domain adaptation
+
+**Best uses**: Chatbots, customer service, research over unstructured knowledge bases.
+
+**Implementations**: Meta's FAISS, JVector
+
+## Hybrid Retrieval: The Current State of the Art (2-3 years old)
+
+**How it works**: Combines vector-based and keyword-based search, processing queries through both methods and merging results.
+
+**Pros**:
+- Leverages strengths of both approaches
+- Outperforms dense-only retrieval in benchmarks
+- Improves precision and recall metrics
+- Handles both semantics and rare terms
+
+**Fusion algorithms**:
+- Weighted sum (e.g., 70% dense, 30% sparse)
+- Reciprocal Rank Fusion (RRF), which merges results by their rank positions rather than raw scores
+
+**Best uses**: Specialised domains (legal, technical, medical) and general-purpose retrieval requiring high accuracy.
+
+**Implementations**: Elasticsearch, Milvus, Weaviate, DataStax Astra DB
+
+## Why Hybrid Retrieval Leads the Pack
+
+If sparse retrieval is fast but literal, and dense retrieval is contextually aware but misses specific terms, hybrid retrieval offers the best combination:
+
+1. **Complementary strengths**: Semantic matching for concepts, keyword matching for critical terms
+2. **Balanced performance**: Optimises for speed, precision, and recall
+3. **Adaptability**: Works across different domains and query types
+4. **Improved accuracy**: Consistently outperforms single-method approaches
+
+## Conclusion
+
+Retrieval strategies have evolved from simple keyword matching to sophisticated semantic understanding, with hybrid approaches now delivering superior results.
+
+For RAG system developers today, hybrid retrieval offers the most balanced approach - combining the precision of keyword search with the contextual understanding of vector embeddings in a unified solution.
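A minimal Python sketch of the two fusion algorithms named in the post's "Fusion algorithms" list (weighted sum and Reciprocal Rank Fusion) may help show how a hybrid retriever merges its sparse and dense results. The function names, document IDs, and scores are hypothetical. `k=60` is a commonly used RRF smoothing constant, and min-max normalisation before the weighted sum is an assumption, since the post does not say how the two score scales are made comparable.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document IDs (each ordered best-first)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it returns.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def weighted_sum_fusion(dense_scores, sparse_scores, dense_weight=0.7):
    """Blend normalised scores, e.g. 70% dense and 30% sparse."""

    def normalise(scores):
        # Min-max scaling to [0, 1] (an assumed choice, not from the post).
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        if hi == lo:
            return {d: 1.0 for d in scores}
        return {d: (s - lo) / (hi - lo) for d, s in scores.items()}

    dense, sparse = normalise(dense_scores), normalise(sparse_scores)
    fused = {
        # A document missing from one retriever scores 0 on that side.
        d: dense_weight * dense.get(d, 0.0)
        + (1 - dense_weight) * sparse.get(d, 0.0)
        for d in set(dense) | set(sparse)
    }
    return sorted(fused, key=lambda d: (-fused[d], d))


# Hypothetical results from a keyword (sparse) and a vector (dense) retriever.
sparse_ranking = ["doc_a", "doc_b", "doc_c"]
dense_ranking = ["doc_b", "doc_d", "doc_a"]
print(reciprocal_rank_fusion([sparse_ranking, dense_ranking]))
# ['doc_b', 'doc_a', 'doc_d', 'doc_c']

sparse_scores = {"doc_a": 12.0, "doc_b": 9.0, "doc_c": 3.0}  # BM25-like
dense_scores = {"doc_b": 0.9, "doc_d": 0.8, "doc_a": 0.5}  # cosine-like
print(weighted_sum_fusion(dense_scores, sparse_scores))
# ['doc_b', 'doc_d', 'doc_a', 'doc_c']
```

RRF needs only rank positions, so it sidesteps the scale mismatch between BM25 scores and cosine similarities. The weighted sum keeps score magnitudes, but its output depends on the normalisation chosen.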
diff --git a/images/Hybrid RAG.png b/images/Hybrid RAG.png
diff --git a/posts.json b/posts.json
@@ -1,4 +1,17 @@
 [
+  {
+    "title": "💡 TIL: Hybrid RAG - Combining the Best of Sparse and Dense Retrieval",
+    "date": "2025-10-21T00:00:00.000Z",
+    "tags": [
+      "til",
+      "rag",
+      "llm",
+      "retrieval",
+      "ai"
+    ],
+    "url": "/posts/TIL-hybrid-rag.html",
+    "content": "<p><strong>TL;DR:</strong> Retrieval Augmented Generation (RAG) uses three main retrieval strategies: (1) Sparse retrieval (50 years old) relies on keyword matching via TF-IDF/BM25 - excellent for exact matches but poor with synonyms; (2) Dense retrieval (5-10 years old) uses vector embeddings to capture semantic meaning - better for natural language but misses rare terms; (3) Hybrid retrieval (2-3 years old) combines both approaches with fusion algorithms to merge results. Hybrid retrieval is now the gold standard, balancing precision, recall, and processing speed for modern RAG systems.</p>\n<!--more-->\n\n<h2 id=\"rag-retrieval-the-key-to-accurate-ai-responses\">RAG Retrieval: The Key to Accurate AI Responses</h2>\n<p>A RAG system&#39;s effectiveness depends largely on its retrieval strategy - how it fetches information to feed into an LLM. The process works by:</p>\n<ol>\n<li>Processing a user query</li>\n<li>Retrieving relevant chunks from a knowledge base</li>\n<li>Feeding those chunks to an LLM</li>\n</ol>\n<p>The quality of retrieved information directly impacts the factual accuracy of the LLM&#39;s responses.</p>\n<p><img src=\"/images/Hybrid%20RAG.png\" alt=\"Visual comparison of Sparse, Dense, and Hybrid RAG approaches\"></p>\n<p>Let&#39;s explore the three major retrieval strategies:</p>\n<h2 id=\"sparse-retrieval-the-classic-approach-50-years-old\">Sparse Retrieval: The Classic Approach (50 years old)</h2>\n<p><strong>How it works</strong>: Matches query keywords against documents using TF-IDF or BM25, which score a document higher when a query term appears in it often and is rare across the rest of the collection.</p>\n<p><strong>Pros</strong>:</p>\n<ul>\n<li>Simple and fast implementation</li>\n<li>Highly scalable</li>\n<li>Cost-effective (no embeddings required)</li>\n<li>Effective for domain-specific terminology</li>\n<li>Can sometimes outperform complex models for specialised terms</li>\n</ul>\n<p><strong>Cons</strong>:</p>\n<ul>\n<li>Poor with synonyms and related concepts</li>\n<li>Limited 
contextual understanding</li>\n<li>Struggles with conceptual queries</li>\n</ul>\n<p><strong>Best uses</strong>: Scenarios requiring exact wording - short queries, code search, log analysis, legal clauses.</p>\n<p><strong>Implementations</strong>: Elasticsearch, Apache Lucene, Milvus</p>\n<h2 id=\"dense-retrieval-the-semantic-workhorse-5-10-years-old\">Dense Retrieval: The Semantic Workhorse (5-10 years old)</h2>\n<p><strong>How it works</strong>: Maps queries and documents into vector space using embeddings (often called &quot;vector search&quot;), finding results based on semantic similarity.</p>\n<p><strong>Pros</strong>:</p>\n<ul>\n<li>Strong contextual understanding</li>\n<li>Handles synonyms and paraphrasing well</li>\n<li>Flexible for natural language queries</li>\n<li>Captures content meaning effectively</li>\n</ul>\n<p><strong>Cons</strong>:</p>\n<ul>\n<li>Misses rare terms and jargon</li>\n<li>Less effective for very short queries</li>\n<li>More computationally intensive</li>\n<li>Requires domain adaptation</li>\n</ul>\n<p><strong>Best uses</strong>: Chatbots, customer service, research over unstructured knowledge bases.</p>\n<p><strong>Implementations</strong>: Meta&#39;s FAISS, JVector</p>\n<h2 id=\"hybrid-retrieval-the-current-state-of-the-art-2-3-years-old\">Hybrid Retrieval: The Current State of the Art (2-3 years old)</h2>\n<p><strong>How it works</strong>: Combines vector-based and keyword-based search, processing queries through both methods and merging results.</p>\n<p><strong>Pros</strong>:</p>\n<ul>\n<li>Leverages strengths of both approaches</li>\n<li>Outperforms dense-only retrieval in benchmarks</li>\n<li>Improves precision and recall metrics</li>\n<li>Handles both semantics and rare terms</li>\n</ul>\n<p><strong>Fusion algorithms</strong>:</p>\n<ul>\n<li>Weighted sum (e.g., 70% dense, 30% sparse)</li>\n<li>Reciprocal Rank Fusion (RRF), which merges results by their rank positions rather than raw scores</li>\n</ul>\n<p><strong>Best uses</strong>: Specialised domains 
(legal, technical, medical) and general-purpose retrieval requiring high accuracy.</p>\n<p><strong>Implementations</strong>: Elasticsearch, Milvus, Weaviate, DataStax Astra DB</p>\n<h2 id=\"why-hybrid-retrieval-leads-the-pack\">Why Hybrid Retrieval Leads the Pack</h2>\n<p>If sparse retrieval is fast but literal, and dense retrieval is contextually aware but misses specific terms, hybrid retrieval offers the best combination:</p>\n<ol>\n<li><strong>Complementary strengths</strong>: Semantic matching for concepts, keyword matching for critical terms</li>\n<li><strong>Balanced performance</strong>: Optimises for speed, precision, and recall</li>\n<li><strong>Adaptability</strong>: Works across different domains and query types</li>\n<li><strong>Improved accuracy</strong>: Consistently outperforms single-method approaches</li>\n</ol>\n<h2 id=\"conclusion\">Conclusion</h2>\n<p>Retrieval strategies have evolved from simple keyword matching to sophisticated semantic understanding, with hybrid approaches now delivering superior results.</p>\n<p>For RAG system developers today, hybrid retrieval offers the most balanced approach - combining the precision of keyword search with the contextual understanding of vector embeddings in a unified solution.</p>\n"
+  },
   {
     "title": "💡 TIL: Claude Skills - Modular AI Capabilities with Minimal Token Cost",
     "date": "2025-10-17T00:00:00.000Z",

diff --git a/posts/TIL-hybrid-rag.html b/posts/TIL-hybrid-rag.html
@@ -0,0 +1,116 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+  <meta charset="UTF-8">
+  <meta name="viewport" content="width=device-width, initial-scale=1.0">
+  <title>💡 TIL: Hybrid RAG - Combining the Best of Sparse and Dense Retrieval - Just-in-Time Learning</title>
+  <link rel="stylesheet" href="/style.css">
+</head>
+<body>
+  <header>
+    <div class="container">
+      <h1><a href="/">Just-in-Time Learning</a></h1>
+    </div>
+  </header>
+
+  <main class="post-container">
+    <article class="post">
+      <header class="post-header">
+        <h1>💡 TIL: Hybrid RAG - Combining the Best of Sparse and Dense Retrieval</h1>
+        <span class="post-date">October 21, 2025</span>
+        <div class="post-tags">
+          <span class="post-tag">til</span><span class="post-tag">rag</span><span class="post-tag">llm</span><span class="post-tag">retrieval</span><span class="post-tag">ai</span>
+        </div>
+      </header>
+
+      <div class="post-content">
+        <p><strong>TL;DR:</strong> Retrieval Augmented Generation (RAG) uses three main retrieval strategies: (1) Sparse retrieval (50 years old) relies on keyword matching via TF-IDF/BM25 - excellent for exact matches but poor with synonyms; (2) Dense retrieval (5-10 years old) uses vector embeddings to capture semantic meaning - better for natural language but misses rare terms; (3) Hybrid retrieval (2-3 years old) combines both approaches with fusion algorithms to merge results. Hybrid retrieval is now the gold standard, balancing precision, recall, and processing speed for modern RAG systems.</p>
+<!--more-->
+
+<h2 id="rag-retrieval-the-key-to-accurate-ai-responses">RAG Retrieval: The Key to Accurate AI Responses</h2>
+<p>A RAG system&#39;s effectiveness depends largely on its retrieval strategy - how it fetches information to feed into an LLM. The process works by:</p>
+<ol>
+<li>Processing a user query</li>
+<li>Retrieving relevant chunks from a knowledge base</li>
+<li>Feeding those chunks to an LLM</li>
+</ol>
+<p>The quality of retrieved information directly impacts the factual accuracy of the LLM&#39;s responses.</p>
+<p><img src="/images/Hybrid%20RAG.png" alt="Visual comparison of Sparse, Dense, and Hybrid RAG approaches"></p>
+<p>Let&#39;s explore the three major retrieval strategies:</p>
+<h2 id="sparse-retrieval-the-classic-approach-50-years-old">Sparse Retrieval: The Classic Approach (50 years old)</h2>
+<p><strong>How it works</strong>: Matches query keywords against documents using TF-IDF or BM25, which score a document higher when a query term appears in it often and is rare across the rest of the collection.</p>
+<p><strong>Pros</strong>:</p>
+<ul>
+<li>Simple and fast implementation</li>
+<li>Highly scalable</li>
+<li>Cost-effective (no embeddings required)</li>
+<li>Effective for domain-specific terminology</li>
+<li>Can sometimes outperform complex models for specialised terms</li>
+</ul>
+<p><strong>Cons</strong>:</p>
+<ul>
+<li>Poor with synonyms and related concepts</li>
+<li>Limited contextual understanding</li>
+<li>Struggles with conceptual queries</li>
+</ul>
+<p><strong>Best uses</strong>: Scenarios requiring exact wording - short queries, code search, log analysis, legal clauses.</p>
+<p><strong>Implementations</strong>: Elasticsearch, Apache Lucene, Milvus</p>
+<h2 id="dense-retrieval-the-semantic-workhorse-5-10-years-old">Dense Retrieval: The Semantic Workhorse (5-10 years old)</h2>
+<p><strong>How it works</strong>: Maps queries and documents into vector space using embeddings (often called &quot;vector search&quot;), finding results based on semantic similarity.</p>
+<p><strong>Pros</strong>:</p>
+<ul>
+<li>Strong contextual understanding</li>
+<li>Handles synonyms and paraphrasing well</li>
+<li>Flexible for natural language queries</li>
+<li>Captures content meaning effectively</li>
+</ul>
+<p><strong>Cons</strong>:</p>
+<ul>
+<li>Misses rare terms and jargon</li>
+<li>Less effective for very short queries</li>
+<li>More computationally intensive</li>
+<li>Requires domain adaptation</li>
+</ul>
+<p><strong>Best uses</strong>: Chatbots, customer service, research over unstructured knowledge bases.</p>
+<p><strong>Implementations</strong>: Meta&#39;s FAISS, JVector</p>
+<h2 id="hybrid-retrieval-the-current-state-of-the-art-2-3-years-old">Hybrid Retrieval: The Current State of the Art (2-3 years old)</h2>
+<p><strong>How it works</strong>: Combines vector-based and keyword-based search, processing queries through both methods and merging results.</p>
+<p><strong>Pros</strong>:</p>
+<ul>
+<li>Leverages strengths of both approaches</li>
+<li>Outperforms dense-only retrieval in benchmarks</li>
+<li>Improves precision and recall metrics</li>
+<li>Handles both semantics and rare terms</li>
+</ul>
+<p><strong>Fusion algorithms</strong>:</p>
+<ul>
+<li>Weighted sum (e.g., 70% dense, 30% sparse)</li>
+<li>Reciprocal Rank Fusion (RRF), which merges results by their rank positions rather than raw scores</li>
+</ul>
+<p><strong>Best uses</strong>: Specialised domains (legal, technical, medical) and general-purpose retrieval requiring high accuracy.</p>
+<p><strong>Implementations</strong>: Elasticsearch, Milvus, Weaviate, DataStax Astra DB</p>
+<h2 id="why-hybrid-retrieval-leads-the-pack">Why Hybrid Retrieval Leads the Pack</h2>
+<p>If sparse retrieval is fast but literal, and dense retrieval is contextually aware but misses specific terms, hybrid retrieval offers the best combination:</p>
+<ol>
+<li><strong>Complementary strengths</strong>: Semantic matching for concepts, keyword matching for critical terms</li>
+<li><strong>Balanced performance</strong>: Optimises for speed, precision, and recall</li>
+<li><strong>Adaptability</strong>: Works across different domains and query types</li>
+<li><strong>Improved accuracy</strong>: Consistently outperforms single-method approaches</li>
+</ol>
+<h2 id="conclusion">Conclusion</h2>
+<p>Retrieval strategies have evolved from simple keyword matching to sophisticated semantic understanding, with hybrid approaches now delivering superior results.</p>
+<p>For RAG system developers today, hybrid retrieval offers the most balanced approach - combining the precision of keyword search with the contextual understanding of vector embeddings in a unified solution.</p>
+
+      </div>
+    </article>
+  </main>
+
+  <footer>
+    <div class="container">
+      <p>Created with <a href="https://github.com/ai-mindset/init.vim">Neovim</a>, using <a href="https://ai-mindset.github.io/dialogue-engineering">AI</a> to help process and curate content ✨</p>
+    </div>
+  </footer>
+
+  <script src="/script.js"></script>
+</body>
+</html>