Documentation tagged with Hierarchical Navigable Small World (HNSW) in the Geode graph database. HNSW is an algorithm for approximate nearest neighbor (ANN) search in high-dimensional vector spaces, enabling efficient vector similarity search for machine learning applications.
Introduction to HNSW
Hierarchical Navigable Small World (HNSW) is a graph-based algorithm for approximate nearest neighbor search that has become the industry standard for vector similarity search. Developed by Yury Malkov and Dmitry Yashunin in 2016, HNSW builds a multi-layer proximity graph that enables logarithmic search complexity while maintaining high recall.
The algorithm solves a critical problem in modern AI applications: how to efficiently search through millions or billions of high-dimensional vectors (embeddings) to find the most similar items. Traditional exact search has O(N) complexity—you must compare against every vector. HNSW achieves sub-linear search time through a clever graph structure.
HNSW is used in:
- Semantic search: Find documents similar to a query embedding
- Recommendation systems: Discover similar products, content, or users
- Image search: Find visually similar images
- Anomaly detection: Identify outliers in embedding space
- Retrieval-Augmented Generation (RAG): Find relevant context for LLMs
Geode’s HNSW implementation integrates vector search seamlessly with graph queries, enabling powerful combined operations like “find similar products purchased by friends of this user.”
Core HNSW Concepts
Navigable Small World Graphs
HNSW builds on the concept of “small world” networks—graphs where most nodes can be reached from any other node in a small number of hops, despite the network’s large size. Examples include social networks (six degrees of separation) and the World Wide Web.
A navigable small world graph adds long-range connections that enable efficient greedy search:
Layer 2:  A ------------- B
          |               |
Layer 1:  A --- C --- B --- D
          |     |     |     |
Layer 0:  A - C - E - B - D - F - G - H   (all nodes)
Search starts at the top layer (sparse, long-range connections) and descends to lower layers (dense, short-range connections), refining the result at each level.
Hierarchical Construction
HNSW uses a hierarchical structure with multiple layers:
- Layer 0: Contains all vectors with dense connections to nearby neighbors
- Layer 1+: Contain progressively fewer vectors, selected probabilistically
- Top layer: Has very few nodes, enabling fast initial navigation
Each node’s maximum layer is chosen randomly from an exponentially decaying distribution; the probability that a node appears on layer l (i.e. its maximum layer is at least l) is:
P(layer >= l) = (1/M)^l
where M is the decay base, typically 4-6. For M = 5 this gives:
- 100% of nodes at layer 0
- ~20% of nodes at layer 1
- ~4% of nodes at layer 2
- ~0.8% of nodes at layer 3
This creates a logarithmic search structure similar to skip lists.
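The layer-assignment rule can be sketched with a geometric sampler in Python (an illustration, not Geode's internal code); with M = 5, roughly 20% of sampled nodes reach layer 1 or above and roughly 4% reach layer 2 or above:

```python
import random

def sample_layer(m: int, rng: random.Random) -> int:
    """Sample a node's maximum layer: each additional layer is reached
    with probability 1/m, so P(layer >= l) = (1/m)^l."""
    layer = 0
    while rng.random() < 1.0 / m:
        layer += 1
    return layer

rng = random.Random(42)
n = 100_000
counts = {}
for _ in range(n):
    layer = sample_layer(5, rng)
    counts[layer] = counts.get(layer, 0) + 1

def frac_ge(l):
    """Fraction of nodes whose maximum layer is at least l."""
    return sum(c for k, c in counts.items() if k >= l) / n

print(round(frac_ge(1), 3), round(frac_ge(2), 3))  # ~0.2 and ~0.04
```

The same distribution can also be drawn in closed form as `floor(-ln(uniform()) / ln(m))`; the loop above is just the most transparent version.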
Greedy Search Algorithm
HNSW search is beautifully simple:
- Start at entry point: Begin at the index’s entry point, the node with the highest layer
- Greedy local search: Move to the neighbor closest to the query
- Repeat until local minimum: Stop when no neighbor is closer
- Descend layer: Drop to the next layer, continue search
- Return results: At layer 0, return k nearest neighbors
This achieves O(log N) complexity in practice.
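The descent on a single layer can be sketched in Python; this is an illustrative toy (1-D vectors, hand-built adjacency), not Geode's implementation:

```python
import math

def greedy_search(graph, vectors, query, entry):
    """Greedy local search on one layer: hop to whichever neighbor is
    closest to the query, stop at a local minimum."""
    current = entry
    current_d = math.dist(vectors[current], query)
    improved = True
    while improved:
        improved = False
        for nbr in graph[current]:
            d = math.dist(vectors[nbr], query)
            if d < current_d:
                current, current_d, improved = nbr, d, True
    return current, current_d

# Toy layer: a chain of 1-D points
vectors = {'A': (0.0,), 'C': (2.0,), 'B': (5.0,), 'D': (8.0,)}
graph = {'A': ['C'], 'C': ['A', 'B'], 'B': ['C', 'D'], 'D': ['B']}
print(greedy_search(graph, vectors, (7.0,), 'A'))  # ('D', 1.0)
```

In the full algorithm this loop runs once per layer, with each layer's local minimum becoming the entry point for the layer below.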
Construction Algorithm
Building an HNSW index:
- For each vector to insert:
- Choose its maximum layer l randomly
- Find its nearest neighbors at each layer using greedy search
- Connect it to the M nearest neighbors at each layer
- Use up to Mmax (typically 2M) connections at layer 0 for higher recall
- Prune connections to keep node degrees bounded and the graph navigable
The construction is online—you can add vectors incrementally without rebuilding the entire index.
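A simplified single-layer version of this insert loop, sketched in Python (exact nearest-neighbor scan instead of greedy candidate search, and simple distance-based pruning; real HNSW is multi-layer and uses a diversity heuristic):

```python
import math

def insert(graph, vectors, new_id, new_vec, m=2, m_max=3):
    """Insert a vector: link it to its m nearest existing nodes, then prune
    any neighbor whose degree exceeds m_max, dropping the farthest links."""
    vectors[new_id] = new_vec
    graph[new_id] = set()
    others = [nid for nid in graph if nid != new_id]
    nearest = sorted(others, key=lambda nid: math.dist(vectors[nid], new_vec))[:m]
    graph[new_id] = set(nearest)
    for nid in nearest:
        graph[nid].add(new_id)
        if len(graph[nid]) > m_max:
            keep = sorted(graph[nid], key=lambda o: math.dist(vectors[o], vectors[nid]))[:m_max]
            for dropped in graph[nid] - set(keep):
                graph[dropped].discard(nid)  # keep edges bi-directional
            graph[nid] = set(keep)

graph, vectors = {}, {}
for i, x in enumerate([0.0, 1.0, 2.0, 3.0, 10.0]):
    insert(graph, vectors, i, (x,))
print({k: sorted(v) for k, v in sorted(graph.items())})
```

Because each insert only touches the new node's neighborhood, vectors can be added online without rebuilding the index.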
How HNSW Works in Geode
Vector Properties
Store embeddings as node properties:
-- Create node with embedding
INSERT (:Document {
id: 'doc-123',
title: 'Introduction to Graph Databases',
content: '...',
embedding: [0.23, -0.45, 0.67, ..., 0.12] -- 768-dimensional vector
});
Creating HNSW Indexes
Build an HNSW index on vector properties:
-- Create HNSW index for semantic search
CREATE VECTOR INDEX document_embeddings
FOR (d:Document)
ON (d.embedding)
OPTIONS {
dimensions: 768, -- Embedding dimensionality
similarity: 'cosine', -- cosine, euclidean, or dot_product
m: 16, -- Connections per layer
ef_construction: 200, -- Build-time search depth
ef_search: 100 -- Query-time search depth
};
Parameters:
- dimensions: Vector dimensionality (e.g., 768 for BERT, 1536 for OpenAI)
- similarity: Distance metric (cosine, euclidean, dot product)
- m: Number of bi-directional connections per node (trade-off: higher = better accuracy but more memory)
- ef_construction: Size of candidate set during construction (higher = better quality graph)
- ef_search: Size of candidate set during search (higher = better recall but slower)
Vector Similarity Search
Query similar vectors:
-- Find 10 most similar documents to a query embedding
MATCH (d:Document)
WHERE vector_similarity(d.embedding, $query_embedding) > 0.7
RETURN d.title, d.id, vector_similarity(d.embedding, $query_embedding) AS similarity
ORDER BY similarity DESC
LIMIT 10;
-- Or use dedicated function
CALL vector.search({
index: 'document_embeddings',
query: $query_embedding,
k: 10,
ef: 150 -- Override default ef_search for this query
})
YIELD node, similarity
RETURN node.title, similarity;
Combining Vector and Graph Queries
The real power: integrate vector search with graph traversal:
-- Find similar products purchased by friends
MATCH (me:User {id: $userId})-[:FRIEND]->(friend:User)
-[:PURCHASED]->(product:Product)
WHERE vector_similarity(product.embedding, $query_embedding) > 0.8
RETURN DISTINCT product.name,
vector_similarity(product.embedding, $query_embedding) AS similarity,
COUNT(DISTINCT friend) AS friend_count
ORDER BY similarity DESC, friend_count DESC
LIMIT 10;
-- Semantic search with metadata filtering
MATCH (doc:Document)
WHERE doc.category = 'technical'
AND doc.publish_date > date('2024-01-01')
AND vector_similarity(doc.embedding, $query_embedding) > 0.75
RETURN doc.title, doc.author, vector_similarity(doc.embedding, $query_embedding) AS score
ORDER BY score DESC
LIMIT 20;
Use Cases
Semantic Search
Find documents by meaning, not just keywords:
-- Traditional keyword search: misses synonyms, context
MATCH (d:Document)
WHERE d.content CONTAINS 'database'
RETURN d.title;
-- Semantic search: understands meaning
CALL vector.search({
index: 'document_embeddings',
query: $query_embedding, -- Embedding of "systems for storing data"
k: 10
})
YIELD node
RETURN node.title;
-- Returns documents about databases, even without keyword "database"
Recommendation Systems
Discover similar items:
-- Content-based recommendations
MATCH (item:Product {id: $productId})
CALL vector.search({
index: 'product_embeddings',
query: item.embedding,
k: 50
})
YIELD node AS similar_product, similarity
WHERE similar_product.id <> $productId
AND similar_product.in_stock = true
RETURN similar_product.name, similarity
ORDER BY similarity DESC
LIMIT 10;
Retrieval-Augmented Generation (RAG)
Find relevant context for LLM prompts:
-- Retrieve relevant context for RAG
CALL vector.search({
index: 'knowledge_base',
query: $question_embedding,
k: 5
})
YIELD node AS doc, similarity
RETURN doc.content AS context, similarity
ORDER BY similarity DESC;
Pass the returned context to your LLM to ground responses in your knowledge base.
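The glue between retrieval and the LLM can be as simple as the sketch below; `build_rag_prompt` and the prompt format are illustrative, not a Geode API:

```python
def build_rag_prompt(question, contexts, max_chars=2000):
    """Assemble a grounded prompt from (text, similarity) pairs,
    highest-similarity first, within a character budget."""
    parts, used = [], 0
    for text, _score in sorted(contexts, key=lambda c: c[1], reverse=True):
        if used + len(text) > max_chars:
            break
        parts.append(text)
        used += len(text)
    context_block = "\n---\n".join(parts)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}"
    )

contexts = [
    ("HNSW builds a multi-layer proximity graph.", 0.88),
    ("Graph databases store nodes and relationships.", 0.91),
]
print(build_rag_prompt("What is HNSW?", contexts))
```

The character budget keeps the prompt inside the model's context window; a token-based budget would be more precise in production.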
Anomaly Detection
Identify outliers in embedding space:
-- Find anomalous user behavior
MATCH (u:User)
WITH u, u.behavior_embedding AS embedding
CALL vector.search({
index: 'user_behavior',
query: embedding,
k: 10
})
YIELD node, similarity
WHERE node <> u -- exclude the user's own vector (similarity 1.0)
WITH u, AVG(similarity) AS avg_similarity
WHERE avg_similarity < 0.5 -- Low similarity to nearest neighbors = anomaly
RETURN u.id, u.name, avg_similarity
ORDER BY avg_similarity ASC;
Image Search
Find visually similar images:
-- Image similarity using CLIP embeddings
MATCH (img:Image)
WHERE vector_similarity(img.clip_embedding, $query_image_embedding) > 0.85
RETURN img.url, img.caption, vector_similarity(img.clip_embedding, $query_image_embedding) AS similarity
ORDER BY similarity DESC
LIMIT 20;
Best Practices
Choosing HNSW Parameters
Balance accuracy, speed, and memory:
M (connections per layer):
- Low (4-8): Faster search, lower memory, lower recall
- Medium (12-24): Balanced (recommended for most cases)
- High (32-64): Better recall, more memory, slightly slower
ef_construction:
- Low (50-100): Faster index build, lower quality graph
- Medium (100-200): Balanced (recommended)
- High (400-800): Slower build, higher quality graph
ef_search:
- Low (10-50): Faster search, lower recall
- Medium (50-150): Balanced
- High (200-500): Better recall, slower search
- Can be adjusted per-query based on accuracy requirements
Embedding Generation
Use high-quality embeddings:
Choose appropriate model:
- Text: BERT, RoBERTa, sentence-transformers, OpenAI text-embedding-3
- Images: CLIP, ResNet, EfficientNet
- Multimodal: CLIP, ALIGN
Normalize embeddings: For cosine similarity, normalize to unit length:
embedding = model.encode(text)
embedding = embedding / np.linalg.norm(embedding)
Use consistent dimensions: All vectors in an index must have the same dimensionality.
Index Maintenance
Incremental updates: Add vectors as needed
-- Add new document to existing index
INSERT (:Document {
id: 'new-doc',
title: 'New Article',
embedding: [...] -- Automatically added to index
});
Rebuild for better quality: Periodically rebuild for optimal graph structure
-- Rebuild index with new parameters
DROP INDEX document_embeddings;
CREATE VECTOR INDEX document_embeddings FOR (d:Document) ON (d.embedding)
OPTIONS {m: 20, ef_construction: 300};
Query Optimization
Pre-filter when possible:
-- Efficient: Filter before vector search
MATCH (d:Document)
WHERE d.category = 'science'
AND d.year > 2020
AND vector_similarity(d.embedding, $query) > 0.8
RETURN d;
-- Less efficient: Vector search then filter
CALL vector.search({...})
YIELD node
WHERE node.category = 'science' -- Post-filter
RETURN node;
Adjust ef_search for accuracy needs:
-- High-recall search (slower)
CALL vector.search({index: '...', query: $q, k: 10, ef: 500})
YIELD node, similarity;
-- Fast search (may miss some results)
CALL vector.search({index: '...', query: $q, k: 10, ef: 50})
YIELD node, similarity;
Performance Characteristics
Time Complexity
- Construction: O(N log N) expected
- Search: O(log N) expected
- Insertion: O(log N) expected
- Deletion: O(log N) expected
Memory Usage
Memory ≈ N * (dimensions * 4 bytes + M * 8 bytes * avg_layers)
Example (1M float32 vectors, 768 dims, M=16, ~2.5 layers of links per node on average):
= 1M * (768 * 4 + 16 * 8 * 2.5)
= 1M * (3,072 + 320)
≈ 3.4 GB
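This estimate can be wrapped in a small helper; the 2.5-layer link factor matches the worked example above but is only a rough average (layer 0 actually uses up to 2M links per node):

```python
def hnsw_memory_bytes(n, dims, m, avg_layers=2.5,
                      bytes_per_dim=4, bytes_per_link=8):
    """Rough HNSW memory estimate: float32 vector storage plus
    8-byte neighbor links, averaged over ~avg_layers layers per node."""
    per_node = dims * bytes_per_dim + int(m * bytes_per_link * avg_layers)
    return n * per_node

# 1M vectors, 768 dims, M=16 -> about 3.4 GB
print(hnsw_memory_bytes(1_000_000, 768, 16) / 1e9)  # 3.392
```

Useful for capacity planning before building an index: halving dimensions (e.g. via a smaller embedding model) cuts vector storage proportionally, while M only affects the link term.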
Performance Benchmarks
Typical performance (10k vectors, 10-NN):
- Latency: 1-5ms at ~90% recall
- Recall@10: ~90% (parameter dependent)
- Notes: Performance varies with ef_search, vector dimensions, and hardware
Monitoring and Troubleshooting
Index Statistics
-- Check index statistics
CALL vector.index.stats('document_embeddings')
YIELD vectors, memory_mb, avg_connections, max_layer
RETURN vectors, memory_mb, avg_connections, max_layer;
Query Performance
-- Profile vector query
PROFILE CALL vector.search({index: '...', query: $q, k: 10})
YIELD node, similarity
RETURN node, similarity;
Common Issues
Low recall: Increase ef_search or rebuild with higher M/ef_construction
High latency: Decrease ef_search or optimize pre-filtering
Out of memory: Reduce M, use smaller embeddings, or partition data
Slow indexing: Reduce ef_construction or batch inserts
Related Topics
- Vector Search - General vector search capabilities
- Machine Learning - ML integration
- Embeddings - Working with embeddings
- Semantic Search - Semantic search applications
- Recommendation Systems - Building recommenders
- BM25 Ranking - Traditional text search ranking
Further Reading
- Vector Search - Complete vector search documentation
- Embeddings - Working with embeddings
- Performance - Performance optimization
- AI Integration - AI and machine learning integration
- Original HNSW Paper - Academic paper
Geode’s HNSW implementation brings vector similarity search to graph databases, enabling AI applications that combine semantic understanding with graph relationships.
Advanced HNSW Implementation Details
Layer Assignment Probability
HNSW layers are assigned using exponential decay:
P(layer >= l) = (1/M)^l
For M = 4:
- Layer 0: 100% of nodes
- Layer 1: 25% of nodes
- Layer 2: 6.25% of nodes
- Layer 3: 1.56% of nodes
Connection Strategy
Each node maintains:
- M connections at layers > 0
- 2M connections at layer 0 (base layer for higher recall)
Heuristic selection: rather than simply linking to the M closest candidates, HNSW keeps a candidate only if it is closer to the new node than to every neighbor already selected:
keep(candidate) if distance(candidate, base) < distance(candidate, s) for all selected neighbors s
This spreads connections across different directions, preserving navigability in clustered data.
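The selection heuristic (Algorithm 4 in the HNSW paper) can be sketched as follows; a candidate is kept only if it is closer to the base node than to any neighbor already kept, which prevents all M links pointing into a single cluster:

```python
import math

def select_neighbors_heuristic(base, candidates, vectors, m):
    """Keep a candidate only if it is closer to `base` than to any
    already-selected neighbor (HNSW's diversity-preserving heuristic)."""
    ordered = sorted(candidates, key=lambda c: math.dist(vectors[c], vectors[base]))
    selected = []
    for cand in ordered:
        d_base = math.dist(vectors[cand], vectors[base])
        if all(math.dist(vectors[cand], vectors[s]) > d_base for s in selected):
            selected.append(cand)
        if len(selected) == m:
            break
    return selected

# 'b' sits behind 'a' in the same direction, so it is pruned;
# 'c' covers a different direction and is kept.
vectors = {'q': (0.0, 0.0), 'a': (1.0, 0.0), 'b': (2.0, 0.0), 'c': (0.0, 1.0)}
print(select_neighbors_heuristic('q', ['a', 'b', 'c'], vectors, 3))  # ['a', 'c']
```

Note that 'b' is rejected even though m = 3 would allow it: 'a' already covers that direction, so the link would add no navigability.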
Query Optimization Strategies
Adaptive ef_search
Dynamically adjust search effort:
-- High-stakes query: maximize recall
CALL vector.search({
index: 'critical_docs',
query: $query,
k: 10,
ef: 500 -- Deep search
})
YIELD node, similarity;
-- Batch processing: optimize throughput
CALL vector.search({
index: 'product_catalog',
query: $query,
k: 10,
ef: 32 -- Fast approximate search
})
YIELD node, similarity;
Early Termination
Stop search when confidence is high:
async def adaptive_vector_search(client, query_emb, min_confidence=0.95):
    results = None
    for ef in [32, 64, 128, 256, 512]:
        results, _ = await client.query("""
            CALL vector.search({
              index: 'embeddings',
              query: $query,
              k: 10,
              ef: $ef
            })
            YIELD node, similarity
            RETURN node, similarity
            ORDER BY similarity DESC
        """, {"query": query_emb, "ef": ef})
        if results.rows and results.rows[0]['similarity'] >= min_confidence:
            return results.rows  # Early termination: top hit is confident enough
    return results.rows  # Max effort reached
Index Construction Strategies
Bulk Loading Optimization
Build index from sorted data:
-- Sort nodes by degree (high-degree nodes first)
MATCH (n:Node)
WITH n, SIZE((n)-[:RELATED]-()) AS degree
ORDER BY degree DESC
-- Insert in batches
CALL {
WITH n
SET n.embedding = $computed_embedding
} IN TRANSACTIONS OF 10000 ROWS;
-- Rebuild index after bulk load
DROP INDEX embeddings_idx;
CREATE VECTOR INDEX embeddings_idx FOR (n:Node) ON (n.embedding)
OPTIONS {m: 16, ef_construction: 200};
Incremental Index Maintenance
-- Track index quality metric
MATCH (stats:IndexStats {index_name: 'embeddings_idx'})
WITH stats.inserts_since_rebuild AS inserts,
stats.total_vectors AS total
WITH inserts * 1.0 / total AS insert_ratio
WHERE insert_ratio > 0.1 -- 10% growth
-- Trigger rebuild
CALL vector.index.rebuild('embeddings_idx', {
m: 20, -- Increase connections for larger graph
ef_construction: 300
});
Distance Metrics Deep Dive
Cosine Similarity
Best for normalized vectors:
cosine(u, v) = (u · v) / (||u|| × ||v||)
             = Σ(ui × vi) / (sqrt(Σui²) × sqrt(Σvi²))
Range: [-1, 1]
- 1: Identical direction
- 0: Orthogonal
- -1: Opposite direction
Optimization: Pre-normalize vectors to unit length, then use dot product:
-- Pre-normalize at insertion
MATCH (n:Node)
SET n.embedding = vector.normalize(n.embedding);
-- Use dot product (equivalent to cosine for normalized vectors)
CALL vector.search({
index: 'normalized_embeddings',
query: vector.normalize($query),
metric: 'dot_product' -- Faster than cosine
});
Euclidean Distance (L2)
Measures absolute distance:
euclidean(u, v) = sqrt(Σ(ui - vi)²)
Properties:
- Sensitive to magnitude
- Triangle inequality holds
- Metric space properties
Manhattan Distance (L1)
Sum of absolute differences:
manhattan(u, v) = Σ|ui - vi|
Use cases:
- Sparse vectors
- Grid-based distances
- Outlier-robust similarity
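For concreteness, the three metrics computed on plain Python lists (illustrative helpers, not Geode's built-in functions):

```python
import math

def cosine(u, v):
    """Cosine similarity: dot product over the product of norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def euclidean(u, v):
    """L2 distance: square root of summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def manhattan(u, v):
    """L1 distance: sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(u, v))

u, v = [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]
print(cosine(u, v))     # ~0.5 (vectors 60 degrees apart)
print(euclidean(u, v))  # ~1.414 (sqrt(2))
print(manhattan(u, v))  # 2.0
```

Note that cosine ignores magnitude entirely, while euclidean and manhattan do not; that difference drives the normalization advice above.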
Production Deployment Patterns
Multi-Index Strategy
Separate indexes for different embedding types:
-- Text embeddings (768d, BERT)
CREATE VECTOR INDEX text_embeddings FOR (d:Document) ON (d.text_embedding)
OPTIONS {dimensions: 768, similarity: 'cosine', m: 16};
-- Image embeddings (512d, CLIP)
CREATE VECTOR INDEX image_embeddings FOR (p:Product) ON (p.image_embedding)
OPTIONS {dimensions: 512, similarity: 'cosine', m: 20};
-- User embeddings (128d, custom)
CREATE VECTOR INDEX user_embeddings FOR (u:User) ON (u.behavior_embedding)
OPTIONS {dimensions: 128, similarity: 'euclidean', m: 12};
-- Query appropriate index
CALL vector.search({index: 'text_embeddings', query: $text_query})
YIELD node;
Backup and Recovery
-- Export index state
CALL vector.index.export('embeddings_idx', '/backup/embeddings_idx_20250124.bin');
-- Restore from backup
CALL vector.index.import('embeddings_idx', '/backup/embeddings_idx_20250124.bin');
-- Verify index integrity
CALL vector.index.verify('embeddings_idx')
YIELD vectors, corrupted_entries, avg_degree
WHERE corrupted_entries = 0
RETURN 'Index healthy' AS status;
Benchmarking and Performance
Recall Measurement
import numpy as np

async def measure_recall(client, test_queries, ground_truth, k=10):
    recalls = []
    for query_emb, true_neighbors in zip(test_queries, ground_truth):
        # HNSW approximate search
        results, _ = await client.query("""
            CALL vector.search({
              index: 'test_index',
              query: $query,
              k: $k
            })
            YIELD node
            RETURN node.id AS id
        """, {"query": query_emb, "k": k})
        retrieved = {row['id'] for row in results.rows}
        relevant = set(true_neighbors[:k])
        recall = len(retrieved & relevant) / k
        recalls.append(recall)
    return np.mean(recalls)
# Typical results:
# ef_search=50, M=16: recall@10 ≈ 0.93, latency ≈ 2ms
# ef_search=100, M=16: recall@10 ≈ 0.97, latency ≈ 5ms
# ef_search=200, M=32: recall@10 ≈ 0.99, latency ≈ 15ms
Throughput vs Latency Trade-offs
# Latency-optimized (single query)
config_latency = {
    "m": 12,
    "ef_construction": 100,
    "ef_search": 32,
    "batch_size": 1
}
# Latency and recall depend on dataset, dimensions, and ef_search

# Throughput-optimized (batch queries)
config_throughput = {
    "m": 16,
    "ef_construction": 200,
    "ef_search": 64,
    "batch_size": 100
}
# Throughput and recall depend on dataset, dimensions, and hardware
Further Reading
- HNSW Original Paper: Malkov & Yashunin (2016) - Efficient and Robust ANN Search
- Graph-Based ANN: NSW, HNSW, NSG, and DiskANN Algorithms
- Distance Metrics: Cosine, Euclidean, Inner Product, and Custom Metrics
- Index Tuning: M, ef_construction, ef_search Parameter Optimization
- Production Systems: Scaling to Billions of Vectors
Browse tagged content for comprehensive HNSW and vector search documentation.