<!-- CANARY: REQ=REQ-DOCS-001; FEATURE="Docs"; ASPECT=Documentation; STATUS=TESTED; OWNER=docs; UPDATED=2026-01-15 -->
<p>Documentation tagged with <strong>Hierarchical Navigable Small World (HNSW)</strong> in the Geode graph database. HNSW is an algorithm for approximate nearest neighbor (ANN) search in high-dimensional vector spaces, enabling efficient vector similarity search for machine learning applications.</p>
<h3 id="introduction-to-hnsw" class="position-relative d-flex align-items-center group">
<span>Introduction to HNSW</span>
</h3>
<p>Hierarchical Navigable Small World (HNSW) is a graph-based algorithm for approximate nearest neighbor search that has become the industry standard for vector similarity search. Developed by Yury Malkov and Dmitry Yashunin in 2016, HNSW builds a multi-layer proximity graph that enables logarithmic search complexity while maintaining high recall.</p>
<p>The algorithm solves a critical problem in modern AI applications: how to efficiently search through millions or billions of high-dimensional vectors (embeddings) to find the most similar items. Exact search costs O(N) per query because the query must be compared against every stored vector. HNSW achieves sub-linear search time by navigating a layered proximity graph instead.</p>
<p>HNSW is used in:</p>
<ul>
<li><strong>Semantic search</strong>: Find documents similar to a query embedding</li>
<li><strong>Recommendation systems</strong>: Discover similar products, content, or users</li>
<li><strong>Image search</strong>: Find visually similar images</li>
<li><strong>Anomaly detection</strong>: Identify outliers in embedding space</li>
<li><strong>Retrieval-Augmented Generation (RAG)</strong>: Find relevant context for LLMs</li>
</ul>
<p>Geode’s HNSW implementation integrates vector search seamlessly with graph queries, enabling powerful combined operations like “find similar products purchased by friends of this user.”</p>
<h3 id="core-hnsw-concepts" class="position-relative d-flex align-items-center group">
<span>Core HNSW Concepts</span>
</h3>
<h4 id="navigable-small-world-graphs" class="position-relative d-flex align-items-center group">
<span>Navigable Small World Graphs</span>
</h4><p>HNSW builds on the concept of “small world” networks—graphs where most nodes can be reached from any other node in a small number of hops, despite the network’s large size. Examples include social networks (six degrees of separation) and the World Wide Web.</p>
<p>A navigable small world graph adds long-range connections that enable efficient greedy search:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-fallback" data-lang="fallback"><span class="line"><span class="cl">Layer 2: A ------- B
</span></span><span class="line"><span class="cl">         |         |
</span></span><span class="line"><span class="cl">Layer 1: A -- C -- B -- D
</span></span><span class="line"><span class="cl">         |    |    |    |
</span></span><span class="line"><span class="cl">Layer 0: A-C-E-B-D-F-G-H  (all nodes)
</span></span></code></pre></div><p>Search starts at the top layer (sparse, long-range connections) and descends to lower layers (dense, short-range connections), refining the result at each level.</p>
<h4 id="hierarchical-construction" class="position-relative d-flex align-items-center group">
<span>Hierarchical Construction</span>
</h4><p>HNSW uses a hierarchical structure with multiple layers:</p>
<ul>
<li><strong>Layer 0</strong>: Contains all vectors with dense connections to nearby neighbors</li>
<li><strong>Layer 1+</strong>: Contain progressively fewer vectors, selected probabilistically</li>
<li><strong>Top layer</strong>: Has very few nodes, enabling fast initial navigation</li>
</ul>
<p>Each node’s maximum layer is chosen randomly using an exponentially decaying probability:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-fallback" data-lang="fallback"><span class="line"><span class="cl">P(layer &gt;= l) = (1/M)^l
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">For example, with M = 5:
</span></span><span class="line"><span class="cl">- 100% of nodes at layer 0
</span></span><span class="line"><span class="cl">- 20% of nodes at layer 1 or above
</span></span><span class="line"><span class="cl">- 4% of nodes at layer 2 or above
</span></span><span class="line"><span class="cl">- 0.8% of nodes at layer 3 or above
</span></span></code></pre></div><p>This creates a logarithmic search structure similar to skip lists.</p>
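<p>One common way to realise this distribution, following the original HNSW paper, is to draw <code>layer = floor(-ln(U) * mL)</code> with <code>mL = 1/ln(M)</code> and <code>U</code> uniform on (0, 1], so that a node reaches layer l or higher with probability (1/M)^l. The sketch below checks this empirically; the function name <code>sample_max_layer</code>, the seed, and the sample size are assumptions made for illustration, not Geode internals.</p>

```python
import math
import random
from collections import Counter


def sample_max_layer(m, rng):
    """Draw a node's top layer so that P(layer >= l) = (1/m) ** l."""
    m_l = 1.0 / math.log(m)          # level normalisation factor, mL = 1 / ln(M)
    u = 1.0 - rng.random()           # uniform in (0, 1], which avoids log(0)
    return int(-math.log(u) * m_l)   # floor of an exponentially distributed draw


rng = random.Random(42)              # fixed seed so the run is repeatable
n = 100_000
layers = Counter(sample_max_layer(5, rng) for _ in range(n))

# A node is present on every layer up to and including its top layer.
for level in range(4):
    present = sum(count for top, count in layers.items() if top >= level)
    print(f"layer {level}: {present / n:.1%} of nodes")
```

<p>With M = 5 the printed fractions should land close to 100%, 20%, 4%, and 0.8%, up to sampling noise.</p>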
<h4 id="greedy-search-algorithm" class="position-relative d-flex align-items-center group">
<span>Greedy Search Algorithm</span>
</h4><p>HNSW search proceeds in five steps:</p>
<ol>
<li><strong>Start at entry point</strong>: Begin at the index’s entry point, a node in the top layer</li>
<li><strong>Greedy local search</strong>: Move to the neighbor closest to the query</li>
<li><strong>Repeat until local minimum</strong>: Stop when no neighbor is closer</li>
<li><strong>Descend layer</strong>: Drop to the next layer and continue the search from the node just reached</li>
<li><strong>Return results</strong>: At layer 0, widen the search to a candidate list of <code>ef_search</code> nodes and return the k nearest</li>
</ol>
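<p>Because the expected number of layers grows as O(log N) and each layer needs only a bounded number of hops on average, search cost scales roughly as O(log N) in practice.</p>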
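<p>The descent can be sketched on a toy graph shaped like the layer diagram earlier in this section. This is a simplified illustration, not Geode’s implementation: it keeps a single candidate on every layer, whereas a full HNSW search keeps a candidate list at layer 0. The helper names (<code>chain</code>, <code>greedy_layer_search</code>, <code>hnsw_descend</code>) and the one-dimensional points are assumptions made for the example.</p>

```python
def chain(names):
    """Link consecutive names in both directions."""
    links = {name: [] for name in names}
    for left, right in zip(names, names[1:]):
        links[left].append(right)
        links[right].append(left)
    return links


def greedy_layer_search(query, entry, neighbors, points, dist):
    """Walk to ever-closer neighbours until none improves (a local minimum)."""
    current = entry
    improved = True
    while improved:
        improved = False
        for candidate in neighbors.get(current, []):
            if dist(points[candidate], query) < dist(points[current], query):
                current = candidate
                improved = True
    return current


def hnsw_descend(query, entry, layers, points, dist):
    """Search each layer from top to bottom, reusing the result as the next entry."""
    current = entry
    for neighbors in layers:              # layers[0] is the top (sparsest) layer
        current = greedy_layer_search(query, current, neighbors, points, dist)
    return current


# Toy data: eight points on a line, placed in the order used by the diagram.
order = ["A", "C", "E", "B", "D", "F", "G", "H"]
points = {name: float(i) for i, name in enumerate(order)}
layers = [chain(["A", "B"]), chain(["A", "C", "B", "D"]), chain(order)]

nearest = hnsw_descend(5.2, "A", layers, points, lambda a, b: abs(a - b))
print(nearest)
```

<p>Starting from A, the search jumps to B on the top layer, moves to D on the middle layer, and finishes at F on layer 0, which is the true nearest point to 5.2 in this toy data.</p>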
<h4 id="construction-algorithm" class="position-relative d-flex align-items-center group">
<span>Construction Algorithm</span>
</h4><p>Building an HNSW index:</p>
<ol>
<li><strong>For each vector to insert</strong>:
<ul>
<li>Choose maximum layer l randomly</li>
<li>Find nearest neighbors using greedy search, keeping <code>ef_construction</code> candidates per layer</li>
<li>Connect to M nearest neighbors at each layer</li>
<li>Allow up to Mmax0 connections (typically 2·M) at layer 0 for higher accuracy</li>
<li>Prune connections that exceed the per-layer limit to maintain navigability</li>
</ul>
</li>
</ol>
<p>The construction is online—you can add vectors incrementally without rebuilding the entire index.</p>
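<p>The connect-and-prune step can be sketched for a single layer as follows. Two simplifications are assumed here and should be read as such: an exhaustive scan stands in for the graph search, and pruning simply keeps the closest neighbours, whereas the HNSW paper also describes a selection heuristic that favours diverse links. The function name <code>insert</code> and its parameters are hypothetical.</p>

```python
def insert(graph, points, new_id, vector, m, m_max, dist):
    """Add one node to a single layer: link to its m nearest nodes, then prune."""
    # Stand-in for the graph search: rank all existing nodes by distance.
    nearest = sorted(graph, key=lambda node: dist(points[node], vector))[:m]
    points[new_id] = vector
    graph[new_id] = list(nearest)
    for node in nearest:
        graph[node].append(new_id)        # add the reverse link
        if len(graph[node]) > m_max:
            # Simple pruning rule: keep only the m_max closest neighbours.
            # The dropped neighbour's own list is left untouched in this sketch.
            graph[node].sort(key=lambda other: dist(points[other], points[node]))
            del graph[node][m_max:]


graph, points = {}, {}
for node_id, value in enumerate([0.0, 1.0, 2.0, 3.0, 10.0]):
    insert(graph, points, node_id, value, m=2, m_max=3, dist=lambda a, b: abs(a - b))

for node_id, links in graph.items():
    print(node_id, links)
```

<p>Each call touches only the new node and its chosen neighbours, which is what makes incremental insertion possible. With the closest-only rule an outlier such as the point at 10.0 can lose its incoming links during pruning, which is one motivation for more careful neighbour selection.</p>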
<h3 id="how-hnsw-works-in-geode" class="position-relative d-flex align-items-center group">
<span>How HNSW Works in Geode</span>
</h3>
<h4 id="vector-properties" class="position-relative d-flex align-items-center group">
<span>Vector Properties</span>
</h4><p>Store embeddings as node properties:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-gql" data-lang="gql"><span class="line"><span class="cl"><span class="err">--</span><span class="w"> </span><span class="py">Create</span><span class="w"> </span><span class="py">node</span><span class="w"> </span><span class="py">with</span><span class="w"> </span><span class="py">embedding</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">INSERT</span><span class="w"> </span><span class="p">(:</span><span class="nc">Document</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">id</span><span class="p">:</span><span class="w"> </span><span class="err">'</span><span class="nc">doc</span><span class="err">-</span><span class="py">123</span><span class="err">'</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">title</span><span class="p">:</span><span class="w"> </span><span class="err">'</span><span class="nc">Introduction</span><span class="w"> </span><span class="py">to</span><span class="w"> </span><span class="py">Graph</span><span class="w"> </span><span class="py">Databases</span><span class="err">'</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">content</span><span class="p">:</span><span class="w"> </span><span class="err">'</span><span class="kd">...</span><span class="err">'</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nc">embedding</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="nc">0</span><span class="mf">.23</span><span class="p">,</span><span class="w"> </span><span class="err">-</span><span class="py">0</span><span class="mf">.45</span><span class="p">,</span><span class="w"> </span><span class="py">0</span><span class="mf">.67</span><span class="p">,</span><span class="w"> </span><span class="kd">...</span><span class="p">,</span><span class="w"> </span><span class="py">0</span><span class="mf">.12</span><span class="p">]</span><span class="w"> </span><span class="err">--</span><span class="w"> </span><span class="py">768</span><span class="err">-</span><span class="py">dimensional</span><span class="w"> </span><span class="py">vector</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">})</span><span class="err">;</span><span class="w">
</span></span></span></code></pre></div>
<h4 id="creating-hnsw-indexes" class="position-relative d-flex align-items-center group">
<span>Creating HNSW Indexes</span>
</h4><p>Build an HNSW index on vector properties:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-gql" data-lang="gql"><span class="line"><span class="cl"><span class="err">--</span><span class="w"> </span><span class="py">Create</span><span class="w"> </span><span class="py">HNSW</span><span class="w"> </span><span class="py">index</span><span class="w"> </span><span class="py">for</span><span class="w"> </span><span class="py">semantic</span><span class="w"> </span><span class="py">search</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">CREATE</span><span class="w"> </span><span class="py">VECTOR</span><span class="w"> </span><span class="py">INDEX</span><span class="w"> </span><span class="py">document_embeddings</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">FOR</span><span class="w"> </span><span class="p">(</span><span class="py">d</span><span class="p">:</span><span class="nc">Document</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">ON</span><span class="w"> </span><span class="p">(</span><span class="py">d</span><span class="err">.</span><span class="py">embedding</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">OPTIONS</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">dimensions</span><span class="p">:</span><span class="w"> </span><span class="nc">768</span><span class="p">,</span><span class="w"> </span><span class="err">--</span><span class="w"> </span><span class="py">Embedding</span><span class="w"> </span><span class="py">dimensionality</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">similarity</span><span class="p">:</span><span class="w"> </span><span class="err">'</span><span class="nc">cosine</span><span class="err">'</span><span class="p">,</span><span class="w"> </span><span class="err">--</span><span class="w"> </span><span class="py">cosine</span><span class="p">,</span><span class="w"> </span><span class="py">euclidean</span><span class="p">,</span><span class="w"> </span><span class="py">or</span><span class="w"> </span><span class="py">dot_product</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">m</span><span class="p">:</span><span class="w"> </span><span class="nc">16</span><span class="p">,</span><span class="w"> </span><span class="err">--</span><span class="w"> </span><span class="py">Connections</span><span class="w"> </span><span class="py">per</span><span class="w"> </span><span class="py">layer</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">ef_construction</span><span class="p">:</span><span class="w"> </span><span class="nc">200</span><span class="p">,</span><span class="w"> </span><span class="err">--</span><span class="w"> </span><span class="py">Build</span><span class="err">-</span><span class="py">time</span><span class="w"> </span><span class="py">search</span><span class="w"> </span><span class="py">depth</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">ef_search</span><span class="p">:</span><span class="w"> </span><span class="nc">100</span><span class="w"> </span><span class="err">--</span><span class="w"> </span><span class="py">Query</span><span class="err">-</span><span class="py">time</span><span class="w"> </span><span class="py">search</span><span class="w"> </span><span class="py">depth</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="err">;</span><span class="w">
</span></span></span></code></pre></div><p>Parameters:</p>
<ul>
<li><strong>dimensions</strong>: Vector dimensionality, which must match the embedding model (e.g., 768 for BERT-base, 1536 for OpenAI’s text-embedding-3-small)</li>
<li><strong>similarity</strong>: Distance metric (cosine, euclidean, dot product)</li>
<li><strong>m</strong>: Number of bi-directional connections per node (trade-off: higher = better accuracy but more memory)</li>
<li><strong>ef_construction</strong>: Size of candidate set during construction (higher = better quality graph)</li>
<li><strong>ef_search</strong>: Size of candidate set during search (higher = better recall but slower)</li>
</ul>
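<p>The memory side of the <code>m</code> trade-off can be approximated with simple arithmetic. The sketch below is a back-of-the-envelope estimate under stated assumptions: 4-byte float components, 4-byte neighbour identifiers, up to 2·m links per node at layer 0, and no accounting for upper layers or bookkeeping. Geode’s actual storage layout is not described here and may differ; <code>estimate_index_bytes</code> is a hypothetical helper.</p>

```python
def estimate_index_bytes(num_vectors, dimensions, m, bytes_per_float=4, bytes_per_link=4):
    """Rough estimate: raw vectors plus layer-0 links (up to 2 * m per node)."""
    vector_bytes = num_vectors * dimensions * bytes_per_float
    link_bytes = num_vectors * 2 * m * bytes_per_link
    return vector_bytes + link_bytes


# One million 768-dimensional vectors at several values of m.
for m in (8, 16, 32):
    total = estimate_index_bytes(1_000_000, 768, m)
    print(f"m={m}: about {total / 2**30:.2f} GiB")
```

<p>Under these assumptions the vectors themselves dominate at 768 dimensions, and doubling <code>m</code> doubles only the link portion of the total.</p>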
<h4 id="vector-similarity-search" class="position-relative d-flex align-items-center group">
<span>Vector Similarity Search</span>
</h4><p>Query similar vectors:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-gql" data-lang="gql"><span class="line"><span class="cl"><span class="err">--</span><span class="w"> </span><span class="py">Find</span><span class="w"> </span><span class="py">10</span><span class="w"> </span><span class="py">most</span><span class="w"> </span><span class="py">similar</span><span class="w"> </span><span class="py">documents</span><span class="w"> </span><span class="py">to</span><span class="w"> </span><span class="py">a</span><span class="w"> </span><span class="kd">query</span><span class="w"> </span><span class="nc">embedding</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">MATCH</span><span class="w"> </span><span class="p">(</span><span class="py">d</span><span class="p">:</span><span class="nc">Document</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">WHERE</span><span class="w"> </span><span class="py">vector_similarity</span><span class="p">(</span><span class="py">d</span><span class="err">.</span><span class="py">embedding</span><span class="p">,</span><span class="w"> </span><span class="nv">$query_embedding</span><span class="p">)</span><span class="w"> </span><span class="err">></span><span class="w"> </span><span class="py">0</span><span class="mf">.7</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">RETURN</span><span class="w"> </span><span class="py">d</span><span class="err">.</span><span class="py">title</span><span class="p">,</span><span class="w"> </span><span class="py">d</span><span class="err">.</span><span class="py">id</span><span class="p">,</span><span class="w"> </span><span class="py">vector_similarity</span><span class="p">(</span><span class="py">d</span><span class="err">.</span><span class="py">embedding</span><span class="p">,</span><span class="w"> </span><span class="nv">$query_embedding</span><span class="p">)</span><span class="w"> </span><span class="py">AS</span><span class="w"> </span><span class="py">similarity</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">ORDER</span><span class="w"> </span><span class="py">BY</span><span class="w"> </span><span class="py">similarity</span><span class="w"> </span><span class="py">DESC</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">LIMIT</span><span class="w"> </span><span class="py">10</span><span class="err">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="err">--</span><span class="w"> </span><span class="py">Or</span><span class="w"> </span><span class="py">use</span><span class="w"> </span><span class="py">dedicated</span><span class="w"> </span><span class="py">function</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">CALL</span><span class="w"> </span><span class="py">vector</span><span class="err">.</span><span class="py">search</span><span class="p">({</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">index</span><span class="p">:</span><span class="w"> </span><span class="err">'</span><span class="nc">document_embeddings</span><span class="err">'</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="kd">query</span><span class="p">:</span><span class="w"> </span><span class="nv">$query_embedding</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nc">k</span><span class="p">:</span><span class="w"> </span><span class="nc">10</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nc">ef</span><span class="p">:</span><span class="w"> </span><span class="nc">150</span><span class="w"> </span><span class="err">--</span><span class="w"> </span><span class="py">Override</span><span class="w"> </span><span class="py">default</span><span class="w"> </span><span class="py">ef_search</span><span class="w"> </span><span class="py">for</span><span class="w"> </span><span class="py">this</span><span class="w"> </span><span class="kd">query</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nc">YIELD</span><span class="w"> </span><span class="py">node</span><span class="p">,</span><span class="w"> </span><span class="py">similarity</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">RETURN</span><span class="w"> </span><span class="py">node</span><span class="err">.</span><span class="py">title</span><span class="p">,</span><span class="w"> </span><span class="py">similarity</span><span class="err">;</span><span class="w">
</span></span></span></code></pre></div>
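<p>To make the semantics of the first query concrete, the sketch below reproduces its filter, sort, and limit as an exact scan in plain Python. It assumes that <code>vector_similarity</code> under the cosine metric returns the standard cosine similarity, which is not spelled out above; the helper names and the tiny two-dimensional embeddings are hypothetical.</p>

```python
import math


def cosine_similarity(a, b):
    """Standard cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similar_documents(documents, query_embedding, threshold=0.7, limit=10):
    """Exact counterpart of the WHERE / ORDER BY / LIMIT query."""
    scored = [
        (doc["title"], doc["id"], cosine_similarity(doc["embedding"], query_embedding))
        for doc in documents
    ]
    kept = [row for row in scored if row[2] > threshold]   # WHERE ... > 0.7
    kept.sort(key=lambda row: row[2], reverse=True)        # ORDER BY similarity DESC
    return kept[:limit]                                    # LIMIT 10


# Hypothetical 2-dimensional embeddings, far smaller than real ones.
documents = [
    {"id": "doc-1", "title": "Graphs", "embedding": [1.0, 0.0]},
    {"id": "doc-2", "title": "Vectors", "embedding": [1.0, 1.0]},
    {"id": "doc-3", "title": "Cooking", "embedding": [0.0, 1.0]},
]
results = similar_documents(documents, [1.0, 0.1])
for title, doc_id, similarity in results:
    print(title, doc_id, round(similarity, 3))
```

<p>This scan compares the query against every document, which is the O(N) cost that an HNSW index is meant to avoid; results from the index are approximate and may occasionally differ from the exact scan.</p>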
<h4 id="combining-vector-and-graph-queries" class="position-relative d-flex align-items-center group">
<span>Combining Vector and Graph Queries</span>
</h4><p>Vector search is most useful when combined with graph traversal in a single query:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-gql" data-lang="gql"><span class="line"><span class="cl"><span class="err">--</span><span class="w"> </span><span class="py">Find</span><span class="w"> </span><span class="py">similar</span><span class="w"> </span><span class="py">products</span><span class="w"> </span><span class="py">purchased</span><span class="w"> </span><span class="py">by</span><span class="w"> </span><span class="py">friends</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">MATCH</span><span class="w"> </span><span class="p">(</span><span class="py">me</span><span class="p">:</span><span class="nc">User</span><span class="w"> </span><span class="p">{</span><span class="py">id</span><span class="p">:</span><span class="w"> </span><span class="nv">$userId</span><span class="p">})</span><span class="err">-</span><span class="p">[:</span><span class="nc">FRIEND</span><span class="p">]</span><span class="err">-></span><span class="p">(</span><span class="nc">friend</span><span class="p">:</span><span class="nc">User</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="err">-</span><span class="p">[:</span><span class="nc">PURCHASED</span><span class="p">]</span><span class="err">-></span><span class="p">(</span><span class="py">product</span><span class="p">:</span><span class="nc">Product</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">WHERE</span><span class="w"> </span><span class="py">vector_similarity</span><span class="p">(</span><span class="py">product</span><span class="err">.</span><span class="py">embedding</span><span class="p">,</span><span class="w"> </span><span class="nv">$query_embedding</span><span class="p">)</span><span class="w"> </span><span class="err">></span><span class="w"> </span><span class="py">0</span><span class="mf">.8</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">RETURN</span><span class="w"> </span><span class="py">DISTINCT</span><span class="w"> </span><span class="py">product</span><span class="err">.</span><span class="py">name</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">vector_similarity</span><span class="p">(</span><span class="py">product</span><span class="err">.</span><span class="py">embedding</span><span class="p">,</span><span class="w"> </span><span class="nv">$query_embedding</span><span class="p">)</span><span class="w"> </span><span class="py">AS</span><span class="w"> </span><span class="py">similarity</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">COUNT</span><span class="p">(</span><span class="py">DISTINCT</span><span class="w"> </span><span class="py">friend</span><span class="p">)</span><span class="w"> </span><span class="py">AS</span><span class="w"> </span><span class="py">friend_count</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">ORDER</span><span class="w"> </span><span class="py">BY</span><span class="w"> </span><span class="py">similarity</span><span class="w"> </span><span class="py">DESC</span><span class="p">,</span><span class="w"> </span><span class="py">friend_count</span><span class="w"> </span><span class="py">DESC</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">LIMIT</span><span class="w"> </span><span class="py">10</span><span class="err">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="err">--</span><span class="w"> </span><span class="py">Semantic</span><span class="w"> </span><span class="py">search</span><span class="w"> </span><span class="py">with</span><span class="w"> </span><span class="py">metadata</span><span class="w"> </span><span class="py">filtering</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">MATCH</span><span class="w"> </span><span class="p">(</span><span class="py">doc</span><span class="p">:</span><span class="nc">Document</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">WHERE</span><span class="w"> </span><span class="py">doc</span><span class="err">.</span><span class="py">category</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="err">'</span><span class="py">technical</span><span class="err">'</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">AND</span><span class="w"> </span><span class="py">doc</span><span class="err">.</span><span class="py">publish_date</span><span class="w"> </span><span class="err">></span><span class="w"> </span><span class="py">date</span><span class="p">(</span><span class="err">'</span><span class="py">2024</span><span class="err">-</span><span class="py">01</span><span class="err">-</span><span class="py">01</span><span class="err">'</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">AND</span><span class="w"> </span><span class="py">vector_similarity</span><span class="p">(</span><span class="py">doc</span><span class="err">.</span><span class="py">embedding</span><span class="p">,</span><span class="w"> </span><span class="nv">$query_embedding</span><span class="p">)</span><span class="w"> </span><span class="err">></span><span class="w"> </span><span class="py">0</span><span class="mf">.75</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">RETURN</span><span class="w"> </span><span class="py">doc</span><span class="err">.</span><span class="py">title</span><span class="p">,</span><span class="w"> </span><span class="py">doc</span><span class="err">.</span><span class="py">author</span><span class="p">,</span><span class="w"> </span><span class="py">vector_similarity</span><span class="p">(</span><span class="py">doc</span><span class="err">.</span><span class="py">embedding</span><span class="p">,</span><span class="w"> </span><span class="nv">$query_embedding</span><span class="p">)</span><span class="w"> </span><span class="py">AS</span><span class="w"> </span><span class="py">score</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">ORDER</span><span class="w"> </span><span class="py">BY</span><span class="w"> </span><span class="py">score</span><span class="w"> </span><span class="py">DESC</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">LIMIT</span><span class="w"> </span><span class="py">20</span><span class="err">;</span><span class="w">
</span></span></span></code></pre></div>
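<p>The filter-then-score pattern of the second query can be sketched in plain Python. This is a minimal illustration, not Geode's implementation: it assumes <code>vector_similarity</code> computes cosine similarity (the configured metric may differ), and the helper names and sample documents are hypothetical.</p>

```python
import math
from datetime import date


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def filtered_search(docs, query, category, published_after, threshold, limit):
    """Apply cheap metadata predicates first, then score the survivors."""
    scored = []
    for doc in docs:
        if doc["category"] != category or doc["publish_date"] <= published_after:
            continue  # rejected before any vector math is done
        score = cosine_similarity(doc["embedding"], query)
        if score > threshold:
            scored.append((doc["title"], score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Hypothetical sample data: only the first document passes every predicate.
docs = [
    {"title": "Index tuning", "category": "technical",
     "publish_date": date(2024, 6, 1), "embedding": [1.0, 0.0]},
    {"title": "Old notes", "category": "technical",
     "publish_date": date(2023, 1, 1), "embedding": [1.0, 0.0]},
    {"title": "Release party", "category": "social",
     "publish_date": date(2024, 7, 1), "embedding": [1.0, 0.0]},
    {"title": "Unrelated", "category": "technical",
     "publish_date": date(2024, 8, 1), "embedding": [0.0, 1.0]},
]
results = filtered_search(docs, [1.0, 0.1], "technical", date(2024, 1, 1), 0.75, 20)
```

<p>Checking the metadata first keeps the similarity computation off documents that could never appear in the result.</p>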
<h3 id="use-cases" class="position-relative d-flex align-items-center group">
<span>Use Cases</span>
</h3>
<h4 id="semantic-search" class="position-relative d-flex align-items-center group">
<span>Semantic Search</span>
</h4><p>Find documents by meaning, not just keywords:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-gql" data-lang="gql"><span class="line"><span class="cl"><span class="err">--</span><span class="w"> </span><span class="py">Traditional</span><span class="w"> </span><span class="py">keyword</span><span class="w"> </span><span class="py">search</span><span class="p">:</span><span class="w"> </span><span class="nc">misses</span><span class="w"> </span><span class="py">synonyms</span><span class="p">,</span><span class="w"> </span><span class="py">context</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">MATCH</span><span class="w"> </span><span class="p">(</span><span class="py">d</span><span class="p">:</span><span class="nc">Document</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">WHERE</span><span class="w"> </span><span class="py">d</span><span class="err">.</span><span class="py">content</span><span class="w"> </span><span class="py">CONTAINS</span><span class="w"> </span><span class="err">'</span><span class="py">database</span><span class="err">'</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">RETURN</span><span class="w"> </span><span class="py">d</span><span class="err">.</span><span class="py">title</span><span class="err">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="err">--</span><span class="w"> </span><span class="py">Semantic</span><span class="w"> </span><span class="py">search</span><span class="p">:</span><span class="w"> </span><span class="nc">understands</span><span class="w"> </span><span class="py">meaning</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">CALL</span><span class="w"> </span><span class="py">vector</span><span class="err">.</span><span class="py">search</span><span class="p">({</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">index</span><span class="p">:</span><span class="w"> </span><span class="err">'</span><span class="nc">document_embeddings</span><span class="err">'</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="kd">query</span><span class="p">:</span><span class="w"> </span><span class="nv">$query_embedding</span><span class="p">,</span><span class="w"> </span><span class="err">--</span><span class="w"> </span><span class="nc">Embedding</span><span class="w"> </span><span class="nc">of</span><span class="w"> </span><span class="s">"systems for storing data"</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">k</span><span class="p">:</span><span class="w"> </span><span class="nc">10</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">YIELD</span><span class="w"> </span><span class="py">node</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">RETURN</span><span class="w"> </span><span class="py">node</span><span class="err">.</span><span class="py">title</span><span class="err">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="err">--</span><span class="w"> </span><span class="py">Returns</span><span class="w"> </span><span class="py">documents</span><span class="w"> </span><span class="py">about</span><span class="w"> </span><span class="py">databases</span><span class="p">,</span><span class="w"> </span><span class="py">even</span><span class="w"> </span><span class="py">without</span><span class="w"> </span><span class="py">keyword</span><span class="w"> </span><span class="s">"database"</span><span class="w">
</span></span></span></code></pre></div>
<h4 id="recommendation-systems" class="position-relative d-flex align-items-center group">
<span>Recommendation Systems</span>
</h4><p>Discover similar items:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-gql" data-lang="gql"><span class="line"><span class="cl"><span class="err">--</span><span class="w"> </span><span class="py">Content</span><span class="err">-</span><span class="py">based</span><span class="w"> </span><span class="py">recommendations</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">MATCH</span><span class="w"> </span><span class="p">(</span><span class="py">item</span><span class="p">:</span><span class="nc">Product</span><span class="w"> </span><span class="p">{</span><span class="py">id</span><span class="p">:</span><span class="w"> </span><span class="nv">$productId</span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nc">CALL</span><span class="w"> </span><span class="py">vector</span><span class="err">.</span><span class="py">search</span><span class="p">({</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">index</span><span class="p">:</span><span class="w"> </span><span class="err">'</span><span class="nc">product_embeddings</span><span class="err">'</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="kd">query</span><span class="p">:</span><span class="w"> </span><span class="nc">item</span><span class="err">.</span><span class="nc">embedding</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">k</span><span class="p">:</span><span class="w"> </span><span class="nc">50</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">YIELD</span><span class="w"> </span><span class="py">node</span><span class="w"> </span><span class="py">AS</span><span class="w"> </span><span class="py">similar_product</span><span class="p">,</span><span class="w"> </span><span class="py">similarity</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">WHERE</span><span class="w"> </span><span class="py">similar_product</span><span class="err">.</span><span class="py">id</span><span class="w"> </span><span class="err"><></span><span class="w"> </span><span class="nv">$productId</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">AND</span><span class="w"> </span><span class="py">similar_product</span><span class="err">.</span><span class="py">in_stock</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="py">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">RETURN</span><span class="w"> </span><span class="py">similar_product</span><span class="err">.</span><span class="py">name</span><span class="p">,</span><span class="w"> </span><span class="py">similarity</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">ORDER</span><span class="w"> </span><span class="py">BY</span><span class="w"> </span><span class="py">similarity</span><span class="w"> </span><span class="py">DESC</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">LIMIT</span><span class="w"> </span><span class="py">10</span><span class="err">;</span><span class="w">
</span></span></span></code></pre></div>
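<p>The query above asks the index for 50 candidates but returns only 10, because the stock and self-match filters run after the search and discard some rows. A minimal Python sketch of that over-fetch-then-filter step follows; the function name and the sample products are hypothetical, and the candidate list stands in for the rows yielded by the search.</p>

```python
def recommend(candidates, source_id, limit=10):
    """Post-filter ANN candidates, then keep the best `limit` of them.

    `candidates` is a list of (product, similarity) pairs.
    """
    kept = [
        (product, similarity)
        for product, similarity in candidates
        if product["id"] != source_id and product["in_stock"]
    ]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:limit]


# Hypothetical candidates; the first row is the source product itself.
candidates = [
    ({"id": "p1", "name": "Trail shoe", "in_stock": True}, 1.00),
    ({"id": "p2", "name": "Road shoe", "in_stock": False}, 0.93),
    ({"id": "p3", "name": "Hiking boot", "in_stock": True}, 0.90),
    ({"id": "p4", "name": "Sandal", "in_stock": True}, 0.71),
]
top = recommend(candidates, source_id="p1", limit=2)
names = [product["name"] for product, _ in top]
```

<p>If the filters are very selective, fewer than <code>limit</code> rows may survive; raising <code>k</code> is the usual remedy.</p>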
<h4 id="retrieval-augmented-generation-rag" class="position-relative d-flex align-items-center group">
<span>Retrieval-Augmented Generation (RAG)</span>
</h4><p>Find relevant context for LLM prompts:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-gql" data-lang="gql"><span class="line"><span class="cl"><span class="err">--</span><span class="w"> </span><span class="py">Retrieve</span><span class="w"> </span><span class="py">relevant</span><span class="w"> </span><span class="py">context</span><span class="w"> </span><span class="py">for</span><span class="w"> </span><span class="py">RAG</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">CALL</span><span class="w"> </span><span class="py">vector</span><span class="err">.</span><span class="py">search</span><span class="p">({</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">index</span><span class="p">:</span><span class="w"> </span><span class="err">'</span><span class="nc">knowledge_base</span><span class="err">'</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="kd">query</span><span class="p">:</span><span class="w"> </span><span class="nv">$question_embedding</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nc">k</span><span class="p">:</span><span class="w"> </span><span class="nc">5</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nc">YIELD</span><span class="w"> </span><span class="py">node</span><span class="w"> </span><span class="py">AS</span><span class="w"> </span><span class="py">doc</span><span class="p">,</span><span class="w"> </span><span class="py">similarity</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">RETURN</span><span class="w"> </span><span class="py">doc</span><span class="err">.</span><span class="py">content</span><span class="w"> </span><span class="py">AS</span><span class="w"> </span><span class="py">context</span><span class="p">,</span><span class="w"> </span><span class="py">similarity</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">ORDER</span><span class="w"> </span><span class="py">BY</span><span class="w"> </span><span class="py">similarity</span><span class="w"> </span><span class="py">DESC</span><span class="err">;</span><span class="w">
</span></span></span></code></pre></div><p>Pass the returned context to your LLM to ground responses in your knowledge base.</p>
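<p>One way to hand the rows to a model is to number the passages and place them ahead of the question. The sketch below is an assumption about the application side, not part of Geode: the prompt wording, the <code>build_prompt</code> name, and the character budget are all hypothetical.</p>

```python
def build_prompt(question, rows, max_chars=2000):
    """Assemble a grounded prompt from (context, similarity) rows.

    Rows are assumed to arrive sorted by descending similarity; passages
    are added in that order until the character budget would be exceeded.
    """
    passages = []
    used = 0
    for context, _similarity in rows:
        if used + len(context) > max_chars:
            break
        passages.append(context)
        used += len(context)
    numbered = "\n".join(
        f"[{i}] {text}" for i, text in enumerate(passages, start=1)
    )
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}"
    )


# Hypothetical rows in the shape returned by the query: (context, similarity).
rows = [
    ("HNSW builds a multi-layer proximity graph.", 0.91),
    ("ef_search controls the size of the candidate list.", 0.84),
]
prompt = build_prompt("How does HNSW search work?", rows)
```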
<h4 id="anomaly-detection" class="position-relative d-flex align-items-center group">
<span>Anomaly Detection</span>
</h4><p>Identify outliers in embedding space:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-gql" data-lang="gql"><span class="line"><span class="cl"><span class="err">--</span><span class="w"> </span><span class="py">Find</span><span class="w"> </span><span class="py">anomalous</span><span class="w"> </span><span class="py">user</span><span class="w"> </span><span class="py">behavior</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">MATCH</span><span class="w"> </span><span class="p">(</span><span class="py">u</span><span class="p">:</span><span class="nc">User</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">WITH</span><span class="w"> </span><span class="py">u</span><span class="p">,</span><span class="w"> </span><span class="py">u</span><span class="err">.</span><span class="py">behavior_embedding</span><span class="w"> </span><span class="py">AS</span><span class="w"> </span><span class="py">embedding</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">CALL</span><span class="w"> </span><span class="py">vector</span><span class="err">.</span><span class="py">search</span><span class="p">({</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">index</span><span class="p">:</span><span class="w"> </span><span class="err">'</span><span class="nc">user_behavior</span><span class="err">'</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="kd">query</span><span class="p">:</span><span class="w"> </span><span class="nc">embedding</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nc">k</span><span class="p">:</span><span class="w"> </span><span class="nc">10</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">})</span><span class="w">
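</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">YIELD</span><span class="w"> </span><span class="py">node</span><span class="p">,</span><span class="w"> </span><span class="py">similarity</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">WHERE</span><span class="w"> </span><span class="py">node</span><span class="err">.</span><span class="py">id</span><span class="w"> </span><span class="err"><></span><span class="w"> </span><span class="py">u</span><span class="err">.</span><span class="py">id</span><span class="w"> </span><span class="err">--</span><span class="w"> </span><span class="py">Skip</span><span class="w"> </span><span class="py">the</span><span class="w"> </span><span class="py">self</span><span class="err">-</span><span class="py">match</span><span class="w">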
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">WITH</span><span class="w"> </span><span class="py">u</span><span class="p">,</span><span class="w"> </span><span class="py">AVG</span><span class="p">(</span><span class="py">similarity</span><span class="p">)</span><span class="w"> </span><span class="py">AS</span><span class="w"> </span><span class="py">avg_similarity</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">WHERE</span><span class="w"> </span><span class="py">avg_similarity</span><span class="w"> </span><span class="err"><</span><span class="w"> </span><span class="py">0</span><span class="mf">.5</span><span class="w"> </span><span class="err">--</span><span class="w"> </span><span class="py">Low</span><span class="w"> </span><span class="py">similarity</span><span class="w"> </span><span class="py">to</span><span class="w"> </span><span class="py">nearest</span><span class="w"> </span><span class="py">neighbors</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="py">anomaly</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">RETURN</span><span class="w"> </span><span class="py">u</span><span class="err">.</span><span class="py">id</span><span class="p">,</span><span class="w"> </span><span class="py">u</span><span class="err">.</span><span class="py">name</span><span class="p">,</span><span class="w"> </span><span class="py">avg_similarity</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">ORDER</span><span class="w"> </span><span class="py">BY</span><span class="w"> </span><span class="py">avg_similarity</span><span class="w"> </span><span class="py">ASC</span><span class="err">;</span><span class="w">
</span></span></span></code></pre></div>
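<p>A brute-force version of the same score is useful for validating the threshold on a small sample. The Python sketch below is illustrative only: it uses exact search rather than HNSW, assumes unit-length embeddings so the dot product equals cosine similarity, and leaves each user out of its own neighbor list, since a vector always matches itself with similarity 1.0. All names and data are hypothetical.</p>

```python
def dot(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def anomaly_scores(embeddings, k):
    """Mean similarity of each user to its k nearest *other* users."""
    scores = {}
    for user_id, vector in embeddings.items():
        sims = sorted(
            (
                dot(vector, other)
                for other_id, other in embeddings.items()
                if other_id != user_id  # never count the user as its own neighbor
            ),
            reverse=True,
        )
        nearest = sims[:k]
        if nearest:  # a lone user has no neighbors to average
            scores[user_id] = sum(nearest) / len(nearest)
    return scores


# Hypothetical unit-length behavior embeddings.
embeddings = {
    "a": [1.0, 0.0],
    "b": [0.8, 0.6],
    "c": [0.6, 0.8],
    "outlier": [-1.0, 0.0],
}
scores = anomaly_scores(embeddings, k=2)
anomalies = sorted(uid for uid, score in scores.items() if score < 0.5)
```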
<h4 id="image-search" class="position-relative d-flex align-items-center group">
<span>Image Search</span>
</h4><p>Find visually similar images:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-gql" data-lang="gql"><span class="line"><span class="cl"><span class="err">--</span><span class="w"> </span><span class="py">Image</span><span class="w"> </span><span class="py">similarity</span><span class="w"> </span><span class="py">using</span><span class="w"> </span><span class="py">CLIP</span><span class="w"> </span><span class="py">embeddings</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">MATCH</span><span class="w"> </span><span class="p">(</span><span class="py">img</span><span class="p">:</span><span class="nc">Image</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">WHERE</span><span class="w"> </span><span class="py">vector_similarity</span><span class="p">(</span><span class="py">img</span><span class="err">.</span><span class="py">clip_embedding</span><span class="p">,</span><span class="w"> </span><span class="nv">$query_image_embedding</span><span class="p">)</span><span class="w"> </span><span class="err">></span><span class="w"> </span><span class="py">0</span><span class="mf">.85</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">RETURN</span><span class="w"> </span><span class="py">img</span><span class="err">.</span><span class="py">url</span><span class="p">,</span><span class="w"> </span><span class="py">img</span><span class="err">.</span><span class="py">caption</span><span class="p">,</span><span class="w"> </span><span class="py">vector_similarity</span><span class="p">(</span><span class="py">img</span><span class="err">.</span><span class="py">clip_embedding</span><span class="p">,</span><span class="w"> </span><span class="nv">$query_image_embedding</span><span class="p">)</span><span class="w"> </span><span class="py">AS</span><span class="w"> </span><span class="py">similarity</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">ORDER</span><span class="w"> </span><span class="py">BY</span><span class="w"> </span><span class="py">similarity</span><span class="w"> </span><span class="py">DESC</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">LIMIT</span><span class="w"> </span><span class="py">20</span><span class="err">;</span><span class="w">
</span></span></span></code></pre></div>
<h3 id="best-practices" class="position-relative d-flex align-items-center group">
<span>Best Practices</span>
</h3>
<h4 id="choosing-hnsw-parameters" class="position-relative d-flex align-items-center group">
<span>Choosing HNSW Parameters</span>
</h4><p>Balance accuracy, speed, and memory:</p>
<p><strong>M (maximum connections per node in each layer)</strong>:</p>
<ul>
<li>Low (4-8): Faster search, lower memory, lower recall</li>
<li>Medium (12-24): Balanced (recommended for most cases)</li>
<li>High (32-64): Better recall, more memory, slightly slower</li>
</ul>
<p><strong>ef_construction</strong>:</p>
<ul>
<li>Low (50-100): Faster index build, lower quality graph</li>
<li>Medium (100-200): Balanced (recommended)</li>
<li>High (400-800): Slower build, higher quality graph</li>
</ul>
<p><strong>ef_search</strong>:</p>
<ul>
<li>Low (10-50): Faster search, lower recall</li>
<li>Medium (50-150): Balanced</li>
<li>High (200-500): Better recall, slower search</li>
<li>Can be overridden per query with the <code>ef</code> option, so each query can trade speed for accuracy as needed</li>
</ul>
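<p>A common way to choose among these settings is to measure recall against exact results on a sample of queries and take the smallest <code>ef_search</code> that meets the target. The following Python sketch shows the arithmetic; the function names and the result lists are made up for illustration and would come from your own benchmark runs.</p>

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the true top-k neighbors that the ANN search returned."""
    truth = set(exact_ids[:k])
    found = truth.intersection(approx_ids[:k])
    return len(found) / len(truth)


def pick_ef_search(results_by_ef, exact_ids, k, target_recall):
    """Return the smallest ef_search whose recall meets the target, or None."""
    for ef in sorted(results_by_ef):
        if recall_at_k(results_by_ef[ef], exact_ids, k) >= target_recall:
            return ef
    return None


# Made-up result ids for one query at three ef_search settings.
exact_ids = [1, 2, 3, 4, 5]
results_by_ef = {
    20: [1, 2, 9, 4, 8],    # 3 of 5 correct
    100: [1, 2, 3, 4, 8],   # 4 of 5 correct
    300: [1, 2, 3, 4, 5],   # all correct
}
chosen = pick_ef_search(results_by_ef, exact_ids, k=5, target_recall=0.8)
```

<p>In practice the recall would be averaged over many queries before picking a value.</p>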
<h4 id="embedding-generation" class="position-relative d-flex align-items-center group">
<span>Embedding Generation</span>
</h4><p>Use high-quality embeddings:</p>
<ol>
<li>
<p><strong>Choose an appropriate model</strong>:</p>
<ul>
<li>Text: BERT, RoBERTa, sentence-transformers, OpenAI text-embedding-3</li>
<li>Images: CLIP, ResNet, EfficientNet</li>
<li>Multimodal: CLIP, ALIGN</li>
</ul>
</li>
<li>
<p><strong>Normalize embeddings</strong>: For cosine similarity, scale each vector to unit length before storing it</p>
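<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl"><span class="c1"># `model` is assumed to be an already loaded embedding model, e.g. from sentence-transformers</span>
</span></span><span class="line"><span class="cl"><span class="n">embedding</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">encode</span><span class="p">(</span><span class="n">text</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="n">embedding</span> <span class="o">=</span> <span class="n">embedding</span> <span class="o">/</span> <span class="n">np</span><span class="o">.</span><span class="n">linalg</span><span class="o">.</span><span class="n">norm</span><span class="p">(</span><span class="n">embedding</span><span class="p">)</span>  <span class="c1"># divide by the L2 norm</span>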
</span></span></code></pre></div></li>
<li>
<p><strong>Use consistent dimensions</strong>: All vectors in an index must have the same dimensionality</p>
</li>
</ol>
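<p>The normalization and dimension checks above can be sketched without any dependencies. This is a minimal illustration, not part of any Geode client: the helper names <code>normalize</code> and <code>check_dimensions</code> are hypothetical, and the zero-vector handling is an assumption about what an ingestion pipeline would want.</p>

```python
import math


def normalize(vec):
    """Scale a vector to unit length so cosine similarity reduces to a dot product."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        # A zero vector has no direction, so it cannot be normalized.
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def check_dimensions(vectors, expected_dim):
    """Raise if any vector does not match the dimensionality of the index."""
    for i, vec in enumerate(vectors):
        if len(vec) != expected_dim:
            raise ValueError(
                f"vector {i} has {len(vec)} dimensions, expected {expected_dim}"
            )


# Hypothetical usage: validate a batch, then normalize it before inserting.
batch = [[3.0, 4.0], [1.0, 0.0]]
check_dimensions(batch, expected_dim=2)
unit_batch = [normalize(v) for v in batch]
```

<p>Running the dimension check before normalization surfaces a mismatched model or truncated vector early, rather than at insert time.</p>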
<h4 id="index-maintenance" class="position-relative d-flex align-items-center group">
<span>Index Maintenance</span>
</h4><p><strong>Incremental updates</strong>: Add vectors as needed</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-gql" data-lang="gql"><span class="line"><span class="cl"><span class="err">--</span><span class="w"> </span><span class="py">Add</span><span class="w"> </span><span class="py">new</span><span class="w"> </span><span class="py">document</span><span class="w"> </span><span class="py">to</span><span class="w"> </span><span class="py">existing</span><span class="w"> </span><span class="py">index</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">INSERT</span><span class="w"> </span><span class="p">(:</span><span class="nc">Document</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">id</span><span class="p">:</span><span class="w"> </span><span class="err">'</span><span class="nc">new</span><span class="err">-</span><span class="py">doc</span><span class="err">'</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">title</span><span class="p">:</span><span class="w"> </span><span class="err">'</span><span class="nc">New</span><span class="w"> </span><span class="py">Article</span><span class="err">'</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">embedding</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="kd">...</span><span class="p">]</span><span class="w"> </span><span class="err">--</span><span class="w"> </span><span class="nc">Automatically</span><span class="w"> </span><span class="py">added</span><span class="w"> </span><span class="py">to</span><span class="w"> </span><span class="py">index</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">})</span><span class="err">;</span><span class="w">
</span></span></span></code></pre></div><p><strong>Rebuild for better quality</strong>: Periodically rebuild the index to restore graph quality after many updates, or to apply new <code>m</code> and <code>ef_construction</code> values</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-gql" data-lang="gql"><span class="line"><span class="cl"><span class="err">--</span><span class="w"> </span><span class="py">Rebuild</span><span class="w"> </span><span class="py">index</span><span class="w"> </span><span class="py">with</span><span class="w"> </span><span class="py">new</span><span class="w"> </span><span class="py">parameters</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">DROP</span><span class="w"> </span><span class="py">INDEX</span><span class="w"> </span><span class="py">document_embeddings</span><span class="err">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">CREATE</span><span class="w"> </span><span class="py">VECTOR</span><span class="w"> </span><span class="py">INDEX</span><span class="w"> </span><span class="py">document_embeddings</span><span class="w"> </span><span class="py">FOR</span><span class="w"> </span><span class="p">(</span><span class="py">d</span><span class="p">:</span><span class="nc">Document</span><span class="p">)</span><span class="w"> </span><span class="py">ON</span><span class="w"> </span><span class="p">(</span><span class="py">d</span><span class="err">.</span><span class="py">embedding</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">OPTIONS</span><span class="w"> </span><span class="p">{</span><span class="py">m</span><span class="p">:</span><span class="w"> </span><span class="nc">20</span><span class="p">,</span><span class="w"> </span><span class="py">ef_construction</span><span class="p">:</span><span class="w"> </span><span class="nc">300</span><span class="p">}</span><span class="err">;</span><span class="w">
</span></span></span></code></pre></div>
<h4 id="query-optimization" class="position-relative d-flex align-items-center group">
<span>Query Optimization</span>
</h4><p><strong>Pre-filter when possible</strong>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-gql" data-lang="gql"><span class="line"><span class="cl"><span class="err">--</span><span class="w"> </span><span class="py">Efficient</span><span class="p">:</span><span class="w"> </span><span class="nc">Filter</span><span class="w"> </span><span class="py">before</span><span class="w"> </span><span class="py">vector</span><span class="w"> </span><span class="py">search</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">MATCH</span><span class="w"> </span><span class="p">(</span><span class="py">d</span><span class="p">:</span><span class="nc">Document</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">WHERE</span><span class="w"> </span><span class="py">d</span><span class="err">.</span><span class="py">category</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="err">'</span><span class="py">science</span><span class="err">'</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">AND</span><span class="w"> </span><span class="py">d</span><span class="err">.</span><span class="py">year</span><span class="w"> </span><span class="err">></span><span class="w"> </span><span class="py">2020</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">AND</span><span class="w"> </span><span class="py">vector_similarity</span><span class="p">(</span><span class="py">d</span><span class="err">.</span><span class="py">embedding</span><span class="p">,</span><span class="w"> </span><span class="nv">$query</span><span class="p">)</span><span class="w"> </span><span class="err">></span><span class="w"> </span><span class="py">0</span><span class="mf">.8</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">RETURN</span><span class="w"> </span><span class="py">d</span><span class="err">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="err">--</span><span class="w"> </span><span class="py">Less</span><span class="w"> </span><span class="py">efficient</span><span class="p">:</span><span class="w"> </span><span class="nc">Vector</span><span class="w"> </span><span class="py">search</span><span class="w"> </span><span class="py">then</span><span class="w"> </span><span class="py">filter</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">CALL</span><span class="w"> </span><span class="py">vector</span><span class="err">.</span><span class="py">search</span><span class="p">({</span><span class="kd">...</span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">YIELD</span><span class="w"> </span><span class="py">node</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">WHERE</span><span class="w"> </span><span class="py">node</span><span class="err">.</span><span class="py">category</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="err">'</span><span class="py">science</span><span class="err">'</span><span class="w"> </span><span class="err">--</span><span class="w"> </span><span class="py">Post</span><span class="err">-</span><span class="py">filter</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">RETURN</span><span class="w"> </span><span class="py">node</span><span class="err">;</span><span class="w">
</span></span></span></code></pre></div><p><strong>Adjust ef_search for accuracy needs</strong>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-gql" data-lang="gql"><span class="line"><span class="cl"><span class="err">--</span><span class="w"> </span><span class="py">High</span><span class="err">-</span><span class="py">recall</span><span class="w"> </span><span class="py">search</span><span class="w"> </span><span class="p">(</span><span class="py">slower</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">CALL</span><span class="w"> </span><span class="py">vector</span><span class="err">.</span><span class="py">search</span><span class="p">({</span><span class="py">index</span><span class="p">:</span><span class="w"> </span><span class="err">'</span><span class="kd">...</span><span class="err">'</span><span class="p">,</span><span class="w"> </span><span class="nc">query</span><span class="p">:</span><span class="w"> </span><span class="nv">$q</span><span class="p">,</span><span class="w"> </span><span class="nc">k</span><span class="p">:</span><span class="w"> </span><span class="nc">10</span><span class="p">,</span><span class="w"> </span><span class="py">ef</span><span class="p">:</span><span class="w"> </span><span class="nc">500</span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">YIELD</span><span class="w"> </span><span class="py">node</span><span class="p">,</span><span class="w"> </span><span class="py">similarity</span><span class="err">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="err">--</span><span class="w"> </span><span class="py">Fast</span><span class="w"> </span><span class="py">search</span><span class="w"> </span><span class="p">(</span><span class="py">may</span><span class="w"> </span><span class="py">miss</span><span class="w"> </span><span class="py">some</span><span class="w"> </span><span class="py">results</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">CALL</span><span class="w"> </span><span class="py">vector</span><span class="err">.</span><span class="py">search</span><span class="p">({</span><span class="py">index</span><span class="p">:</span><span class="w"> </span><span class="err">'</span><span class="kd">...</span><span class="err">'</span><span class="p">,</span><span class="w"> </span><span class="nc">query</span><span class="p">:</span><span class="w"> </span><span class="nv">$q</span><span class="p">,</span><span class="w"> </span><span class="nc">k</span><span class="p">:</span><span class="w"> </span><span class="nc">10</span><span class="p">,</span><span class="w"> </span><span class="py">ef</span><span class="p">:</span><span class="w"> </span><span class="nc">50</span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">YIELD</span><span class="w"> </span><span class="py">node</span><span class="p">,</span><span class="w"> </span><span class="py">similarity</span><span class="err">;</span><span class="w">
</span></span></span></code></pre></div>
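<p>Pre-filtering and post-filtering differ in more than cost: filtering after a top-k search can return fewer than k rows. The sketch below shows this with an exact brute-force search in plain Python. It does not use Geode or HNSW, and the sample documents and function names are made up for illustration.</p>

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(docs, query, k):
    """Exact top-k by cosine similarity (stand-in for an index lookup)."""
    ranked = sorted(docs, key=lambda d: cosine(d["embedding"], query), reverse=True)
    return ranked[:k]


def pre_filter_search(docs, query, k, predicate):
    # Apply the metadata predicate first, then rank only the survivors.
    return top_k([d for d in docs if predicate(d)], query, k)


def post_filter_search(docs, query, k, predicate):
    # Rank everything, keep k, then drop rows that fail the predicate.
    return [d for d in top_k(docs, query, k) if predicate(d)]


docs = [
    {"id": "a", "category": "news", "embedding": (1.0, 0.0)},
    {"id": "b", "category": "news", "embedding": (0.9, 0.1)},
    {"id": "c", "category": "science", "embedding": (0.8, 0.2)},
    {"id": "d", "category": "science", "embedding": (0.5, 0.5)},
]


def is_science(doc):
    return doc["category"] == "science"


pre = pre_filter_search(docs, (1.0, 0.0), 2, is_science)
post = post_filter_search(docs, (1.0, 0.0), 2, is_science)
```

<p>In this sample, the pre-filtered search returns the two science documents, while the post-filtered search returns nothing because both of the nearest neighbors are news items.</p>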
<h3 id="performance-characteristics" class="position-relative d-flex align-items-center group">
<span>Performance Characteristics</span>
</h3>
<h4 id="time-complexity" class="position-relative d-flex align-items-center group">
<span>Time Complexity</span>
</h4><ul>
<li><strong>Construction</strong>: O(N log N) expected</li>
<li><strong>Search</strong>: O(log N) expected</li>
<li><strong>Insertion</strong>: O(log N) expected</li>
<li><strong>Deletion</strong>: O(log N) expected</li>
</ul>
<h4 id="memory-usage" class="position-relative d-flex align-items-center group">
<span>Memory Usage</span>
</h4><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-fallback" data-lang="fallback"><span class="line"><span class="cl">Memory = N * (dimensions * 4 bytes + M * 8 bytes per layer)
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">Example (1M vectors, 768 dims, M=16):
</span></span><span class="line"><span class="cl">= 1M * (768 * 4 + 16 * 8 * 2.5 layers)
</span></span><span class="line"><span class="cl">= 1M * (3,072 + 320)
</span></span><span class="line"><span class="cl">≈ 3.4 GB
</span></span></code></pre></div>
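<p>The memory estimate above is easy to turn into a helper for capacity planning. The function below is a sketch that applies the same formula. The name <code>estimate_hnsw_memory_bytes</code> is hypothetical, and the defaults (4-byte components, 8 bytes per link, 2.5 layers on average) are taken from the worked example rather than measured from Geode.</p>

```python
def estimate_hnsw_memory_bytes(num_vectors, dimensions, m, avg_layers=2.5,
                               bytes_per_component=4, bytes_per_link=8):
    """Rough index size: raw vector storage plus graph links per vector."""
    vector_bytes = dimensions * bytes_per_component
    link_bytes = m * bytes_per_link * avg_layers
    return num_vectors * (vector_bytes + link_bytes)


# Same inputs as the worked example: 1M vectors, 768 dimensions, M = 16.
total_bytes = estimate_hnsw_memory_bytes(1_000_000, 768, 16)
total_gb = total_bytes / 1e9
```

<p>Treat the result as a lower bound for planning; real deployments add allocator and metadata overhead that this formula does not model.</p>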
<h4 id="performance-benchmarks" class="position-relative d-flex align-items-center group">
<span>Performance Benchmarks</span>
</h4><p>Typical performance (10k vectors, 10-NN):</p>
<ul>
<li><strong>Latency</strong>: 1-5ms at ~90% recall</li>
<li><strong>Recall@10</strong>: ~90% (parameter dependent)</li>
<li><strong>Notes</strong>: Performance varies with ef_search, vector dimensions, and hardware</li>
</ul>
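<p>Recall@k figures like the one above are usually obtained by comparing the approximate results with an exact (brute-force) top-k for the same query. A minimal sketch of that calculation follows. The function name and the ID lists are illustrative, and producing the two result lists is left to whatever search calls you use.</p>

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k that also appears in the approximate top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    exact_top = set(exact_ids[:k])
    if not exact_top:
        return 0.0
    found = set(approx_ids[:k]) & exact_top
    return len(found) / len(exact_top)


# Hypothetical result IDs for a single query.
approx = [1, 2, 3, 4]
exact = [1, 2, 5, 6]
score = recall_at_k(approx, exact, k=4)
```

<p>Averaging this value over a representative set of queries gives a number that can be tracked while tuning ef_search.</p>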
<h3 id="monitoring-and-troubleshooting" class="position-relative d-flex align-items-center group">
<span>Monitoring and Troubleshooting</span>
</h3>
<h4 id="index-statistics" class="position-relative d-flex align-items-center group">
<span>Index Statistics</span>
</h4><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-gql" data-lang="gql"><span class="line"><span class="cl"><span class="err">--</span><span class="w"> </span><span class="py">Check</span><span class="w"> </span><span class="py">index</span><span class="w"> </span><span class="py">statistics</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">CALL</span><span class="w"> </span><span class="py">vector</span><span class="err">.</span><span class="py">index</span><span class="err">.</span><span class="py">stats</span><span class="p">(</span><span class="err">'</span><span class="py">document_embeddings</span><span class="err">'</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">YIELD</span><span class="w"> </span><span class="py">vectors</span><span class="p">,</span><span class="w"> </span><span class="py">memory_mb</span><span class="p">,</span><span class="w"> </span><span class="py">avg_connections</span><span class="p">,</span><span class="w"> </span><span class="py">max_layer</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">RETURN</span><span class="w"> </span><span class="py">vectors</span><span class="p">,</span><span class="w"> </span><span class="py">memory_mb</span><span class="p">,</span><span class="w"> </span><span class="py">avg_connections</span><span class="p">,</span><span class="w"> </span><span class="py">max_layer</span><span class="err">;</span><span class="w">
</span></span></span></code></pre></div>
<h4 id="query-performance" class="position-relative d-flex align-items-center group">
<span>Query Performance</span>
</h4><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-gql" data-lang="gql"><span class="line"><span class="cl"><span class="err">--</span><span class="w"> </span><span class="py">Profile</span><span class="w"> </span><span class="py">vector</span><span class="w"> </span><span class="kd">query</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nc">PROFILE</span><span class="w"> </span><span class="py">CALL</span><span class="w"> </span><span class="py">vector</span><span class="err">.</span><span class="py">search</span><span class="p">({</span><span class="py">index</span><span class="p">:</span><span class="w"> </span><span class="err">'</span><span class="kd">...</span><span class="err">'</span><span class="p">,</span><span class="w"> </span><span class="nc">query</span><span class="p">:</span><span class="w"> </span><span class="nv">$q</span><span class="p">,</span><span class="w"> </span><span class="nc">k</span><span class="p">:</span><span class="w"> </span><span class="nc">10</span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">YIELD</span><span class="w"> </span><span class="py">node</span><span class="p">,</span><span class="w"> </span><span class="py">similarity</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">RETURN</span><span class="w"> </span><span class="py">node</span><span class="p">,</span><span class="w"> </span><span class="py">similarity</span><span class="err">;</span><span class="w">
</span></span></span></code></pre></div>
<h4 id="common-issues" class="position-relative d-flex align-items-center group">
<span>Common Issues</span>
</h4><p><strong>Low recall</strong>: Increase ef_search or rebuild with higher M/ef_construction</p>
<p><strong>High latency</strong>: Decrease ef_search or optimize pre-filtering</p>
<p><strong>Out of memory</strong>: Reduce M, use smaller embeddings, or partition data</p>
<p><strong>Slow indexing</strong>: Reduce ef_construction or batch inserts</p>
<h3 id="related-topics" class="position-relative d-flex align-items-center group">
<span>Related Topics</span>
</h3><ul>
<li><a
href="/tags/vector-search/"
>Vector Search</a>
- General vector search capabilities</li>
<li><a
href="/tags/machine-learning/"
>Machine Learning</a>
- ML integration</li>
<li><a
href="/tags/embeddings/"
>Embeddings</a>
- Working with embeddings</li>
<li><a
href="/tags/search/"
>Semantic Search</a>
- Semantic search applications</li>
<li><a
href="/tags/recommendations/"
>Recommendation Systems</a>
- Building recommenders</li>
<li><a
href="/tags/bm25/"
>BM25 Ranking</a>
- Traditional text search ranking</li>
</ul>
<h3 id="further-reading" class="position-relative d-flex align-items-center group">
<span>Further Reading</span>
</h3><ul>
<li><a
href="/tags/vector-search/"
>Vector Search</a>
- Complete vector search documentation</li>
<li><a
href="/tags/embeddings/"
>Embeddings</a>
- Working with embeddings</li>
<li><a
href="/docs/performance/"
>Performance</a>
- Performance optimization</li>
<li><a
href="/tags/ai/"
>AI Integration</a>
- AI and machine learning integration</li>
<li><a
href="https://arxiv.org/abs/1603.09320"
aria-label="Original HNSW Paper – opens in new window"
target="_blank" rel="noopener noreferrer"
>Original HNSW Paper
<span aria-hidden="true" class="external-icon">↗</span>
</a>
- Academic paper</li>
</ul>
<p>Geode’s HNSW implementation brings vector similarity search to graph databases, enabling AI applications that combine semantic understanding with graph relationships.</p>
<h3 id="advanced-hnsw-implementation-details" class="position-relative d-flex align-items-center group">
<span>Advanced HNSW Implementation Details</span>
</h3>
<h4 id="layer-assignment-probability" class="position-relative d-flex align-items-center group">
<span>Layer Assignment Probability</span>
</h4><p>The top layer of each node is drawn from an exponentially decaying distribution, so the fraction of nodes present at layer l or above is:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-fallback" data-lang="fallback"><span class="line"><span class="cl">P(layer ≥ l) = (1/M)^l
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">For M = 4:
</span></span><span class="line"><span class="cl">- Layer 0: 100% of nodes
</span></span><span class="line"><span class="cl">- Layer 1: 25% of nodes
</span></span><span class="line"><span class="cl">- Layer 2: 6.25% of nodes
</span></span><span class="line"><span class="cl">- Layer 3: 1.56% of nodes
</span></span></code></pre></div>
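<p>The percentages above follow from the level-sampling rule in the original HNSW paper, where a node's top layer is floor(-ln(U) * mL) with U uniform on (0, 1] and mL = 1/ln(M). The sketch below reproduces the table and checks it empirically. It assumes that normalization constant, which Geode may or may not use, and the function names are illustrative.</p>

```python
import math
import random


def fraction_at_or_above(layer, m):
    """Expected fraction of nodes present at `layer` or higher."""
    return (1.0 / m) ** layer


def sample_top_layer(m, rng):
    """Draw a node's top layer using mL = 1 / ln(M)."""
    m_l = 1.0 / math.log(m)
    u = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so log(u) is defined
    return int(-math.log(u) * m_l)


rng = random.Random(0)
levels = [sample_top_layer(4, rng) for _ in range(20000)]
# Share of sampled nodes that reach layer 1 or higher (expected: about 25%).
observed = sum(1 for level in levels if level >= 1) / len(levels)
```

<p>Because the fractions shrink geometrically, the expected number of layers grows only logarithmically with the number of vectors.</p>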
<h4 id="connection-strategy" class="position-relative d-flex align-items-center group">
<span>Connection Strategy</span>
</h4><p>Each node keeps at most:</p>
<ul>
<li><strong>M connections</strong> at each layer above 0</li>
<li><strong>2M connections</strong> at layer 0 (the denser base layer improves recall)</li>
</ul>
<p><strong>Heuristic selection</strong>: Choose neighbors that maximize navigability:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-fallback" data-lang="fallback"><span class="line"><span class="cl">score(candidate) = distance(query, candidate) - distance(query, current_best)
</span></span></code></pre></div>
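<p>For comparison, the original HNSW paper describes a neighbor-selection heuristic that keeps a candidate only if it is closer to the node being inserted than to any neighbor already selected, which spreads links across directions instead of clustering them. The sketch below implements that rule in plain Python. It is not taken from Geode's source, may differ from the scoring shown above, and uses hypothetical function names and Euclidean distance as an assumption.</p>

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_neighbors_heuristic(base, candidates, m, dist=euclidean):
    """Pick up to m diverse neighbors for `base` from `candidates`."""
    selected = []
    # Visit candidates from nearest to farthest from the base point.
    for cand in sorted(candidates, key=lambda c: dist(base, c)):
        if len(selected) >= m:
            break
        # Keep the candidate only if no selected neighbor is closer to it
        # than the base point is; otherwise it is reachable via that neighbor.
        if all(dist(base, cand) < dist(cand, s) for s in selected):
            selected.append(cand)
    return selected


base = (0.0, 0.0)
candidates = [(1.0, 0.0), (1.1, 0.0), (0.0, 1.0), (-1.0, 0.0)]
neighbors = select_neighbors_heuristic(base, candidates, m=4)
```

<p>Here the point (1.1, 0.0) is dropped even though m allows four links, because it sits right next to the already selected (1.0, 0.0).</p>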
<h3 id="query-optimization-strategies" class="position-relative d-flex align-items-center group">
<span>Query Optimization Strategies</span>
</h3>
<h4 id="adaptive-ef_search" class="position-relative d-flex align-items-center group">
<span>Adaptive ef_search</span>
</h4><p>Dynamically adjust search effort:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-gql" data-lang="gql"><span class="line"><span class="cl"><span class="err">--</span><span class="w"> </span><span class="py">High</span><span class="err">-</span><span class="py">stakes</span><span class="w"> </span><span class="kd">query</span><span class="p">:</span><span class="w"> </span><span class="nc">maximize</span><span class="w"> </span><span class="nc">recall</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">CALL</span><span class="w"> </span><span class="py">vector</span><span class="err">.</span><span class="py">search</span><span class="p">({</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">index</span><span class="p">:</span><span class="w"> </span><span class="err">'</span><span class="nc">critical_docs</span><span class="err">'</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="kd">query</span><span class="p">:</span><span class="w"> </span><span class="nv">$query</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nc">k</span><span class="p">:</span><span class="w"> </span><span class="nc">10</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nc">ef</span><span class="p">:</span><span class="w"> </span><span class="nc">500</span><span class="w"> </span><span class="err">//</span><span class="w"> </span><span class="py">Deep</span><span class="w"> </span><span class="py">search</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">YIELD</span><span class="w"> </span><span class="py">node</span><span class="p">,</span><span class="w"> </span><span class="py">similarity</span><span class="err">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="err">--</span><span class="w"> </span><span class="py">Batch</span><span class="w"> </span><span class="py">processing</span><span class="p">:</span><span class="w"> </span><span class="nc">optimize</span><span class="w"> </span><span class="py">throughput</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">CALL</span><span class="w"> </span><span class="py">vector</span><span class="err">.</span><span class="py">search</span><span class="p">({</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">index</span><span class="p">:</span><span class="w"> </span><span class="err">'</span><span class="nc">product_catalog</span><span class="err">'</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="kd">query</span><span class="p">:</span><span class="w"> </span><span class="nv">$query</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nc">k</span><span class="p">:</span><span class="w"> </span><span class="nc">10</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nc">ef</span><span class="p">:</span><span class="w"> </span><span class="nc">32</span><span class="w"> </span><span class="err">//</span><span class="w"> </span><span class="py">Fast</span><span class="w"> </span><span class="py">approximate</span><span class="w"> </span><span class="py">search</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">YIELD</span><span class="w"> </span><span class="py">node</span><span class="p">,</span><span class="w"> </span><span class="py">similarity</span><span class="err">;</span><span class="w">
</span></span></span></code></pre></div>
<h4 id="early-termination" class="position-relative d-flex align-items-center group">
<span>Early Termination</span>
</h4><p>Start with a small <code>ef</code> and widen the search only until the top result's similarity reaches a confidence threshold:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="k">async</span> <span class="k">def</span> <span class="nf">adaptive_vector_search</span><span class="p">(</span><span class="n">client</span><span class="p">,</span> <span class="n">query_emb</span><span class="p">,</span> <span class="n">min_confidence</span><span class="o">=</span><span class="mf">0.95</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">ef</span> <span class="ow">in</span> <span class="p">[</span><span class="mi">32</span><span class="p">,</span> <span class="mi">64</span><span class="p">,</span> <span class="mi">128</span><span class="p">,</span> <span class="mi">256</span><span class="p">,</span> <span class="mi">512</span><span class="p">]:</span>
</span></span><span class="line"><span class="cl">        <span class="n">results</span><span class="p">,</span> <span class="n">_</span> <span class="o">=</span> <span class="k">await</span> <span class="n">client</span><span class="o">.</span><span class="n">query</span><span class="p">(</span><span class="s2">"""
</span></span></span><span class="line"><span class="cl"><span class="s2">            CALL vector.search({
</span></span></span><span class="line"><span class="cl"><span class="s2">                index: 'embeddings',
</span></span></span><span class="line"><span class="cl"><span class="s2">                query: $query,
</span></span></span><span class="line"><span class="cl"><span class="s2">                k: 10,
</span></span></span><span class="line"><span class="cl"><span class="s2">                ef: $ef
</span></span></span><span class="line"><span class="cl"><span class="s2">            })
</span></span></span><span class="line"><span class="cl"><span class="s2">            YIELD node, similarity
</span></span></span><span class="line"><span class="cl"><span class="s2">            RETURN node, similarity
</span></span></span><span class="line"><span class="cl"><span class="s2">            ORDER BY similarity DESC
</span></span></span><span class="line"><span class="cl"><span class="s2">        """</span><span class="p">,</span> <span class="p">{</span><span class="s2">"query"</span><span class="p">:</span> <span class="n">query_emb</span><span class="p">,</span> <span class="s2">"ef"</span><span class="p">:</span> <span class="n">ef</span><span class="p">})</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">rows</span> <span class="o">=</span> <span class="n">results</span><span class="o">.</span><span class="n">rows</span>
</span></span><span class="line"><span class="cl">        <span class="c1"># Guard against an empty result set before reading the top score</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">rows</span> <span class="ow">and</span> <span class="n">rows</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="s1">'similarity'</span><span class="p">]</span> <span class="o">>=</span> <span class="n">min_confidence</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="n">rows</span>  <span class="c1"># Early termination</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">rows</span>  <span class="c1"># Max effort reached</span>
</span></span></code></pre></div>
<h3 id="index-construction-strategies" class="position-relative d-flex align-items-center group">
<span>Index Construction Strategies</span>
</h3>
<h4 id="bulk-loading-optimization" class="position-relative d-flex align-items-center group">
<span>Bulk Loading Optimization</span>
</h4><p>Write embeddings in batches, processing high-degree nodes first, then rebuild the index once the load completes:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-gql" data-lang="gql"><span class="line"><span class="cl"><span class="err">--</span><span class="w"> </span><span class="py">Sort</span><span class="w"> </span><span class="py">nodes</span><span class="w"> </span><span class="py">by</span><span class="w"> </span><span class="py">degree</span><span class="w"> </span><span class="p">(</span><span class="py">high</span><span class="err">-</span><span class="py">degree</span><span class="w"> </span><span class="py">nodes</span><span class="w"> </span><span class="py">first</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">MATCH</span><span class="w"> </span><span class="p">(</span><span class="py">n</span><span class="p">:</span><span class="nc">Node</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">WITH</span><span class="w"> </span><span class="py">n</span><span class="p">,</span><span class="w"> </span><span class="py">SIZE</span><span class="p">((</span><span class="py">n</span><span class="p">)</span><span class="err">-</span><span class="p">[:</span><span class="nc">RELATED</span><span class="p">]</span><span class="err">-</span><span class="p">())</span><span class="w"> </span><span class="py">AS</span><span class="w"> </span><span class="py">degree</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">ORDER</span><span class="w"> </span><span class="py">BY</span><span class="w"> </span><span class="py">degree</span><span class="w"> </span><span class="py">DESC</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="err">--</span><span class="w"> </span><span class="py">Insert</span><span class="w"> </span><span class="py">in</span><span class="w"> </span><span class="py">batches</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">CALL</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">WITH</span><span class="w"> </span><span class="py">n</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">SET</span><span class="w"> </span><span class="py">n</span><span class="err">.</span><span class="py">embedding</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="nv">$computed_embedding</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w"> </span><span class="py">IN</span><span class="w"> </span><span class="py">TRANSACTIONS</span><span class="w"> </span><span class="py">OF</span><span class="w"> </span><span class="py">10000</span><span class="w"> </span><span class="py">ROWS</span><span class="err">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="err">--</span><span class="w"> </span><span class="py">Rebuild</span><span class="w"> </span><span class="py">index</span><span class="w"> </span><span class="py">after</span><span class="w"> </span><span class="py">bulk</span><span class="w"> </span><span class="py">load</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">DROP</span><span class="w"> </span><span class="py">INDEX</span><span class="w"> </span><span class="py">embeddings_idx</span><span class="err">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">CREATE</span><span class="w"> </span><span class="py">VECTOR</span><span class="w"> </span><span class="py">INDEX</span><span class="w"> </span><span class="py">embeddings_idx</span><span class="w"> </span><span class="py">FOR</span><span class="w"> </span><span class="p">(</span><span class="py">n</span><span class="p">:</span><span class="nc">Node</span><span class="p">)</span><span class="w"> </span><span class="py">ON</span><span class="w"> </span><span class="p">(</span><span class="py">n</span><span class="err">.</span><span class="py">embedding</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">OPTIONS</span><span class="w"> </span><span class="p">{</span><span class="py">m</span><span class="p">:</span><span class="w"> </span><span class="nc">16</span><span class="p">,</span><span class="w"> </span><span class="py">ef_construction</span><span class="p">:</span><span class="w"> </span><span class="nc">200</span><span class="p">}</span><span class="err">;</span><span class="w">
</span></span></span></code></pre></div>
<h4 id="incremental-index-maintenance" class="position-relative d-flex align-items-center group">
<span>Incremental Index Maintenance</span>
</h4><p>Track how much the index has grown since the last rebuild, and rebuild once inserts exceed 10% of the stored vectors:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-gql" data-lang="gql"><span class="line"><span class="cl"><span class="err">--</span><span class="w"> </span><span class="py">Track</span><span class="w"> </span><span class="py">index</span><span class="w"> </span><span class="py">quality</span><span class="w"> </span><span class="py">metric</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">MATCH</span><span class="w"> </span><span class="p">(</span><span class="py">stats</span><span class="p">:</span><span class="nc">IndexStats</span><span class="w"> </span><span class="p">{</span><span class="py">index_name</span><span class="p">:</span><span class="w"> </span><span class="err">'</span><span class="nc">embeddings_idx</span><span class="err">'</span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">WITH</span><span class="w"> </span><span class="py">stats</span><span class="err">.</span><span class="py">inserts_since_rebuild</span><span class="w"> </span><span class="py">AS</span><span class="w"> </span><span class="py">inserts</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">stats</span><span class="err">.</span><span class="py">total_vectors</span><span class="w"> </span><span class="py">AS</span><span class="w"> </span><span class="py">total</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">WITH</span><span class="w"> </span><span class="py">inserts</span><span class="w"> </span><span class="err">*</span><span class="w"> </span><span class="py">1</span><span class="mf">.0</span><span class="w"> </span><span class="err">/</span><span class="w"> </span><span class="py">total</span><span class="w"> </span><span class="py">AS</span><span class="w"> </span><span class="py">insert_ratio</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">WHERE</span><span class="w"> </span><span class="py">insert_ratio</span><span class="w"> </span><span class="err">></span><span class="w"> </span><span class="py">0</span><span class="mf">.1</span><span class="w"> </span><span class="err">//</span><span class="w"> </span><span class="py">10</span><span class="err">%</span><span class="w"> </span><span class="py">growth</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="err">--</span><span class="w"> </span><span class="py">Trigger</span><span class="w"> </span><span class="py">rebuild</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">CALL</span><span class="w"> </span><span class="py">vector</span><span class="err">.</span><span class="py">index</span><span class="err">.</span><span class="py">rebuild</span><span class="p">(</span><span class="err">'</span><span class="py">embeddings_idx</span><span class="err">'</span><span class="p">,</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">m</span><span class="p">:</span><span class="w"> </span><span class="nc">20</span><span class="p">,</span><span class="w"> </span><span class="err">//</span><span class="w"> </span><span class="py">Increase</span><span class="w"> </span><span class="py">connections</span><span class="w"> </span><span class="py">for</span><span class="w"> </span><span class="py">larger</span><span class="w"> </span><span class="py">graph</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">ef_construction</span><span class="p">:</span><span class="w"> </span><span class="nc">300</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">})</span><span class="err">;</span><span class="w">
</span></span></span></code></pre></div>
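<p>The rebuild condition in the query above can also be checked client-side before issuing the call. The following is a minimal sketch, not part of Geode's API: the function name <code>should_rebuild</code> and its parameters are hypothetical, and the default threshold simply mirrors the 10% figure used in the query.</p>

```python
def should_rebuild(inserts_since_rebuild: int, total_vectors: int,
                   threshold: float = 0.1) -> bool:
    """Return True when inserts since the last rebuild exceed `threshold`
    as a fraction of all stored vectors (hypothetical helper)."""
    if total_vectors <= 0:
        return False  # Empty index: nothing to rebuild
    insert_ratio = inserts_since_rebuild / total_vectors
    return insert_ratio > threshold


# Example: 150,000 inserts into a 1,000,000-vector index is 15% growth
needs_rebuild = should_rebuild(150_000, 1_000_000)
```

<p>Guarding against an empty index avoids a division by zero, which the GQL version would hit if <code>total_vectors</code> were 0.</p>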
<h3 id="distance-metrics-deep-dive" class="position-relative d-flex align-items-center group">
<span>Distance Metrics Deep Dive</span>
</h3>
<h4 id="cosine-similarity" class="position-relative d-flex align-items-center group">
<span>Cosine Similarity</span>
</h4><p>Compares the direction of two vectors and ignores their magnitude, so it suits embeddings whose length carries no meaning:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-fallback" data-lang="fallback"><span class="line"><span class="cl">cosine(u, v) = (u · v) / (||u|| × ||v||)
</span></span><span class="line"><span class="cl"> = Σ(ui × vi) / (sqrt(Σui²) × sqrt(Σvi²))
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">Range: [-1, 1]
</span></span><span class="line"><span class="cl">- 1: Identical direction
</span></span><span class="line"><span class="cl">- 0: Orthogonal
</span></span><span class="line"><span class="cl">- -1: Opposite direction
</span></span></code></pre></div><p><strong>Optimization</strong>: Pre-normalize vectors to unit length, then use dot product:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-gql" data-lang="gql"><span class="line"><span class="cl"><span class="err">--</span><span class="w"> </span><span class="py">Pre</span><span class="err">-</span><span class="py">normalize</span><span class="w"> </span><span class="py">at</span><span class="w"> </span><span class="py">insertion</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">MATCH</span><span class="w"> </span><span class="p">(</span><span class="py">n</span><span class="p">:</span><span class="nc">Node</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">SET</span><span class="w"> </span><span class="py">n</span><span class="err">.</span><span class="py">embedding</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="py">vector</span><span class="err">.</span><span class="py">normalize</span><span class="p">(</span><span class="py">n</span><span class="err">.</span><span class="py">embedding</span><span class="p">)</span><span class="err">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="err">--</span><span class="w"> </span><span class="py">Use</span><span class="w"> </span><span class="py">dot</span><span class="w"> </span><span class="py">product</span><span class="w"> </span><span class="p">(</span><span class="py">equivalent</span><span class="w"> </span><span class="py">to</span><span class="w"> </span><span class="py">cosine</span><span class="w"> </span><span class="py">for</span><span class="w"> </span><span class="py">normalized</span><span class="w"> </span><span class="py">vectors</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">CALL</span><span class="w"> </span><span class="py">vector</span><span class="err">.</span><span class="py">search</span><span class="p">({</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">index</span><span class="p">:</span><span class="w"> </span><span class="err">'</span><span class="nc">normalized_embeddings</span><span class="err">'</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="kd">query</span><span class="p">:</span><span class="w"> </span><span class="nc">vector</span><span class="err">.</span><span class="nc">normalize</span><span class="p">(</span><span class="nv">$query</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="py">metric</span><span class="p">:</span><span class="w"> </span><span class="err">'</span><span class="nc">dot_product</span><span class="err">'</span><span class="w"> </span><span class="err">//</span><span class="w"> </span><span class="py">Faster</span><span class="w"> </span><span class="py">than</span><span class="w"> </span><span class="py">cosine</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">})</span><span class="err">;</span><span class="w">
</span></span></span></code></pre></div>
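<p>The equivalence between cosine similarity and the dot product of unit-length vectors can be checked numerically. This is a plain-Python sketch for illustration only: the helpers <code>dot</code>, <code>norm</code>, <code>normalize</code>, and <code>cosine</code> are hypothetical stand-ins and are not Geode's <code>vector.normalize</code> implementation.</p>

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Euclidean length of a vector."""
    return math.sqrt(dot(u, u))


def normalize(u):
    """Scale a vector to unit length."""
    n = norm(u)
    if n == 0:
        raise ValueError("cannot normalize the zero vector")
    return [a / n for a in u]


def cosine(u, v):
    """Cosine similarity: (u . v) / (||u|| * ||v||)."""
    return dot(u, v) / (norm(u) * norm(v))


u = [3.0, 4.0, 0.0]
v = [1.0, 2.0, 2.0]

cos_uv = cosine(u, v)                       # two norms and a division per comparison
dot_unit = dot(normalize(u), normalize(v))  # norms paid once, at insertion time
```

<p>Both values agree up to floating-point rounding. The saving comes from moving the norm computation out of every comparison and into a one-time step when the vector is stored.</p>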
<h4 id="euclidean-distance-l2" class="position-relative d-flex align-items-center group">
<span>Euclidean Distance (L2)</span>
</h4><p>Measures the straight-line distance between two vectors:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-fallback" data-lang="fallback"><span class="line"><span class="cl">euclidean(u, v) = sqrt(Σ(ui - vi)²)
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">Properties:
</span></span><span class="line"><span class="cl">- Sensitive to magnitude
</span></span><span class="line"><span class="cl">- Triangle inequality holds
</span></span><span class="line"><span class="cl">- Metric space properties
</span></span></code></pre></div>
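<p>A short Python sketch of the formula above, showing the two properties it lists. The function name <code>euclidean</code> and the sample vectors are illustrative assumptions; the standard library's <code>math.dist</code> computes the same value.</p>

```python
import math


def euclidean(u, v):
    """Euclidean (L2) distance: sqrt of the sum of squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


u = [1.0, 0.0]
v = [2.0, 0.0]  # same direction as u, twice the length
w = [0.0, 1.0]

# Sensitive to magnitude: u and v point the same way, yet are 1.0 apart
d_uv = euclidean(u, v)

# Triangle inequality: going v -> w directly is never longer than v -> u -> w
d_uw = euclidean(u, w)
d_vw = euclidean(v, w)
triangle_holds = d_vw <= d_uv + d_uw
```

<p>Cosine similarity would treat <code>u</code> and <code>v</code> as identical, which is why Euclidean distance is the better fit when vector length carries information.</p>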
<h4 id="manhattan-distance-l1" class="position-relative d-flex align-items-center group">
<span>Manhattan Distance (L1)</span>
</h4><p>Sum of absolute differences:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-fallback" data-lang="fallback"><span class="line"><span class="cl">manhattan(u, v) = Σ|ui - vi|
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">Use cases:
</span></span><span class="line"><span class="cl">- Sparse vectors
</span></span><span class="line"><span class="cl">- Grid-based distances
</span></span><span class="line"><span class="cl">- Outlier-robust similarity
</span></span></code></pre></div>
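<p>The outlier-robustness claim can be illustrated by ranking two candidates under both metrics. This is a sketch with made-up vectors; the function names are hypothetical and not part of Geode.</p>

```python
import math


def manhattan(u, v):
    """Manhattan (L1) distance: sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(u, v))


def euclidean(u, v):
    """Euclidean (L2) distance, for comparison."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


query = [0.0, 0.0, 0.0, 0.0]
spread = [2.0, 2.0, 2.0, 2.0]  # small difference in every dimension
spike = [0.0, 0.0, 0.0, 6.0]   # exact match except one outlier dimension

l1_spread, l1_spike = manhattan(query, spread), manhattan(query, spike)
l2_spread, l2_spike = euclidean(query, spread), euclidean(query, spike)

# L1 ranks the spiked vector closer; L2 ranks the spread vector closer
l1_prefers_spike = l1_spike < l1_spread
l2_prefers_spread = l2_spread < l2_spike
```

<p>Because L2 squares each difference, a single large deviation dominates the total. L1 grows only linearly with that deviation, so one noisy dimension has less influence on the ranking.</p>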
<h3 id="production-deployment-patterns" class="position-relative d-flex align-items-center group">
<span>Production Deployment Patterns</span>
</h3>
<h4 id="multi-index-strategy" class="position-relative d-flex align-items-center group">
<span>Multi-Index Strategy</span>
</h4><p>Create a separate index for each embedding type so that each can use its own dimensions, metric, and <code>m</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-gql" data-lang="gql"><span class="line"><span class="cl"><span class="err">--</span><span class="w"> </span><span class="py">Text</span><span class="w"> </span><span class="py">embeddings</span><span class="w"> </span><span class="p">(</span><span class="py">768d</span><span class="p">,</span><span class="w"> </span><span class="py">BERT</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">CREATE</span><span class="w"> </span><span class="py">VECTOR</span><span class="w"> </span><span class="py">INDEX</span><span class="w"> </span><span class="py">text_embeddings</span><span class="w"> </span><span class="py">FOR</span><span class="w"> </span><span class="p">(</span><span class="py">d</span><span class="p">:</span><span class="nc">Document</span><span class="p">)</span><span class="w"> </span><span class="py">ON</span><span class="w"> </span><span class="p">(</span><span class="py">d</span><span class="err">.</span><span class="py">text_embedding</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">OPTIONS</span><span class="w"> </span><span class="p">{</span><span class="py">dimensions</span><span class="p">:</span><span class="w"> </span><span class="nc">768</span><span class="p">,</span><span class="w"> </span><span class="py">metric</span><span class="p">:</span><span class="w"> </span><span class="err">'</span><span class="nc">cosine</span><span class="err">'</span><span class="p">,</span><span class="w"> </span><span class="py">m</span><span class="p">:</span><span class="w"> </span><span class="nc">16</span><span class="p">}</span><span class="err">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="err">--</span><span class="w"> </span><span class="py">Image</span><span class="w"> </span><span class="py">embeddings</span><span class="w"> </span><span class="p">(</span><span class="py">512d</span><span class="p">,</span><span class="w"> </span><span class="py">CLIP</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">CREATE</span><span class="w"> </span><span class="py">VECTOR</span><span class="w"> </span><span class="py">INDEX</span><span class="w"> </span><span class="py">image_embeddings</span><span class="w"> </span><span class="py">FOR</span><span class="w"> </span><span class="p">(</span><span class="py">p</span><span class="p">:</span><span class="nc">Product</span><span class="p">)</span><span class="w"> </span><span class="py">ON</span><span class="w"> </span><span class="p">(</span><span class="py">p</span><span class="err">.</span><span class="py">image_embedding</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">OPTIONS</span><span class="w"> </span><span class="p">{</span><span class="py">dimensions</span><span class="p">:</span><span class="w"> </span><span class="nc">512</span><span class="p">,</span><span class="w"> </span><span class="py">metric</span><span class="p">:</span><span class="w"> </span><span class="err">'</span><span class="nc">cosine</span><span class="err">'</span><span class="p">,</span><span class="w"> </span><span class="py">m</span><span class="p">:</span><span class="w"> </span><span class="nc">20</span><span class="p">}</span><span class="err">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="err">--</span><span class="w"> </span><span class="py">User</span><span class="w"> </span><span class="py">embeddings</span><span class="w"> </span><span class="p">(</span><span class="py">128d</span><span class="p">,</span><span class="w"> </span><span class="py">custom</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">CREATE</span><span class="w"> </span><span class="py">VECTOR</span><span class="w"> </span><span class="py">INDEX</span><span class="w"> </span><span class="py">user_embeddings</span><span class="w"> </span><span class="py">FOR</span><span class="w"> </span><span class="p">(</span><span class="py">u</span><span class="p">:</span><span class="nc">User</span><span class="p">)</span><span class="w"> </span><span class="py">ON</span><span class="w"> </span><span class="p">(</span><span class="py">u</span><span class="err">.</span><span class="py">behavior_embedding</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">OPTIONS</span><span class="w"> </span><span class="p">{</span><span class="py">dimensions</span><span class="p">:</span><span class="w"> </span><span class="nc">128</span><span class="p">,</span><span class="w"> </span><span class="py">metric</span><span class="p">:</span><span class="w"> </span><span class="err">'</span><span class="nc">euclidean</span><span class="err">'</span><span class="p">,</span><span class="w"> </span><span class="py">m</span><span class="p">:</span><span class="w"> </span><span class="nc">12</span><span class="p">}</span><span class="err">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="err">--</span><span class="w"> </span><span class="py">Query</span><span class="w"> </span><span class="py">appropriate</span><span class="w"> </span><span class="py">index</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">CALL</span><span class="w"> </span><span class="py">vector</span><span class="err">.</span><span class="py">search</span><span class="p">({</span><span class="py">index</span><span class="p">:</span><span class="w"> </span><span class="err">'</span><span class="nc">text_embeddings</span><span class="err">'</span><span class="p">,</span><span class="w"> </span><span class="kd">query</span><span class="p">:</span><span class="w"> </span><span class="nv">$text_query</span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nc">YIELD</span><span class="w"> </span><span class="nc">node</span><span class="err">;</span><span class="w">
</span></span></span></code></pre></div>
<h4 id="backup-and-recovery" class="position-relative d-flex align-items-center group">
<span>Backup and Recovery</span>
</h4><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-gql" data-lang="gql"><span class="line"><span class="cl"><span class="err">--</span><span class="w"> </span><span class="py">Export</span><span class="w"> </span><span class="py">index</span><span class="w"> </span><span class="py">state</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">CALL</span><span class="w"> </span><span class="py">vector</span><span class="err">.</span><span class="py">index</span><span class="err">.</span><span class="py">export</span><span class="p">(</span><span class="err">'</span><span class="py">embeddings_idx</span><span class="err">'</span><span class="p">,</span><span class="w"> </span><span class="err">'/</span><span class="py">backup</span><span class="err">/</span><span class="py">embeddings_idx_20250124</span><span class="err">.</span><span class="py">bin</span><span class="err">'</span><span class="p">)</span><span class="err">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="err">--</span><span class="w"> </span><span class="py">Restore</span><span class="w"> </span><span class="py">from</span><span class="w"> </span><span class="py">backup</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">CALL</span><span class="w"> </span><span class="py">vector</span><span class="err">.</span><span class="py">index</span><span class="err">.</span><span class="py">import</span><span class="p">(</span><span class="err">'</span><span class="py">embeddings_idx</span><span class="err">'</span><span class="p">,</span><span class="w"> </span><span class="err">'/</span><span class="py">backup</span><span class="err">/</span><span class="py">embeddings_idx_20250124</span><span class="err">.</span><span class="py">bin</span><span class="err">'</span><span class="p">)</span><span class="err">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="err">--</span><span class="w"> </span><span class="py">Verify</span><span class="w"> </span><span class="py">index</span><span class="w"> </span><span class="py">integrity</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">CALL</span><span class="w"> </span><span class="py">vector</span><span class="err">.</span><span class="py">index</span><span class="err">.</span><span class="py">verify</span><span class="p">(</span><span class="err">'</span><span class="py">embeddings_idx</span><span class="err">'</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">YIELD</span><span class="w"> </span><span class="py">vectors</span><span class="p">,</span><span class="w"> </span><span class="py">corrupted_entries</span><span class="p">,</span><span class="w"> </span><span class="py">avg_degree</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">WHERE</span><span class="w"> </span><span class="py">corrupted_entries</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="py">0</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="py">RETURN</span><span class="w"> </span><span class="err">'</span><span class="py">Index</span><span class="w"> </span><span class="py">healthy</span><span class="err">'</span><span class="w"> </span><span class="py">AS</span><span class="w"> </span><span class="py">status</span><span class="err">;</span><span class="w">
</span></span></span></code></pre></div>
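<p>The export and verify calls above can be combined into a small guard that only writes a backup when the index reports no corruption. The following is a minimal sketch, not a definitive implementation. It assumes the same <code>client.query</code> interface used in the recall example on this page, and it assumes that procedure arguments can be bound as <code>$parameters</code>. The helper names <code>backup_path</code> and <code>export_if_healthy</code> are hypothetical.</p>

```python
from datetime import date


def backup_path(index_name: str, backup_dir: str, day: date) -> str:
    """Build a dated file name such as /backup/embeddings_idx_20250124.bin."""
    return f"{backup_dir.rstrip('/')}/{index_name}_{day:%Y%m%d}.bin"


async def export_if_healthy(client, index_name: str, backup_dir: str, day: date) -> str:
    """Verify the index first, then export it; raise if corruption is reported."""
    # Assumption: procedure arguments can be passed as $parameters.
    results, _ = await client.query(
        "CALL vector.index.verify($index) "
        "YIELD corrupted_entries RETURN corrupted_entries",
        {"index": index_name},
    )
    corrupted = results.rows[0]["corrupted_entries"]
    if corrupted != 0:
        raise RuntimeError(f"{index_name}: {corrupted} corrupted entries")

    path = backup_path(index_name, backup_dir, day)
    await client.query(
        "CALL vector.index.export($index, $path)",
        {"index": index_name, "path": path},
    )
    return path
```

<p>Verifying before exporting avoids replacing a good backup with a snapshot of a damaged index. After a restore, the same verify call can be run again before routing queries to the index.</p>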
<h3 id="benchmarking-and-performance" class="position-relative d-flex align-items-center group">
<span>Benchmarking and Performance</span>
</h3>
<h4 id="recall-measurement" class="position-relative d-flex align-items-center group">
<span>Recall Measurement</span>
</h4><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">async</span> <span class="k">def</span> <span class="nf">measure_recall</span><span class="p">(</span><span class="n">client</span><span class="p">,</span> <span class="n">test_queries</span><span class="p">,</span> <span class="n">ground_truth</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">10</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">    <span class="n">recalls</span> <span class="o">=</span> <span class="p">[]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">query_emb</span><span class="p">,</span> <span class="n">true_neighbors</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">test_queries</span><span class="p">,</span> <span class="n">ground_truth</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="c1"># HNSW approximate search</span>
</span></span><span class="line"><span class="cl">        <span class="n">results</span><span class="p">,</span> <span class="n">_</span> <span class="o">=</span> <span class="k">await</span> <span class="n">client</span><span class="o">.</span><span class="n">query</span><span class="p">(</span><span class="s2">"""
</span></span></span><span class="line"><span class="cl"><span class="s2">            CALL vector.search({
</span></span></span><span class="line"><span class="cl"><span class="s2">                index: 'test_index',
</span></span></span><span class="line"><span class="cl"><span class="s2">                query: $query,
</span></span></span><span class="line"><span class="cl"><span class="s2">                k: $k
</span></span></span><span class="line"><span class="cl"><span class="s2">            })
</span></span></span><span class="line"><span class="cl"><span class="s2">            YIELD node
</span></span></span><span class="line"><span class="cl"><span class="s2">            RETURN node.id AS id
</span></span></span><span class="line"><span class="cl"><span class="s2">        """</span><span class="p">,</span> <span class="p">{</span><span class="s2">"query"</span><span class="p">:</span> <span class="n">query_emb</span><span class="p">,</span> <span class="s2">"k"</span><span class="p">:</span> <span class="n">k</span><span class="p">})</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">retrieved</span> <span class="o">=</span> <span class="p">{</span><span class="n">row</span><span class="p">[</span><span class="s1">'id'</span><span class="p">]</span> <span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="n">results</span><span class="o">.</span><span class="n">rows</span><span class="p">}</span>
</span></span><span class="line"><span class="cl">        <span class="n">relevant</span> <span class="o">=</span> <span class="nb">set</span><span class="p">(</span><span class="n">true_neighbors</span><span class="p">[:</span><span class="n">k</span><span class="p">])</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="c1"># Fraction of the true top-k neighbors that the index returned</span>
</span></span><span class="line"><span class="cl">        <span class="n">recall</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">retrieved</span> <span class="o">&</span> <span class="n">relevant</span><span class="p">)</span> <span class="o">/</span> <span class="n">k</span>
</span></span><span class="line"><span class="cl">        <span class="n">recalls</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">recall</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">np</span><span class="o">.</span><span class="n">mean</span><span class="p">(</span><span class="n">recalls</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Illustrative results; actual values depend on dataset, dimensionality, and hardware:</span>
</span></span><span class="line"><span class="cl"><span class="c1"># ef_search=50, M=16: recall@10 ≈ 0.93, latency ≈ 2ms</span>
</span></span><span class="line"><span class="cl"><span class="c1"># ef_search=100, M=16: recall@10 ≈ 0.97, latency ≈ 5ms</span>
</span></span><span class="line"><span class="cl"><span class="c1"># ef_search=200, M=32: recall@10 ≈ 0.99, latency ≈ 15ms</span>
</span></span></code></pre></div>
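<p>The <code>ground_truth</code> argument above has to come from an exact search, since recall compares the index against the true nearest neighbors. The following is a minimal sketch of building it with a brute-force scan. It assumes Euclidean distance and an in-memory mapping of ids to embeddings; the helper names <code>exact_neighbors</code> and <code>build_ground_truth</code> are hypothetical, and the distance function should be swapped to match the metric the index was created with.</p>

```python
import heapq
import math


def exact_neighbors(query, vectors, k=10):
    """Return ids of the k vectors closest to `query` by Euclidean distance.

    `vectors` maps id -> embedding (a sequence of floats). This is a full
    scan, so it is only practical for building a test set offline.
    """
    nearest = heapq.nsmallest(
        k, vectors.items(), key=lambda item: math.dist(query, item[1])
    )
    return [vec_id for vec_id, _ in nearest]


def build_ground_truth(test_queries, vectors, k=10):
    """One list of exact neighbor ids per query, in the same order."""
    return [exact_neighbors(query, vectors, k) for query in test_queries]
```

<p>The resulting list lines up with <code>test_queries</code> by position, which is the shape <code>measure_recall</code> iterates over with <code>zip</code>.</p>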
<h4 id="throughput-vs-latency-trade-offs" class="position-relative d-flex align-items-center group">
<span>Throughput vs Latency Trade-offs</span>
</h4><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="c1"># Latency-optimized (single query)</span>
</span></span><span class="line"><span class="cl"><span class="n">config_latency</span> <span class="o">=</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="s2">"m"</span><span class="p">:</span> <span class="mi">12</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="s2">"ef_construction"</span><span class="p">:</span> <span class="mi">100</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="s2">"ef_search"</span><span class="p">:</span> <span class="mi">32</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="s2">"batch_size"</span><span class="p">:</span> <span class="mi">1</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="c1"># Latency and recall depend on dataset, dimensions, and ef_search</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Throughput-optimized (batch queries)</span>
</span></span><span class="line"><span class="cl"><span class="n">config_throughput</span> <span class="o">=</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="s2">"m"</span><span class="p">:</span> <span class="mi">16</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="s2">"ef_construction"</span><span class="p">:</span> <span class="mi">200</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="s2">"ef_search"</span><span class="p">:</span> <span class="mi">64</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="s2">"batch_size"</span><span class="p">:</span> <span class="mi">100</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="c1"># Throughput and recall depend on dataset, dimensions, and hardware</span>
</span></span></code></pre></div>
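<p>Because the numbers for either configuration depend on the dataset and hardware, it helps to measure them directly. The following is a minimal sketch of a timing harness, assuming <code>search</code> is any async callable that runs one query (for example a thin wrapper around <code>client.query</code>). The names <code>percentile</code> and <code>benchmark</code> are hypothetical, and queries are issued one at a time, so the reported throughput is a single-connection figure.</p>

```python
import math
import time


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


async def benchmark(search, queries):
    """Run queries sequentially and summarize latency and throughput.

    `search` is an async callable taking one query embedding.
    `queries` must be non-empty.
    """
    latencies_ms = []
    started = time.perf_counter()
    for query in queries:
        t0 = time.perf_counter()
        await search(query)
        latencies_ms.append((time.perf_counter() - t0) * 1000)
    elapsed = time.perf_counter() - started

    latencies_ms.sort()
    return {
        "queries": len(latencies_ms),
        "p50_ms": percentile(latencies_ms, 50),
        "p99_ms": percentile(latencies_ms, 99),
        "qps": len(latencies_ms) / elapsed,
    }
```

<p>Running the same query set against each configuration, together with the recall measurement, gives a like-for-like comparison on your own data.</p>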
<h3 id="further-reading-1" class="position-relative d-flex align-items-center group">
<span>Further Reading</span>
</h3><ul>
<li><strong>HNSW Original Paper</strong>: Malkov and Yashunin (2016), "Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs"</li>
<li><strong>Graph-Based ANN</strong>: NSW, HNSW, NSG, and DiskANN Algorithms</li>
<li><strong>Distance Metrics</strong>: Cosine, Euclidean, Inner Product, and Custom Metrics</li>
<li><strong>Index Tuning</strong>: M, ef_construction, ef_search Parameter Optimization</li>
<li><strong>Production Systems</strong>: Scaling to Billions of Vectors</li>
</ul>
<p>Browse tagged content for comprehensive HNSW and vector search documentation.</p>