Index Types Reference

Geode provides six specialized index families optimized for different access patterns. This reference documents all index types, their configuration options, and performance characteristics.

Index Overview

Index TypeBest ForComplexityStorage Overhead
B-treeRange queries, sortingO(log N)Low-Medium
HashExact equalityO(1)Low
Full-textText search with rankingO(log N)Medium-High
SpatialGeographic queriesO(log N)Medium
Vector (HNSW)Similarity searchO(log N)High
CompositeMulti-property queriesO(log N)Medium

B-tree Index

B-tree indexes are the default index type, providing balanced performance for range queries and equality lookups.

Syntax

CREATE INDEX [index_name] ON :Label(property) USING BTREE;
CREATE INDEX [index_name] ON :Label(property);  -- BTREE is default

Configuration Options

CREATE INDEX person_age_idx ON :Person(age) USING BTREE
  OPTIONS {
    fill_factor: 0.9,        -- Page fill factor (default: 0.9)
    unique: false,           -- Enforce uniqueness
    nulls_distinct: true     -- Treat NULLs as distinct values
  };
OptionTypeDefaultDescription
fill_factorFloat0.9Page fill percentage (0.5-1.0)
uniqueBooleanfalseEnforce unique constraint
nulls_distinctBooleantrueWhether NULL values are distinct

Supported Operations

OperationSupportedExample
Equality (=)YesWHERE p.age = 30
Range (<, >, <=, >=)YesWHERE p.age > 25
BETWEENYesWHERE p.age BETWEEN 20 AND 40
IS NULLYesWHERE p.age IS NULL
STARTS WITHYesWHERE p.name STARTS WITH 'Al'
ORDER BYYesORDER BY p.age
IN listYesWHERE p.age IN [25, 30, 35]

Performance Characteristics

Operation          | Time Complexity | Notes
-------------------|-----------------|----------------------------------
Point lookup       | O(log N)        | Tree traversal
Range scan         | O(log N + K)    | K = result count
Insert             | O(log N)        | May trigger page split
Delete             | O(log N)        | May trigger page merge
Bulk load          | O(N log N)      | Sequential insert optimization

Examples

-- Range query with ordering
CREATE INDEX person_age_idx ON :Person(age) USING BTREE;

MATCH (p:Person)
WHERE p.age >= 25 AND p.age <= 65
RETURN p.name, p.age
ORDER BY p.age DESC;

-- Prefix search
CREATE INDEX person_name_idx ON :Person(name) USING BTREE;

MATCH (p:Person)
WHERE p.name STARTS WITH 'Alex'
RETURN p.name;

-- Unique constraint
CREATE INDEX person_email_unique ON :Person(email) USING BTREE
  OPTIONS { unique: true };

Hash Index

Hash indexes provide O(1) lookup performance for exact equality queries.

Syntax

CREATE INDEX [index_name] ON :Label(property) USING HASH;

Configuration Options

CREATE INDEX user_id_idx ON :User(id) USING HASH
  OPTIONS {
    bucket_count: 1024,      -- Initial hash buckets
    load_factor: 0.75        -- Resize threshold
  };
OptionTypeDefaultDescription
bucket_countInteger1024Initial number of hash buckets
load_factorFloat0.75Resize when load exceeds this

Supported Operations

OperationSupportedExample
Equality (=)YesWHERE u.email = '[email protected]'
IN listYesWHERE u.id IN [1, 2, 3]
IS NULLYesWHERE u.phone IS NULL
Range (<, >)NoUse B-tree instead
ORDER BYNoUse B-tree instead

Performance Characteristics

Operation          | Time Complexity | Notes
-------------------|-----------------|----------------------------------
Point lookup       | O(1) average    | O(N) worst case with collisions
Insert             | O(1) average    | May trigger rehash
Delete             | O(1) average    | Tombstone-based
Bulk load          | O(N)            | Linear time

Examples

-- High-cardinality equality lookup
CREATE INDEX user_email_idx ON :User(email) USING HASH;

MATCH (u:User)
WHERE u.email = 'alice@example.com'
RETURN u;

-- UUID lookups
CREATE INDEX session_id_idx ON :Session(id) USING HASH;

MATCH (s:Session)
WHERE s.id = '550e8400-e29b-41d4-a716-446655440000'
RETURN s;

When to Use Hash vs B-tree

ScenarioRecommendation
Only equality queriesHash (faster)
Need range queriesB-tree
Need sortingB-tree
High-cardinality unique valuesHash
Low-cardinality (few distinct values)B-tree

Full-text Index

Full-text indexes enable text search with BM25 relevance ranking.

Syntax

CREATE INDEX [index_name] ON :Label(property) USING FULLTEXT;

Configuration Options

CREATE INDEX article_content_idx ON :Article(content) USING FULLTEXT
  OPTIONS {
    language: 'english',     -- Stemming language
    min_word_length: 3,      -- Minimum word length
    stop_words: 'default',   -- Stop word list
    k1: 1.2,                 -- BM25 term frequency saturation
    b: 0.75                  -- BM25 length normalization
  };
OptionTypeDefaultDescription
languageString’english'Stemming/stopwords language
min_word_lengthInteger2Minimum indexed word length
stop_wordsString‘default’Stop word list (‘default’, ’none’, or custom)
k1Float1.2BM25 term frequency saturation
bFloat0.75BM25 length normalization (0-1)

Supported Operations

OperationSupportedExample
fulltext_match()YesWHERE fulltext_match(a.content, 'graph database')
bm25_score()YesRETURN bm25_score(a.content, 'query') AS score
Boolean operatorsYes'graph AND database', 'graph OR network'
Phrase searchYes'"graph database"'
WildcardYes'graph*'
Fuzzy searchYes'grph~' (edit distance 1)

BM25 Scoring

The BM25 algorithm provides relevance ranking:

BM25(d, q) = SUM_i [ IDF(qi) * f(qi,d) * (k1+1) / (f(qi,d) + k1*(1-b+b*|d|/avgdl)) ]

Where:
- f(qi,d) = frequency of term qi in document d
- |d| = document length (word count)
- avgdl = average document length
- k1 = term frequency saturation (higher = more weight to frequency)
- b = length normalization (higher = penalize long documents more)

Examples

-- Basic full-text search
CREATE INDEX doc_content_idx ON :Document(content) USING FULLTEXT;

MATCH (d:Document)
WHERE fulltext_match(d.content, 'graph database performance')
RETURN d.title, bm25_score(d.content, 'graph database performance') AS relevance
ORDER BY relevance DESC
LIMIT 10;

-- Boolean operators
MATCH (d:Document)
WHERE fulltext_match(d.content, 'graph AND (database OR storage) NOT sql')
RETURN d.title;

-- Phrase search
MATCH (d:Document)
WHERE fulltext_match(d.content, '"property graph"')
RETURN d.title;

-- Fuzzy search (typo-tolerant)
MATCH (d:Document)
WHERE fulltext_match(d.content, 'databse~')  -- Matches "database"
RETURN d.title;

Spatial Index

Spatial indexes (R-tree) enable efficient geographic queries.

Syntax

CREATE INDEX [index_name] ON :Label(property) USING SPATIAL;

Configuration Options

CREATE INDEX location_idx ON :Store(location) USING SPATIAL
  OPTIONS {
    srid: 4326,              -- Spatial Reference ID (WGS84)
    precision: 'high'        -- Index precision
  };
OptionTypeDefaultDescription
sridInteger4326Spatial Reference System ID
precisionString‘high’Index precision (’low’, ‘medium’, ‘high’)

Supported Operations

OperationSupportedExample
geo_distance()YesWHERE geo_distance(s.location, point) < 5000
geo_within_box()YesWHERE geo_within_box(s.location, sw, ne)
geo_within_polygon()YesWHERE geo_within_polygon(s.location, polygon)
geo_contains()YesWHERE geo_contains(region, point)
geo_intersects()YesWHERE geo_intersects(area1, area2)

Performance Characteristics

Operation          | Time Complexity | Notes
-------------------|-----------------|----------------------------------
Point-in-region    | O(log N)        | R-tree traversal
k-nearest          | O(log N + K)    | Priority queue search
Range query        | O(log N + K)    | Bounding box filter
Insert             | O(log N)        | May trigger split

Examples

-- Find stores within 5km
CREATE INDEX store_location_idx ON :Store(location) USING SPATIAL;

MATCH (s:Store)
WHERE geo_distance(s.location, latlon(40.7128, -74.0060)) < 5000
RETURN s.name, geo_distance(s.location, latlon(40.7128, -74.0060)) AS meters
ORDER BY meters
LIMIT 10;

-- Bounding box query
MATCH (s:Store)
WHERE geo_within_box(s.location,
                     latlon(40.70, -74.05),   -- Southwest corner
                     latlon(40.75, -73.95))   -- Northeast corner
RETURN s.name;

-- Polygon containment
WITH polygon([
  latlon(40.70, -74.05),
  latlon(40.70, -73.95),
  latlon(40.75, -73.95),
  latlon(40.75, -74.05)
]) AS nyc_area
MATCH (s:Store)
WHERE geo_within_polygon(s.location, nyc_area)
RETURN s.name;

Vector Index (HNSW)

Vector indexes enable approximate nearest neighbor (ANN) search using HNSW (Hierarchical Navigable Small World) algorithm with SIMD acceleration.

Syntax

CREATE INDEX [index_name] ON :Label(property) USING VECTOR;

Configuration Options

CREATE INDEX embedding_idx ON :Document(embedding) USING VECTOR
  OPTIONS {
    dimensions: 512,         -- Vector dimensions (required if not inferred)
    metric: 'l2',            -- Distance metric
    m: 16,                   -- HNSW max connections per layer
    ef_construction: 200,    -- Build-time search width
    ef_search: 100           -- Query-time search width
  };
OptionTypeDefaultDescription
dimensionsInteger(inferred)Vector dimensions
metricString’l2’Distance metric: ’l2’, ‘cosine’, ‘dot’
mInteger16Max connections per node per layer
ef_constructionInteger200Construction search width (accuracy vs speed)
ef_searchInteger100Query search width (accuracy vs speed)

Distance Metrics

MetricFunctionBest For
l2vector_distance_l2()Euclidean distance (general purpose)
cosinevector_distance_cosine()Normalized vectors (text embeddings)
dotvector_distance_dot()Inner product (maximum inner product search)

Performance Characteristics

Operation          | Time Complexity | Notes
-------------------|-----------------|----------------------------------
k-NN search        | O(log N)        | Approximate, SIMD-accelerated
Insert             | O(log N)        | Builds graph structure
Delete             | O(log N)        | Marks as deleted (lazy cleanup)
Build              | O(N log N)      | Initial index construction

HNSW Tuning Guide

ParameterLow ValueHigh Value
mFaster build, less memoryBetter recall, more memory
ef_constructionFaster buildBetter index quality
ef_searchFaster queriesBetter recall

Recommended settings by use case:

Use Casemef_constructionef_search
Low latency810050
Balanced16200100
High recall32400200

Examples

-- Create vector index
CREATE INDEX doc_embedding_idx ON :Document(embedding) USING VECTOR
  OPTIONS { metric: 'cosine', m: 16, ef_search: 100 };

-- k-NN similarity search
MATCH (d:Document)
WITH d, vector_distance_cosine(d.embedding, $query_vector) AS distance
WHERE distance < 0.5
RETURN d.title, distance
ORDER BY distance
LIMIT 10;

-- Find similar documents to a given document
MATCH (source:Document {id: 123})
MATCH (d:Document)
WHERE d.id <> source.id
WITH d, vector_distance_cosine(d.embedding, source.embedding) AS similarity
RETURN d.title, similarity
ORDER BY similarity
LIMIT 5;

-- Hybrid search: vector + filter
MATCH (d:Document)
WHERE d.category = 'technical'
WITH d, vector_distance_l2(d.embedding, $query_vector) AS distance
RETURN d.title, distance
ORDER BY distance
LIMIT 10;

Composite Index

Composite indexes span multiple properties for complex query patterns.

Syntax

CREATE INDEX [index_name] ON :Label(prop1, prop2, ...) USING BTREE;

Configuration Options

Same as B-tree index.

Column Ordering

Important: Column order affects which queries can use the index.

-- Index on (lastname, firstname)
CREATE INDEX person_name_idx ON :Person(lastname, firstname);

-- Can use index:
WHERE p.lastname = 'Smith'                           -- Leading column
WHERE p.lastname = 'Smith' AND p.firstname = 'John' -- Both columns

-- Cannot use index efficiently:
WHERE p.firstname = 'John'                           -- Non-leading column only

Examples

-- Composite index for common query pattern
CREATE INDEX order_customer_date_idx ON :Order(customer_id, order_date);

-- Query using both columns (optimal)
MATCH (o:Order)
WHERE o.customer_id = 123 AND o.order_date >= date('2026-01-01')
RETURN o;

-- Query using leading column only (still uses index)
MATCH (o:Order)
WHERE o.customer_id = 123
RETURN o;

Specialized Indexes

Geode implements additional specialized indexes automatically for certain data types.

JSONB Inverted Index

Automatically created for JSONB properties to support path queries.

-- JSONB path queries use inverted index
MATCH (d:Document)
WHERE d.metadata->'author' = '"Alice"'
RETURN d;

MATCH (d:Document)
WHERE d.metadata->'tags'->0 = '"ml"'
RETURN d;

Interval Tree Index

Automatically used for temporal overlap queries.

-- Interval overlap queries
MATCH (e:Event)
WHERE e.duration OVERLAPS interval('[2026-01-01, 2026-01-31]')
RETURN e;

Patricia Trie Index

Automatically used for IP address prefix queries.

-- IP subnet containment
MATCH (s:Server)
WHERE subnet_contains('10.0.0.0/8'::subnet, s.ip)
RETURN s;

Index Management

Listing Indexes

-- Show all indexes
SHOW INDEXES;

-- Show indexes for a label
SHOW INDEXES ON :Person;

-- Show index details
SHOW INDEX person_email_idx;

Dropping Indexes

-- Drop by name
DROP INDEX person_email_idx;

-- Drop with IF EXISTS
DROP INDEX IF EXISTS person_email_idx;

Index Statistics

-- View index statistics
CALL db.index.stats('person_email_idx');

-- Update index statistics
CALL db.index.analyze('person_email_idx');

-- Rebuild index
CALL db.index.rebuild('person_email_idx');

Best Practices

1. Index Selection Guidelines

Query PatternRecommended Index
Equality on unique valuesHash
Equality on non-unique valuesB-tree or Hash
Range queriesB-tree
Text searchFull-text
Geographic queriesSpatial
Vector similarityVector (HNSW)
Multi-property queriesComposite B-tree

2. Avoid Over-Indexing

  • Each index adds write overhead (INSERT/UPDATE/DELETE)
  • Each index consumes disk space
  • Only index properties used in WHERE, ORDER BY, or JOIN conditions

3. Monitor Index Usage

-- Check if index is being used
EXPLAIN
MATCH (p:Person)
WHERE p.email = 'alice@example.com'
RETURN p;

-- Look for "uses index: <name>" in output

4. Consider Covering Indexes

Include all needed columns to avoid table lookups:

-- Composite index covers the query
CREATE INDEX person_name_email_idx ON :Person(name, email);

-- Query returns indexed columns only (no table lookup)
MATCH (p:Person)
WHERE p.name = 'Alice'
RETURN p.email;

Performance Comparison

Index TypeReadWriteMemoryBest Use Case
B-treeFastFastMediumGeneral purpose
HashFastestFastLowUnique lookups
Full-textFastSlowHighText search
SpatialFastMediumMediumGeographic
VectorFastSlowVery HighML embeddings


Last Updated: January 28, 2026 Geode Version: v0.1.3+ ISO GQL Conformance Profile: ISO/IEC 39075:2024 compliance (see conformance profile)