The Overview and Introduction category provides high-level documentation about Geode, covering its architecture, key features, design philosophy, and how to get started. Whether you’re evaluating Geode for a project, learning about graph databases, or planning a production deployment, this category offers the foundational knowledge you need.
What is Geode?
Geode is a production-ready graph database that aligns with the ISO/IEC 39075:2024 Graph Query Language (GQL) standard via the full GQL conformance profile. Written in Zig for maximum performance and memory safety, Geode combines the power of graph computing with enterprise-grade reliability, achieving 100% GQL compliance and 97.4% test coverage across 1,644 passing tests.
Unlike traditional relational databases that struggle with connected data, or proprietary graph databases that lock you into vendor-specific query languages, Geode provides a standards-based foundation for building modern applications on graph data. The ISO GQL standard ensures your queries are portable, future-proof, and backed by international consensus on graph database best practices.
Core Architecture
Built with Zig
Geode is implemented in Zig, a modern systems programming language that prioritizes safety, performance, and developer ergonomics. This choice delivers several critical advantages:
Memory Safety: Zig’s bounds-checked slices, optional types, and safety-checked build modes help catch buffer overflows and many of the other memory corruption bugs that plague C/C++ codebases, helping keep your data secure and consistent.
Performance: Zig compiles to highly optimized machine code with zero-cost abstractions, delivering performance comparable to C while maintaining readability and safety. Geode’s query engine, transaction manager, and storage layer all benefit from Zig’s efficiency.
Cross-Platform: Zig’s advanced cross-compilation support makes Geode trivially portable across Linux, macOS, Windows, and even embedded systems without modification.
Explicit Control: Unlike garbage-collected languages, Zig gives Geode precise control over memory allocation and CPU cache usage, critical for database performance under high load.
Standards-Based Query Language
Geode implements ISO/IEC 39075:2024, the first international standard for graph query languages. GQL unifies decades of graph database research into a single, coherent syntax that feels familiar to SQL users while embracing graph-native patterns.
-- Standard GQL query
MATCH (person:Person {name: 'Alice'})-[:KNOWS*1..3]->(friend:Person)
WHERE friend.age > 25
RETURN friend.name, friend.occupation
ORDER BY friend.name
This standards alignment means:
- No vendor lock-in: Your queries work across any GQL-compliant database
- Professional support: The ISO committee maintains and evolves the language
- Future-proof: New features arrive through standardized extensions
- Interoperability: GQL integrates smoothly with SQL systems via ISO standards alignment
ACID Transactions
Geode provides full ACID (Atomicity, Consistency, Isolation, Durability) guarantees using Serializable Snapshot Isolation (SSI), which delivers serializable isolation, the strongest level defined by the SQL standard. Every transaction sees a consistent snapshot of the database, and concurrent transactions execute as if they ran sequentially.
import asyncio
import geode_client

async def transfer():
    client = geode_client.open_database("localhost:3141")
    async with client.connection() as tx:
        await tx.begin()
        # Debit the source account
        await tx.execute("""
            MATCH (account:Account {id: $from})
            SET account.balance = account.balance - $amount
        """, {"from": 123, "amount": 100.00})
        # Credit the destination account
        await tx.execute("""
            MATCH (account:Account {id: $to})
            SET account.balance = account.balance + $amount
        """, {"to": 456, "amount": 100.00})
        # Both updates commit atomically or both roll back
        await tx.commit()

asyncio.run(transfer())
Write-Ahead Logging (WAL) ensures durability by recording all changes to persistent storage before acknowledging commits. Even if the server crashes, committed transactions remain durable.
Multi-Version Concurrency Control (MVCC) enables high-performance reads without blocking writes, and writes without blocking reads. Each transaction sees a consistent snapshot while others modify the database concurrently.
QUIC Transport Protocol
Geode uses QUIC as its network protocol instead of traditional TCP. QUIC, originally developed at Google (as "Quick UDP Internet Connections") and standardized by the IETF as RFC 9000, runs over UDP and provides:
- Faster connection establishment: 0-RTT resumption for returning clients
- Better performance on lossy networks: Independent stream recovery
- Built-in encryption: TLS 1.3 mandatory, no plaintext fallback
- Multiplexing without head-of-line blocking: Multiple concurrent queries over one connection
Client libraries handle the QUIC details automatically and expose simple interfaces that are idiomatic for each language:
// Go client example
import (
	"context"
	"log"

	"geodedb.com/geode"
)

ctx := context.Background()
client, err := geode.Connect(ctx, "localhost:3141")
if err != nil {
	log.Fatal(err)
}
defer client.Close()

result, err := client.Query(ctx, "MATCH (n:Node) RETURN count(n)")
if err != nil {
	log.Fatal(err)
}
log.Printf("result: %v", result)
Key Features
Enterprise Security
Geode provides defense-in-depth security suitable for regulated industries:
- TLS 1.3 mandatory: All network communication encrypted by default
- Row-Level Security (RLS): Fine-grained access control at the data level
- Transparent Data Encryption (TDE): Encryption at rest for all database files
- Field-Level Encryption (FLE): Encrypt sensitive fields with application-controlled keys
- Audit logging: Comprehensive tracking of all database operations
Advanced Analytics
Geode excels at analytical workloads traditionally challenging for databases:
- Graph algorithms: PageRank, community detection, shortest paths, centrality measures
- Vector search: HNSW indexing for semantic similarity and embeddings
- Full-text search: BM25 ranking with Unicode normalization and language-aware stemming
- Real-time analytics: MVCC enables queries on live data without blocking transactions
- Aggregations: Statistical functions, grouping, windowing, and custom aggregates
Polyglot Client Libraries
Geode provides official client libraries for multiple languages, all following idiomatic patterns for their ecosystem:
- Go: database/sql driver with connection pooling
- Python: Async client with type hints and modern asyncio patterns
- Rust: Tokio-based async with zero-cost abstractions
- Zig: Native client demonstrating Geode’s internal APIs
Each client handles protocol complexity, connection management, and error handling transparently.
Getting Started
Installation
Geode provides multiple installation options:
# From source (requires Zig 0.1.0+)
git clone https://github.com/codeprosorg/geode
cd geode
make build
# Using Docker
docker pull codepros/geode:latest
docker run -p 3141:3141/udp codepros/geode:latest
# Using package managers
brew install geodedb/geode/geode # macOS
# apt install geode # Debian/Ubuntu
First Steps
After installation, start the server and connect with the interactive shell:
# Start server
geode serve --listen 0.0.0.0:3141
# In another terminal, connect with shell
geode shell
# Create your first graph
geode> CREATE (alice:Person {name: 'Alice', age: 30})
geode> CREATE (bob:Person {name: 'Bob', age: 25})
geode> MATCH (a:Person {name: 'Alice'}), (b:Person {name: 'Bob'})
CREATE (a)-[:KNOWS {since: 2020}]->(b)
geode> MATCH (a:Person)-[r:KNOWS]->(b:Person)
RETURN a.name, b.name, r.since
Learning Path
- Understand graph concepts: Learn nodes, relationships, properties, and labels
- Master GQL basics: Pattern matching, filtering, returning results
- Explore data modeling: Design effective graph schemas for your domain
- Add complexity: Transactions, constraints, indexes, optimization
- Deploy to production: Security, monitoring, backup, scaling
When to Use Geode
Geode excels in scenarios where relationships between entities are first-class citizens:
Social Networks: Model users, posts, comments, likes, follows, and groups naturally as graphs. Query friend-of-friend connections, recommend content, detect communities.
Fraud Detection: Identify fraud rings, money laundering patterns, and coordinated attacks by analyzing transaction networks and relationship patterns invisible in relational databases.
Knowledge Graphs: Build semantic networks connecting entities, concepts, and facts. Power intelligent search, question answering, and recommendation systems.
Network Infrastructure: Model physical networks (telecom, utilities, transportation) with nodes as locations and edges as connections. Optimize routing, identify bottlenecks, plan capacity.
Access Control: Implement complex authorization with hierarchical roles, delegated permissions, and attribute-based policies expressed as graph traversals.
Recommendation Engines: Collaborative filtering, content similarity, and hybrid approaches all leverage graph structure to suggest relevant items.
Architecture Characteristics
Geode’s architecture is designed for graph workloads:
- Storage: Memory-mapped I/O with page-level caching
- Concurrency: SSI isolation with MVCC enables high read and write parallelism
- Indexes: Six specialized index types for different access patterns
- Memory efficiency: Zig’s explicit allocators minimize overhead and fragmentation
Performance tuning options include indexes, query optimization, connection pooling, and prepared statements.
Community and Support
Geode is developed by CodePros with an active community of contributors:
- GitLab: Source code, issues, and merge requests
- Documentation: Comprehensive guides, tutorials, and API references
- Professional Support: Commercial support available for production deployments
- Contributing: Open to community contributions following evidence-based development practices
Related Topics
- Getting Started - Installation and first steps
- Architecture - Detailed system design
- GQL Reference - Complete query language documentation
- Client Libraries - Language-specific integration guides
- Security - Enterprise security features
- Performance - Optimization and tuning
- Use Cases - Real-world applications
Technical Deep Dive
Storage Architecture
Geode’s storage layer implements a custom graph-optimized format designed for efficient traversal and updates.
Property Graph Storage Model
Unlike relational databases that decompose graphs into multiple tables with expensive joins, Geode stores nodes and relationships in adjacency structures that mirror the graph’s natural topology. Each node maintains direct pointers to its relationships, enabling constant-time traversal to neighbors.
Node Structure:
- Node ID (8 bytes)
- Labels bitmap (variable)
- Property map offset (8 bytes)
- Incoming edges list offset (8 bytes)
- Outgoing edges list offset (8 bytes)
Relationship Structure:
- Relationship ID (8 bytes)
- Type ID (4 bytes)
- Source node ID (8 bytes)
- Target node ID (8 bytes)
- Property map offset (8 bytes)
This layout enables efficient pattern matching: traversing from a node to its neighbors requires following a single pointer, not joining tables.
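The adjacency idea can be illustrated with a small in-memory model. This is only a sketch: the class and function names (`Node`, `Edge`, `connect`, `neighbors`) are hypothetical, and Python object references stand in for the on-disk offsets listed above. It is not Geode's actual storage format.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)
class Node:
    node_id: int
    labels: set = field(default_factory=set)
    props: dict = field(default_factory=dict)
    out_edges: list = field(default_factory=list)  # edges leaving this node
    in_edges: list = field(default_factory=list)   # edges arriving at this node

@dataclass(eq=False)
class Edge:
    edge_id: int
    type_name: str
    source: Node
    target: Node
    props: dict = field(default_factory=dict)

def connect(edge_id, type_name, source, target, **props):
    """Create an edge and register it in both endpoints' adjacency lists."""
    edge = Edge(edge_id, type_name, source, target, props)
    source.out_edges.append(edge)
    target.in_edges.append(edge)
    return edge

def neighbors(node, type_name):
    """Follow outgoing edges of one type; no table join or global lookup."""
    return [e.target for e in node.out_edges if e.type_name == type_name]

alice = Node(1, {"Person"}, {"name": "Alice"})
bob = Node(2, {"Person"}, {"name": "Bob"})
connect(10, "KNOWS", alice, bob, since=2020)
print([n.props["name"] for n in neighbors(alice, "KNOWS")])  # ['Bob']
```

The cost of `neighbors` depends only on the number of edges attached to the starting node, not on the total size of the graph, which is the property the layout above is meant to provide.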
Write-Ahead Log (WAL)
Every transaction is recorded to a Write-Ahead Log before modifying the database. The WAL guarantees durability: even if the server crashes mid-transaction, committed changes can be replayed from the log on restart.
WAL Entry Structure:
[Transaction ID | Timestamp | Operation Type | Data | Checksum]
WAL entries are written sequentially, maximizing disk throughput. Background processes periodically checkpoint the database, allowing old WAL segments to be archived or deleted.
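To make the entry layout concrete, here is a minimal sketch of encoding and verifying a record with the fields listed above. The field widths, the little-endian byte order, and the use of CRC-32 as the checksum are assumptions chosen for illustration; the source does not specify Geode's actual binary format.

```python
import struct
import zlib

# Assumed header: transaction id (u64), timestamp in microseconds (u64),
# operation type (u8), payload length (u32). Little-endian, no padding.
HEADER = struct.Struct("<QQBI")

def encode_entry(txn_id, timestamp_us, op_type, payload):
    """Serialize one entry and append a CRC-32 of everything before it."""
    body = HEADER.pack(txn_id, timestamp_us, op_type, len(payload)) + payload
    return body + struct.pack("<I", zlib.crc32(body))

def decode_entry(buf):
    """Verify the checksum, then unpack the entry fields."""
    body = buf[:-4]
    (stored,) = struct.unpack("<I", buf[-4:])
    if zlib.crc32(body) != stored:
        raise ValueError("checksum mismatch: entry is torn or corrupted")
    txn_id, timestamp_us, op_type, length = HEADER.unpack_from(body)
    payload = body[HEADER.size:HEADER.size + length]
    return txn_id, timestamp_us, op_type, payload

entry = encode_entry(42, 1_700_000_000_000_000, 1, b"SET balance=50")
print(decode_entry(entry))
```

During recovery, a checksum failure on the last entry typically indicates a write that was interrupted by the crash, so replay can stop at the last entry that verifies.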
MVCC Implementation
Multi-Version Concurrency Control maintains multiple versions of data, allowing transactions to see consistent snapshots without blocking each other. When a transaction modifies a node, Geode creates a new version rather than overwriting:
Version Chain:
v3 (current) -> v2 (committed) -> v1 (committed) -> v0 (initial)
Each transaction sees versions committed before it started. Obsolete versions are garbage collected after all referencing transactions complete.
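The visibility rule can be sketched as a walk down the version chain. This is a simplified illustration, assuming each version carries a logical commit timestamp and each transaction a snapshot timestamp taken when it started; the names `Version` and `visible_version` are hypothetical, and uncommitted versions and garbage collection are not modeled.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Version:
    value: dict
    commit_ts: int                    # logical time at which this version committed
    prev: Optional["Version"] = None  # next-older version in the chain

def visible_version(head, snapshot_ts):
    """Return the newest version committed at or before the snapshot."""
    v = head
    while v is not None and v.commit_ts > snapshot_ts:
        v = v.prev  # too new for this snapshot, so step to an older version
    return v

v0 = Version({"balance": 100}, commit_ts=1)
v1 = Version({"balance": 80}, commit_ts=5, prev=v0)
v2 = Version({"balance": 50}, commit_ts=9, prev=v1)

# A transaction that started at logical time 6 ignores the write committed at 9.
print(visible_version(v2, snapshot_ts=6).value)  # {'balance': 80}
```

Readers never wait for writers in this scheme: a writer prepends a new version, and readers with older snapshots simply skip past it.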
Query Execution Engine
Geode’s query engine transforms GQL into optimized execution plans.
Query Pipeline
- Lexing: Break query text into tokens
- Parsing: Build Abstract Syntax Tree (AST)
- Semantic Analysis: Validate labels, properties, types
- Logical Planning: Convert AST to logical operators
- Optimization: Apply rewrite rules and cost-based optimization
- Physical Planning: Select algorithms and access methods
- Execution: Run plan and stream results
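As an illustration of the first stage only, the following sketch tokenizes a small subset of GQL with regular expressions. The token categories and keyword list are assumptions made for this example and are far smaller than what a real GQL lexer needs; the later stages (parsing, planning, execution) are not shown.

```python
import re

TOKEN_SPEC = [
    ("STRING", r"'[^']*'"),
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("PARAM",  r"\$[A-Za-z_]\w*"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("ARROW",  r"->"),
    ("PUNCT",  r"[()\[\]{}:,.\-<>=*]"),
    ("SKIP",   r"\s+"),
]
KEYWORDS = {"MATCH", "WHERE", "RETURN", "ORDER", "BY", "CREATE", "SET"}
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(query):
    """Split query text into (kind, text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(query):
        m = TOKEN_RE.match(query, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {query[pos]!r} at offset {pos}")
        pos = m.end()
        kind, text = m.lastgroup, m.group()
        if kind == "SKIP":
            continue
        if kind == "IDENT" and text.upper() in KEYWORDS:
            kind = "KEYWORD"  # reserved words are recognized case-insensitively
        tokens.append((kind, text))
    return tokens

for token in tokenize("MATCH (u:User {email: $email}) RETURN u"):
    print(token)
```

The parser would consume this token list to build the AST, which is why errors caught here are reported with a character offset rather than a grammar rule.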
Cost-Based Optimization
The optimizer estimates costs for different execution strategies using statistics about data distribution:
-- Optimizer chooses index scan over full scan
MATCH (u:User {email: $email})
RETURN u
-- Index Scan on User.email: Cost 1.5
-- Full Scan on User: Cost 150,000
-- Decision: Use index
Statistics include node counts per label, property cardinality, and relationship counts per type. The optimizer refreshes statistics periodically or on-demand.
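A toy version of this decision can be written in a few lines. The cost formulas below (one unit per node for a full scan, a fixed lookup cost plus the expected number of matching rows for an index scan) are assumptions for illustration and are not Geode's actual cost model; `LabelStats` and `choose_plan` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class LabelStats:
    node_count: int        # number of nodes carrying the label
    distinct_values: dict  # property name -> number of distinct values

def full_scan_cost(stats):
    # Every node with the label must be visited.
    return float(stats.node_count)

def index_scan_cost(stats, prop, lookup_cost=1.0):
    # Expected matches for an equality predicate, assuming uniform distribution.
    expected_rows = stats.node_count / stats.distinct_values[prop]
    return lookup_cost + expected_rows

def choose_plan(stats, prop, indexed_props):
    """Cost every applicable plan and return the cheapest as (name, cost)."""
    plans = {"full_scan": full_scan_cost(stats)}
    if prop in indexed_props:
        plans["index_scan"] = index_scan_cost(stats, prop)
    best = min(plans, key=plans.get)
    return best, plans[best]

user_stats = LabelStats(node_count=150_000, distinct_values={"email": 150_000})
print(choose_plan(user_stats, "email", indexed_props={"email"}))  # ('index_scan', 2.0)
print(choose_plan(user_stats, "email", indexed_props=set()))      # ('full_scan', 150000.0)
```

The example also shows why stale statistics matter: if `distinct_values` is badly out of date, the estimated row count and therefore the chosen plan can be wrong.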
Parallel Execution
Geode parallelizes query execution across CPU cores:
- Partition parallelism: Scan different graph regions concurrently
- Pipeline parallelism: Execute different query stages simultaneously
- Operator parallelism: Parallelize operations like hash joins and aggregations
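Partition parallelism, the first item above, can be sketched with a worker pool: each worker filters one partition and the results are concatenated. This illustrates only the structure of the technique. The partitioning scheme and function names are hypothetical, and Python threads do not provide real CPU parallelism for pure-Python work, unlike a native engine.

```python
from concurrent.futures import ThreadPoolExecutor

def scan_partition(partition, predicate):
    """Filter one partition independently of all the others."""
    return [node for node in partition if predicate(node)]

def parallel_scan(partitions, predicate, workers=4):
    """Scan all partitions concurrently and merge results in partition order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunks = pool.map(lambda p: scan_partition(p, predicate), partitions)
        return [node for chunk in chunks for node in chunk]

partitions = [
    [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}],
    [{"name": "Carol", "age": 41}],
    [{"name": "Dave", "age": 19}, {"name": "Erin", "age": 33}],
]
older = parallel_scan(partitions, lambda n: n["age"] > 25)
print([n["name"] for n in older])  # ['Alice', 'Carol', 'Erin']
```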
Network Protocol Details
QUIC Advantages Over TCP
Traditional databases use TCP, which suffers from head-of-line blocking: a lost packet stalls all multiplexed streams. QUIC solves this with independent stream recovery. Each query runs on its own stream; packet loss on one stream doesn’t delay others.
Connection Establishment
Client                     Server
|                               |
|--- Initial (HELLO) ---------->|
|<-- Handshake (cert) ----------|
|--- Handshake (finished) ----->|
|<-- 1-RTT Ready ---------------|
|                               |
|--- RUN_GQL ------------------>|
|<-- BINDINGS ------------------|
First connection: 1-RTT handshake
Returning connection: 0-RTT resumption (no handshake delay)
JSON Line Protocol
Each message is a single-line JSON object:
{"type": "RUN_GQL", "query": "MATCH (n) RETURN n", "params": {}}
{"type": "SCHEMA", "fields": [{"name": "n", "type": "node"}]}
{"type": "BINDINGS", "row": [{"id": 1, "labels": ["Person"], "props": {...}}]}
{"type": "DONE"}
Line-delimited format enables streaming: clients process rows as they arrive rather than buffering entire result sets.
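A client-side reader for this format can be sketched as a generator that yields each row as soon as its line arrives. The message shapes follow the examples above; the function name `read_result` and the use of an in-memory stream in place of a QUIC stream are assumptions for illustration.

```python
import io
import json

def read_result(stream):
    """Yield one dict per BINDINGS message, keyed by the SCHEMA field names."""
    schema = None
    for line in stream:
        line = line.strip()
        if not line:
            continue
        msg = json.loads(line)
        kind = msg["type"]
        if kind == "SCHEMA":
            schema = [f["name"] for f in msg["fields"]]
        elif kind == "BINDINGS":
            if schema is None:
                raise ValueError("BINDINGS received before SCHEMA")
            yield dict(zip(schema, msg["row"]))
        elif kind == "DONE":
            return

# Stand-in for a network stream: a file-like object of JSON lines.
sample = io.StringIO(
    '{"type": "SCHEMA", "fields": [{"name": "n", "type": "node"}]}\n'
    '{"type": "BINDINGS", "row": [{"id": 1, "labels": ["Person"], "props": {"name": "Alice"}}]}\n'
    '{"type": "BINDINGS", "row": [{"id": 2, "labels": ["Person"], "props": {"name": "Bob"}}]}\n'
    '{"type": "DONE"}\n'
)
for row in read_result(sample):
    print(row["n"]["props"]["name"])
```

Because `read_result` is a generator, memory use stays proportional to one row rather than to the whole result set.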
Deployment Scenarios
Development Deployment
Docker for Local Development
# Pull official image
docker pull codepros/geode:v0.1.3
# Run with persistent storage (QUIC runs over UDP, so publish the UDP port)
docker run -d \
  --name geode-dev \
  -p 3141:3141/udp \
  -v geode-data:/var/lib/geode \
  -e GEODE_LOG_LEVEL=debug \
  codepros/geode:v0.1.3
# Connect and develop
docker exec -it geode-dev geode shell
Configuration for Development
# geode.yaml
server:
  listen: "0.0.0.0:3141"
  tls:
    enabled: true
    cert: "/etc/geode/cert.pem"
    key: "/etc/geode/key.pem"
storage:
  data_dir: "/var/lib/geode/data"
  wal_dir: "/var/lib/geode/wal"
logging:
  level: "debug"
  output: "stdout"
performance:
  query_timeout_ms: 30000
  max_concurrent_queries: 100
Production Deployment
Kubernetes StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: geode
spec:
  serviceName: geode
  replicas: 3
  selector:
    matchLabels:
      app: geode
  template:
    metadata:
      labels:
        app: geode
    spec:
      containers:
        - name: geode
          image: codepros/geode:v0.1.3
          ports:
            - containerPort: 3141
              name: quic
              protocol: UDP  # QUIC runs over UDP; Kubernetes defaults to TCP
          volumeMounts:
            - name: data
              mountPath: /var/lib/geode
          resources:
            requests:
              memory: "8Gi"
              cpu: "2000m"
            limits:
              memory: "16Gi"
              cpu: "4000m"
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 100Gi
High Availability Configuration
# geode-ha.yaml
cluster:
  mode: "distributed"
  node_id: "node-1"
  peers:
    - "node-2.geode.svc.cluster.local:3141"
    - "node-3.geode.svc.cluster.local:3141"
replication:
  factor: 3
  sync: true
failover:
  enabled: true
  health_check_interval_ms: 5000
  leader_election_timeout_ms: 10000
Data Modeling Principles
Graph Schema Design
Unlike relational schemas with rigid table structures, graph schemas are flexible. However, good design principles still apply:
Entity as Node, Relationship as Edge
Model domain entities (people, products, locations) as nodes. Model connections (knows, purchased, located_in) as relationships.
-- Good: Clear entity/relationship distinction
(alice:Person)-[:PURCHASED]->(laptop:Product)
-- Avoid: Relationships as nodes create extra traversals
(alice:Person)-[:HAS_ORDER]->(order:Order)-[:CONTAINS]->(laptop:Product)
-- (Use this pattern only if Order has important properties)
Property Placement
Properties belong on the entity they describe:
-- Good: Properties on correct entity
(person:Person {name: 'Alice', age: 30})-[:WORKS_AT {since: 2020}]->(company:Company {name: 'Acme'})
-- Avoid: Mixing concerns
(person:Person {name: 'Alice', company_name: 'Acme'})
-- (Breaks normalization; updates to company require finding all employees)
Label Strategy
Use labels for categorization and polymorphism:
-- Multiple labels for role-based modeling
CREATE (doc:Document:Confidential:Audited {title: 'Q4 Results'})
-- Query specific subtypes
MATCH (d:Document:Confidential)
RETURN d.title
Anti-Patterns to Avoid
Anti-Pattern: Dense Nodes
Nodes with millions of relationships (super nodes) create performance bottlenecks:
-- Avoid: One "Users" node with millions of edges
(users:Users)-[:CONTAINS]->(alice:Person)
(users:Users)-[:CONTAINS]->(bob:Person)
-- Traversing from :Users is slow
-- Better: Direct node access
MATCH (p:Person {email: $email})
-- Uses index, constant time
Anti-Pattern: Relationship Properties as Nodes
-- Avoid: Unnecessary intermediate nodes
(person)-[:HAS_RATING]->(rating:Rating {value: 5})-[:FOR_PRODUCT]->(product)
-- Better: Property on relationship
(person)-[:RATED {score: 5, timestamp: ...}]->(product)
Ecosystem and Integrations
Data Integration
ETL from Relational Databases
import asyncio
import psycopg2
import geode_client

async def migrate_relational_to_graph():
    """Migrate PostgreSQL data to Geode."""
    # Extract from PostgreSQL
    pg_conn = psycopg2.connect("dbname=mydb")
    pg_cursor = pg_conn.cursor()
    pg_cursor.execute("SELECT id, name, email FROM users")
    try:
        # Load into Geode, one :User node per row
        client = geode_client.open_database("localhost:3141")
        async with client.connection() as geode:
            for user_id, name, email in pg_cursor:
                await geode.execute("""
                    CREATE (:User {
                        id: $id,
                        name: $name,
                        email: $email
                    })
                """, {"id": user_id, "name": name, "email": email})
    finally:
        # Release PostgreSQL resources even if the load fails
        pg_cursor.close()
        pg_conn.close()

asyncio.run(migrate_relational_to_graph())
Monitoring and Observability
Prometheus Metrics
# Scrape Geode metrics endpoint
curl http://localhost:9090/metrics
# Sample output:
# geode_queries_total{status="success"} 15432
# geode_query_duration_seconds_sum 23.45
# geode_active_connections 12
# geode_wal_size_bytes 1048576
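For scripted checks, the text output can be parsed without extra dependencies. The sketch below handles only the simple `name{labels} value` lines shown in the sample output; it skips comment lines and does not handle optional timestamps or escaped label values, so it is not a complete parser for the Prometheus exposition format. The function name `parse_metrics` is hypothetical.

```python
import re

LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'  # metric name
    r'(?:\{(?P<labels>[^}]*)\})?'           # optional {key="value",...}
    r'\s+(?P<value>\S+)$'                   # sample value
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')

def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from metrics text."""
    samples = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comment lines
        m = LINE_RE.match(line)
        if m is None:
            continue  # unsupported line shape in this simplified parser
        labels = dict(LABEL_RE.findall(m.group("labels") or ""))
        samples.append((m.group("name"), labels, float(m.group("value"))))
    return samples

sample_output = """\
geode_queries_total{status="success"} 15432
geode_query_duration_seconds_sum 23.45
geode_active_connections 12
geode_wal_size_bytes 1048576
"""
for name, labels, value in parse_metrics(sample_output):
    print(name, labels, value)
```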
Grafana Dashboards
Pre-built dashboards visualize:
- Query throughput and latency
- Transaction commit/abort rates
- Index usage statistics
- Memory and disk utilization
- Connection pool health
Application Frameworks
GraphQL Integration
// Express + Apollo GraphQL + Geode
const { ApolloServer, gql } = require('apollo-server-express');
const { createClient } = require('@geodedb/client');

// Shared client; resolvers await it before running queries
const geode = createClient('quic://localhost:3141');

const typeDefs = gql`
  type Person {
    name: String!
    friends: [Person]
  }
  type Query {
    person(name: String!): Person
  }
`;

const resolvers = {
  Query: {
    person: async (_, { name }) => {
      const client = await geode;
      const rows = await client.queryAll(
        'MATCH (p:Person {name: $name}) RETURN p',
        { params: { name } }
      );
      return rows[0]?.p;
    },
  },
  Person: {
    // Resolve friends lazily, using the same client and call style as above
    friends: async (person) => {
      const client = await geode;
      const rows = await client.queryAll(
        'MATCH (p:Person {id: $id})-[:KNOWS]->(f) RETURN f',
        { params: { id: person.id } }
      );
      return rows.map(r => r.f);
    },
  },
};
Comparing Geode to Alternatives
vs. Relational Databases
Strengths of Geode:
- Native graph traversals (no joins)
- Flexible schema evolution
- Natural modeling of connected data
When to use relational:
- Tabular data with few relationships
- Complex aggregations over flat data
- Regulatory requirements for SQL
vs. Proprietary Graph Databases
Geode Advantages:
- ISO standard query language (no vendor lock-in)
- Apache 2.0 license (fully open source)
- Modern architecture (QUIC, Zig)
- Smaller resource footprint
Competitor Advantages:
- Larger ecosystems and community
- More mature visualization tools
- Enterprise support contracts
Roadmap and Future Development
Upcoming in v0.1.4:
- Distributed mode with automatic sharding
- Improved query optimizer with machine learning
- Additional graph algorithm library
- Enhanced monitoring dashboards
Planned for v1.0.0:
- API stability guarantees
- Extended long-term support
- Enterprise certifications
- Additional client language support (Java, C#, JavaScript)
Contributing to Geode
Geode welcomes contributions following evidence-based development:
- Issues: Report bugs or request features on GitLab
- Testing: All code must include tests
- Documentation: Update docs for new features
- Code Review: All changes reviewed before merge
- CANARY Markers: Implementation requires governance markers
Further Reading
- ISO GQL Standard - Graph Query Language specification
- ACID Transactions - Transaction guarantees
- MVCC - Concurrency control architecture
- QUIC Protocol - Modern network transport
- Storage Engine - Storage architecture details
- Query Optimization - Optimization internals
- Schema Design - Schema design patterns
- Migration Guide - Moving from other databases
- Deployment Patterns - Deployment guide