GQL is the ISO/IEC 39075:2024 international standard for querying property graph databases, providing a declarative, composable language designed specifically for graph data structures.
Introduction to GQL
The Graph Query Language (GQL) represents a watershed moment in database standardization. Published as ISO/IEC 39075:2024, GQL is the first international standard designed specifically for querying property graphs. Unlike SQL, which evolved from relational theory, Cypher, which originated as a single vendor's language, or SPARQL, which targets RDF data, GQL was built from the ground up as a vendor-neutral standard for property graph databases.
GQL combines the best ideas from existing graph query languages while introducing new capabilities for modern graph workloads. The language supports both read queries (MATCH) and write operations (INSERT, SET, REMOVE, DELETE), making it a complete data manipulation language. Its declarative nature means you describe what data you want, not how to retrieve it—the query optimizer handles execution strategy.
Geode targets ISO/IEC 39075:2024 compliance. See the conformance profile for scope, diagnostics, and implementation-defined behaviors.
Key Concepts
Pattern Matching
At the heart of GQL is pattern matching. The MATCH clause lets you describe graph structures using ASCII-art syntax:
MATCH (person:Person)-[:KNOWS]->(friend:Person)
WHERE person.name = 'Alice'
RETURN friend.name
This query reads naturally: “Match a Person node named Alice, connected by a KNOWS relationship to friend nodes, and return the friends’ names.”
Node and Relationship Patterns
GQL supports rich pattern syntax:
- Node patterns: (variable:Label {property: value})
- Relationship patterns: -[:TYPE]->, <-[:TYPE]-, -[:TYPE]-
- Variable-length paths: -[:KNOWS*1..3]-> (1 to 3 hops)
- Property constraints: WHERE node.age > 30
Data Manipulation
GQL provides comprehensive write operations:
// Insert new nodes
INSERT (:Person {name: 'Bob', age: 30})
// Create relationships
MATCH (a:Person {name: 'Alice'}), (b:Person {name: 'Bob'})
INSERT (a)-[:KNOWS]->(b)
// Update properties
MATCH (p:Person {name: 'Bob'})
SET p.age = 31
// Remove properties
MATCH (p:Person {name: 'Bob'})
REMOVE p.temporary_flag
// Delete nodes and relationships
MATCH (p:Person {name: 'Bob'})
DELETE p
Composability and Modularity
GQL queries compose naturally using subqueries, CTEs (Common Table Expressions), and nested patterns:
MATCH (p:Person)
WHERE EXISTS {
MATCH (p)-[:WORKS_AT]->(:Company {name: 'Acme Corp'})
}
RETURN p.name
How GQL Works in Geode
Geode’s GQL implementation is built on a sophisticated query engine that transforms GQL syntax into optimized execution plans.
Query Pipeline
- Lexical Analysis: The lexer tokenizes GQL source code into tokens (keywords, identifiers, operators)
- Parsing: The parser builds an Abstract Syntax Tree (AST) representing the query structure
- Semantic Analysis: Type checking, label validation, and constraint verification
- Query Optimization: The optimizer applies rules to generate efficient execution plans
- Execution: The runtime engine executes the plan using Geode’s storage and indexing layers
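As an illustration of the first stage, here is a toy lexer in Python. This is a didactic sketch only; Geode's actual lexer is not written this way and recognizes the full GQL token set:

```python
import re

# Token classes a GQL lexer distinguishes (simplified subset for illustration).
TOKEN_SPEC = [
    ("KEYWORD", r"\b(?:MATCH|WHERE|RETURN|INSERT|SET|DELETE)\b"),
    ("IDENT",   r"[A-Za-z_][A-Za-z0-9_]*"),
    ("NUMBER",  r"\d+(?:\.\d+)?"),
    ("STRING",  r"'[^']*'"),
    ("OP",      r"<-|->|[()\[\]{}:,.=<>-]"),
    ("WS",      r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(query: str):
    """Yield (kind, text) pairs, skipping whitespace."""
    for m in MASTER.finditer(query):
        if m.lastgroup != "WS":
            yield m.lastgroup, m.group()

tokens = list(tokenize("MATCH (p:Person) RETURN p.name"))
```

The parser would then consume this token stream to build the AST that the later stages analyze and optimize.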
Standards Compliance
Geode achieves ISO/IEC 39075:2024 compliance through:
- Complete syntax support: All GQL keywords, operators, and expressions
- Correct semantics: Exact ISO-specified behavior for edge cases
- Test validation: All 70 ISO compliance tests passing
- Error handling: ISO-compliant status codes and error messages
Performance Optimizations
Geode’s GQL engine includes advanced optimizations:
- Index-aware planning: Automatically uses indexes for WHERE clauses
- Join reordering: Optimizes multi-pattern queries
- Predicate pushdown: Filters data as early as possible
- Parallel execution: Distributes work across CPU cores
- Query caching: Reuses compiled query plans
Use Cases
Social Network Analysis
// Find friends-of-friends who share interests
MATCH (me:Person {id: $userId})-[:KNOWS]->()-[:KNOWS]->(foaf:Person)
WHERE foaf <> me
  AND NOT EXISTS {
    MATCH (me)-[:KNOWS]->(foaf)
  }
MATCH (me)-[:INTERESTED_IN]->(interest)<-[:INTERESTED_IN]-(foaf)
RETURN foaf.name, COUNT(DISTINCT interest) AS shared_interests
ORDER BY shared_interests DESC
LIMIT 10
Fraud Detection
// Detect suspicious transaction patterns
MATCH (account:Account)-[t1:TRANSACTION]->(intermediary:Account)
-[t2:TRANSACTION]->(destination:Account)
WHERE t2.timestamp - t1.timestamp < duration('PT1H')
AND t1.amount > 10000
AND account.risk_score > 0.7
RETURN account.id, destination.id, t1.amount
Knowledge Graph Queries
// Find related concepts within 3 hops
MATCH path = (start:Concept {name: 'Machine Learning'})
-[:RELATED_TO*1..3]->(related:Concept)
WHERE related.importance > 0.5
RETURN related.name, length(path) AS distance
ORDER BY distance, related.importance DESC
Recommendation Systems
// Collaborative filtering recommendations
MATCH (user:User {id: $userId})-[:RATED]->(item:Item)
<-[:RATED]-(similar:User)-[:RATED]->(recommendation:Item)
WHERE NOT EXISTS {
MATCH (user)-[:RATED]->(recommendation)
}
RETURN recommendation.title, COUNT(similar) AS score
ORDER BY score DESC
LIMIT 20
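The scoring in this query — count the co-raters behind each candidate item — can be mirrored in a few lines of plain Python to sanity-check expectations on toy data. This is illustrative only; the ratings dict stands in for :RATED edges:

```python
from collections import Counter

# Toy in-memory stand-in for :RATED edges: user -> set of rated item ids.
ratings = {
    "alice": {"book1", "book2"},
    "bob":   {"book1", "book3"},
    "carol": {"book2", "book3", "book4"},
}

def recommend(user: str, ratings: dict) -> list:
    """Items rated by users who share at least one item with `user`,
    excluding items `user` already rated, scored by co-rater count."""
    mine = ratings[user]
    scores = Counter()
    for other, items in ratings.items():
        if other == user or not (mine & items):
            continue  # skip self and users with no overlapping ratings
        for item in items - mine:
            scores[item] += 1
    return [item for item, _ in scores.most_common()]

recs = recommend("alice", ratings)  # book3 is backed by two co-raters, book4 by one
```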
Best Practices
Query Optimization
- Use indexes: Create indexes on frequently queried properties, e.g. CREATE INDEX person_name FOR (p:Person) ON (p.name)
- Filter early: Place selective WHERE clauses near MATCH patterns
- Limit results: Always use LIMIT for unbounded queries
- Profile queries: Use PROFILE to understand execution plans
Schema Design
- Label consistently: Use clear, singular noun labels (:Person, :Product)
- Type relationships: Give relationships meaningful types (:PURCHASED, :FRIEND_OF)
- Index strategically: Index properties used in WHERE, JOIN, and ORDER BY
- Normalize carefully: Balance between normalization and query performance
Transaction Management
BEGIN TRANSACTION;
MATCH (account:Account {id: $fromId})
WHERE account.balance >= $amount
SET account.balance = account.balance - $amount;
MATCH (recipient:Account {id: $toId})
SET recipient.balance = recipient.balance + $amount;
COMMIT;
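The invariant this transaction protects — debit and credit apply together, and only when the balance guard holds — can be sketched in plain Python (hypothetical in-memory accounts, not the Geode client API):

```python
class InsufficientFunds(Exception):
    pass

def transfer(accounts: dict, from_id: str, to_id: str, amount: float) -> None:
    """Debit and credit as one unit: either both updates apply or neither."""
    if accounts[from_id] < amount:
        # Mirrors the WHERE account.balance >= $amount guard above.
        raise InsufficientFunds(from_id)
    accounts[from_id] -= amount
    accounts[to_id] += amount

accounts = {"a": 100.0, "b": 0.0}
transfer(accounts, "a", "b", 40.0)
```

In the real transaction, the database's isolation guarantees play the role of the single-threaded check here.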
Error Handling
- Check for NULL values in optional patterns
- Use EXISTS for conditional logic
- Validate inputs in WHERE clauses
- Handle constraint violations gracefully
Code Examples
Basic Pattern Matching
// Find all employees in engineering department
MATCH (p:Person)-[:WORKS_IN]->(d:Department {name: 'Engineering'})
RETURN p.name, p.title
ORDER BY p.hire_date DESC
Aggregation
// Count connections by relationship type
MATCH (p:Person)-[r]->()
RETURN type(r) AS relationship_type, COUNT(*) AS count
ORDER BY count DESC
Path Queries
// Find shortest path between two people
MATCH path = shortestPath(
(start:Person {name: 'Alice'})-[:KNOWS*]-(end:Person {name: 'Bob'})
)
RETURN [node IN nodes(path) | node.name] AS path
Conditional Logic
// Use CASE for computed values
MATCH (p:Person)
RETURN p.name,
CASE
WHEN p.age < 18 THEN 'Minor'
WHEN p.age < 65 THEN 'Adult'
ELSE 'Senior'
END AS age_group
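The CASE expression behaves like an if/elif chain; a quick Python equivalent makes the boundary values easy to check:

```python
def age_group(age: int) -> str:
    """Mirror of the GQL CASE expression: first matching branch wins."""
    if age < 18:
        return "Minor"
    elif age < 65:
        return "Adult"
    return "Senior"

groups = [age_group(a) for a in (17, 18, 64, 65)]
```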
Related Topics
Explore related GQL concepts and features:
- ISO GQL Standard - ISO/IEC 39075:2024 specification details
- GQL Compliance - Standards conformance and testing
- Pattern Matching - Advanced pattern techniques
- MATCH Clause - Core query patterns
- Query Language - Language fundamentals
- Query Optimization - Performance tuning
- GQL Syntax - Syntax reference
- GQL Reference - Complete language reference
Further Reading
Documentation
- GQL Quick Reference - Syntax cheat sheet
- API Reference - Complete API documentation
- Query Optimization - Performance best practices
- GQL Tutorial Series - Step-by-step learning path
Specifications
- ISO/IEC 39075:2024 Overview - International standard details
- GQL Specification - Technical specification
Advanced Topics
- Transaction Management - ACID guarantees and isolation
- Distributed Queries - Multi-node query execution
- Security - Row-level security and query authorization
- Performance - Benchmarks and optimization
Advanced Query Patterns
Complex Traversals with Multiple Hops
Real-world graph applications often require sophisticated multi-hop traversals with conditional logic:
// Find influencers in a social network
MATCH (u:User)
WHERE COUNT {
  MATCH (u)<-[:FOLLOWS]-(follower)
  WHERE follower.verified = true
} > 1000
AND COUNT {
  MATCH (u)-[:POSTED]->(p:Post)
  WHERE p.likes > 10000
} > 5
RETURN u.name,
SIZE((u)<-[:FOLLOWS]-()) AS total_followers,
SIZE((u)-[:POSTED]->(:Post)) AS post_count
ORDER BY total_followers DESC
LIMIT 50
Recursive Pattern Matching
GQL supports recursive patterns for hierarchical data:
// Organizational hierarchy traversal
MATCH path = (employee:Person {id: $employeeId})
-[:REPORTS_TO*0..]->(manager:Person)
WHERE manager.role = 'CEO'
RETURN [n IN nodes(path) | {
name: n.name,
title: n.title,
level: length(path) - indexOf(nodes(path), n)
}] AS hierarchy
Graph Algorithms with GQL
Implement graph algorithms using declarative GQL:
// Single-iteration PageRank-style influence score
// (uniform initial rank; a full PageRank iterates to convergence)
MATCH (p:Page)
OPTIONAL MATCH (src:Page)-[:LINKS_TO]->(p)
WITH p, src, SIZE((src)-[:LINKS_TO]->()) AS out_links
WITH p, 0.15 + 0.85 * SUM(
  CASE WHEN out_links > 0
  THEN 1.0 / out_links
  ELSE 0
  END
) AS page_rank
WHERE page_rank > 0.5
RETURN p.url, page_rank
ORDER BY page_rank DESC
LIMIT 100
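For intuition, one iteration of the same link-weighted scoring can be computed in plain Python (a simplified sketch; real PageRank iterates to convergence and normalizes):

```python
def one_iteration(links: dict, damping: float = 0.85) -> dict:
    """links maps page -> list of pages it links to.
    Each page scores (1 - d) plus d times the summed share of every
    page linking to it, assuming a uniform initial rank of 1.0."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    score = {p: 1 - damping for p in pages}
    for src, targets in links.items():
        if not targets:
            continue  # dangling page: contributes nothing this iteration
        share = damping * 1.0 / len(targets)  # src splits its rank evenly
        for t in targets:
            score[t] += share
    return score

ranks = one_iteration({"a": ["b", "c"], "b": ["c"], "c": []})
```

Here "c" collects shares from both "a" and "b", so it outranks them, matching what the declarative query would surface.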
Performance Optimization Deep Dive
Index Strategy
Choosing the right indexes dramatically impacts query performance:
// Multi-column composite index for complex queries
CREATE INDEX user_activity_idx
FOR (u:User)
ON (u.country, u.status, u.created_at);
// Covering index includes all needed properties
CREATE INDEX product_search_idx
FOR (p:Product)
ON (p.category, p.price, p.availability)
INCLUDE (p.name, p.description);
// Full-text search index for text queries
CREATE TEXT INDEX article_content_idx
FOR (a:Article)
ON (a.title, a.body)
OPTIONS {
analyzer: 'english',
min_token_length: 3
};
Query Rewriting for Performance
Transform queries to leverage indexes effectively:
// SLOW: Property filter in WHERE clause
MATCH (u:User)-[:PURCHASED]->(p:Product)
WHERE u.country = 'USA'
RETURN p.name, count(u) AS buyers;
// FAST: Property filter in MATCH pattern
MATCH (u:User {country: 'USA'})-[:PURCHASED]->(p:Product)
RETURN p.name, count(u) AS buyers;
// EVEN FASTER: Use index hints
MATCH (u:User {country: 'USA'})-[:PURCHASED]->(p:Product)
USING INDEX user_activity_idx
RETURN p.name, count(u) AS buyers;
Batch Operations
Optimize bulk data operations:
// Batch insert with UNWIND
UNWIND $batch AS item
CREATE (p:Product {
id: item.id,
name: item.name,
price: item.price,
created_at: timestamp()
});
// Batch update with UNWIND
UNWIND $updates AS update
MATCH (u:User {id: update.id})
SET u.last_login = update.timestamp;
// Parallel batch processing
CALL {
UNWIND range(0, $total_batches - 1) AS batch_num
WITH batch_num, batch_num * $batch_size AS offset
MATCH (n:Node)
SKIP offset
LIMIT $batch_size
SET n.processed = true
} IN TRANSACTIONS OF 1000 ROWS;
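The $batch parameter for an UNWIND insert is typically assembled client-side; a small chunking helper (pure Python, hypothetical usage) shows the idea:

```python
from itertools import islice

def chunks(items, size: int):
    """Split an iterable into lists of at most `size` elements,
    each suitable as the $batch parameter of one UNWIND insert."""
    it = iter(items)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block

batches = list(chunks(range(7), 3))
```

Each yielded list would be sent as one parameterized statement, keeping individual transactions bounded in size.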
Advanced Features and Extensions
Temporal Queries
GQL supports temporal data types and operations:
// Time-based filtering
MATCH (event:Event)
WHERE event.timestamp >= datetime('2025-01-01T00:00:00Z')
AND event.timestamp < datetime('2025-02-01T00:00:00Z')
RETURN event.type, count(*) AS occurrences
ORDER BY occurrences DESC;
// Duration calculations
MATCH (session:Session)
WITH session,
duration.between(session.start_time, session.end_time) AS session_length
WHERE session_length > duration('PT30M')
RETURN session.user_id, session_length
ORDER BY session_length DESC;
// Date arithmetic
MATCH (subscription:Subscription)
WHERE subscription.expires_at < date() + duration('P7D')
AND subscription.auto_renew = false
RETURN subscription.user_email, subscription.expires_at
ORDER BY subscription.expires_at;
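The duration('PT30M') comparison behaves like ordinary datetime arithmetic; the same session-length filter in Python (stdlib only) for cross-checking:

```python
from datetime import datetime, timedelta

# Toy sessions: (user, start_time, end_time)
sessions = [
    ("u1", datetime(2025, 1, 1, 9, 0), datetime(2025, 1, 1, 9, 45)),
    ("u2", datetime(2025, 1, 1, 9, 0), datetime(2025, 1, 1, 9, 10)),
]

# Keep sessions longer than 30 minutes,
# like WHERE session_length > duration('PT30M')
long_sessions = [
    (user, end - start)
    for user, start, end in sessions
    if end - start > timedelta(minutes=30)
]
```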
Geospatial Queries
Geode extends GQL with spatial operations:
// Point-based proximity search
MATCH (store:Store)
WHERE point.distance(
store.location,
point({latitude: $user_lat, longitude: $user_lon})
) < 5000 // meters
RETURN store.name,
store.address,
point.distance(store.location,
point({latitude: $user_lat, longitude: $user_lon})) AS distance
ORDER BY distance
LIMIT 10;
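point.distance over latitude/longitude pairs corresponds to great-circle distance; a haversine sketch in Python gives a way to estimate the 5000 m cutoff offline (assumes a spherical Earth, so expect roughly half a percent of error versus a true geodesic):

```python
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    r = 6_371_000  # mean Earth radius in meters
    p1, p2 = radians(lat1), radians(lat2)
    dp, dl = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dp / 2) ** 2 + cos(p1) * cos(p2) * sin(dl / 2) ** 2
    return 2 * r * asin(sqrt(a))

# One degree of latitude is roughly 111 km everywhere on the sphere.
d = haversine_m(0.0, 0.0, 1.0, 0.0)
```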
// Polygon containment
MATCH (property:Property)
WHERE point.within(
property.location,
polygon($neighborhood_boundary)
)
RETURN property.address, property.price;
Vector Similarity Search
Combine graph traversal with vector search:
// Semantic search over graph structures
MATCH (doc:Document)
WHERE vector.cosineSimilarity(doc.embedding, $query_embedding) > 0.8
OPTIONAL MATCH (doc)-[:CITES]->(cited:Document)
RETURN doc.title,
doc.abstract,
vector.cosineSimilarity(doc.embedding, $query_embedding) AS similarity,
collect(cited.title) AS citations
ORDER BY similarity DESC
LIMIT 20;
// Hybrid search combining keywords and vectors
MATCH (article:Article)
WHERE article.content SEARCH $keywords
WITH article, SCORE(article) AS text_score
WITH article, text_score,
     vector.cosineSimilarity(article.embedding, $query_vector) AS semantic_score
WHERE semantic_score > 0.7
RETURN article.title,
       text_score,
       semantic_score,
       text_score * 0.4 + semantic_score * 0.6 AS combined_score
ORDER BY combined_score DESC;
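Both the cosine similarity and the 0.4/0.6 blend are easy to verify with a few lines of plain Python:

```python
from math import sqrt

def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

text_score = 0.9
semantic_score = cosine_similarity([1.0, 0.0], [1.0, 0.0])  # identical direction
combined_score = text_score * 0.4 + semantic_score * 0.6
```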
Client Library Integration
Python Advanced Usage
from geode_client import Client, QueryBuilder
import asyncio

async def advanced_query_patterns():
    client = Client(host="localhost", port=3141)

    async with client.connection() as conn:
        # Prepared statement caching
        stmt = await conn.prepare("""
            MATCH (u:User {id: $user_id})-[:FRIEND*1..2]->(friend)
            RETURN DISTINCT friend.name, friend.email
        """)

        # Execute with different parameters
        for user_id in range(1, 100):
            result, _ = await stmt.execute(user_id=user_id)
            friends = [row for row in result.rows]
            print(f"User {user_id} has {len(friends)} friends")

    # Transaction with savepoints
    async with client.connection() as txn:
        await txn.begin()
        await txn.execute("CREATE (u:User {id: 1000, name: 'Alice'})")
        await txn.savepoint("after_user")
        try:
            await txn.execute("""
                MATCH (u:User {id: 1000})
                CREATE (u)-[:POSTED]->(p:Post {title: 'Hello'})
            """)
        except Exception as e:
            await txn.rollback_to("after_user")
            print(f"Post creation failed: {e}")
        # Transaction commits automatically on context exit

    # Streaming large result sets
    query = """
        MATCH (n:Node)
        RETURN n.id, n.data
        ORDER BY n.created_at
    """
    async with client.stream(query) as stream:
        batch = []
        for row in stream.rows:
            batch.append(row)
            if len(batch) >= 1000:
                await process_batch(batch)
                batch = []
        if batch:
            await process_batch(batch)

async def process_batch(batch):
    # Process batch of results
    pass

asyncio.run(advanced_query_patterns())
Go Advanced Usage
package main

import (
	"context"
	"database/sql"
	"fmt"
	"time"

	_ "geodedb.com/geode"
)

func advancedQueries() error {
	db, err := sql.Open("geode", "quic://localhost:3141")
	if err != nil {
		return err
	}
	defer db.Close()

	// Connection pooling configuration
	db.SetMaxOpenConns(25)
	db.SetMaxIdleConns(10)
	db.SetConnMaxLifetime(5 * time.Minute)

	ctx := context.Background()

	// Prepared statements
	stmt, err := db.PrepareContext(ctx, `
		MATCH (u:User {id: $1})-[:PURCHASED]->(p:Product)
		WHERE p.price > $2
		RETURN p.name, p.price
		ORDER BY p.price DESC
	`)
	if err != nil {
		return err
	}
	defer stmt.Close()

	// Execute with different parameters
	rows, err := stmt.QueryContext(ctx, 123, 100.0)
	if err != nil {
		return err
	}
	defer rows.Close()

	for rows.Next() {
		var name string
		var price float64
		if err := rows.Scan(&name, &price); err != nil {
			return err
		}
		fmt.Printf("%s: $%.2f\n", name, price)
	}

	// Transaction with retry logic
	return withRetry(ctx, db, func(tx *sql.Tx) error {
		_, err := tx.ExecContext(ctx, `
			MATCH (account:Account {id: $1})
			WHERE account.balance >= $2
			SET account.balance = account.balance - $2
		`, "acc_123", 50.0)
		return err
	})
}

func withRetry(ctx context.Context, db *sql.DB, fn func(*sql.Tx) error) error {
	maxRetries := 3
	for attempt := 0; attempt < maxRetries; attempt++ {
		tx, err := db.BeginTx(ctx, nil)
		if err != nil {
			return err
		}
		err = fn(tx)
		if err != nil {
			tx.Rollback()
			if isSerializationError(err) && attempt < maxRetries-1 {
				time.Sleep(time.Millisecond * time.Duration(1<<attempt))
				continue
			}
			return err
		}
		return tx.Commit()
	}
	return fmt.Errorf("max retries exceeded")
}

func isSerializationError(err error) bool {
	// Check for serialization error
	return false // Implement based on error codes
}
Troubleshooting Common Issues
Query Performance Problems
Symptom: Slow query execution
Diagnosis:
// Use EXPLAIN to see query plan
EXPLAIN
MATCH (u:User)-[:FRIEND*1..3]->(friend)
WHERE u.country = 'USA'
RETURN friend.name;
// Use PROFILE for detailed metrics
PROFILE
MATCH (u:User)-[:FRIEND*1..3]->(friend)
WHERE u.country = 'USA'
RETURN friend.name;
Solutions:
- Add indexes on frequently filtered properties
- Rewrite variable-length patterns with LIMIT
- Break complex queries into smaller pieces with WITH
- Use OPTIONAL MATCH instead of EXISTS when appropriate
Memory Issues with Large Result Sets
Symptom: Out of memory errors
Solutions:
// Add LIMIT to bound results
MATCH (n:Node)
RETURN n
LIMIT 10000;
// Use aggregation to reduce result size
MATCH (u:User)-[:PURCHASED]->(p:Product)
RETURN p.category, count(u) AS buyers
ORDER BY buyers DESC;
// Process in batches
MATCH (n:Node)
WITH n
SKIP $offset
LIMIT $batch_size
RETURN n;
Deadlock and Conflict Errors
Symptom: Transaction aborted due to conflicts
Solutions:
- Implement retry logic with exponential backoff
- Access resources in consistent order
- Keep transactions short
- Use optimistic locking patterns
import asyncio

from geode_client import QueryError

async def retry_on_conflict(client, query, params, max_attempts=3):
    for attempt in range(max_attempts):
        async with client.connection() as conn:
            await conn.begin()
            try:
                page, _ = await conn.query(query, params)
                await conn.commit()
                return page
            except QueryError as exc:
                await conn.rollback()
                if "40502" not in str(exc):
                    raise
                if attempt == max_attempts - 1:
                    raise
                await asyncio.sleep(0.1 * (2 ** attempt))
Production Deployment Best Practices
Monitoring and Observability
# Export query metrics
import time

from prometheus_client import Counter, Histogram

query_duration = Histogram(
    'geode_query_duration_seconds',
    'Query execution time',
    ['query_type']
)
query_errors = Counter(
    'geode_query_errors_total',
    'Query error count',
    ['error_type']
)

async def monitored_query(client, query, params):
    start = time.time()
    try:
        result, _ = await client.query(query, params)
        query_duration.labels(query_type='read').observe(time.time() - start)
        return result
    except Exception as e:
        query_errors.labels(error_type=type(e).__name__).inc()
        raise
Query Caching Strategy
import hashlib
import time

class QueryCache:
    def __init__(self, max_size=1000, ttl=300):
        self.cache = {}
        self.max_size = max_size
        self.ttl = ttl

    def get_cache_key(self, query, params):
        return hashlib.md5(
            f"{query}:{params}".encode()
        ).hexdigest()

    async def execute(self, client, query, params):
        cache_key = self.get_cache_key(query, params)
        if cache_key in self.cache:
            entry = self.cache[cache_key]
            if time.time() - entry['timestamp'] < self.ttl:
                return entry['result']
        result, _ = await client.query(query, params)
        self.cache[cache_key] = {
            'result': result,
            'timestamp': time.time()
        }
        return result
Load Balancing and Failover
class GeodeCluster:
    def __init__(self, nodes):
        self.nodes = nodes
        self.current_index = 0

    async def get_client(self):
        """Round-robin load balancing."""
        for _ in range(len(self.nodes)):
            node = self.nodes[self.current_index]
            self.current_index = (self.current_index + 1) % len(self.nodes)
            try:
                client = Client(host=node['host'], port=node['port'])
                conn = await client.connect()
                try:
                    await conn.query("RETURN 1")
                    return conn
                except Exception:
                    await conn.close()
                    continue
            except Exception:
                continue
        raise Exception("No healthy nodes available")

    async def execute(self, query, params):
        conn = await self.get_client()
        try:
            result, _ = await conn.query(query, params)
            return result
        finally:
            await conn.close()
Geode’s GQL implementation, aligned with the conformance profile, provides a production-ready platform for graph workloads. Whether you’re building social networks, fraud detection systems, recommendation engines, or knowledge graphs, GQL’s declarative syntax and Geode’s optimized execution deliver both developer productivity and runtime performance.