Built-in functions extend GQL with powerful data manipulation, aggregation, and computation capabilities. Geode implements the complete ISO/IEC 39075:2024 function specification plus performance-optimized extensions.
Function Categories
Aggregation Functions
Compute summary values across result sets:
// COUNT - count rows or non-null values
MATCH (u:User)
RETURN COUNT(*) AS total_users,
COUNT(u.email) AS users_with_email
// SUM - sum numeric values
MATCH (o:Order)
RETURN SUM(o.total) AS revenue
// AVG - average of numeric values
MATCH (p:Product)-[:HAS_REVIEW]->(r:Review)
RETURN p.name, AVG(r.rating) AS avg_rating
// MIN/MAX - minimum and maximum values
MATCH (u:User)
RETURN MIN(u.age) AS youngest,
MAX(u.age) AS oldest
// COLLECT - collect values into list
MATCH (u:User)-[:LIKES]->(p:Product)
RETURN u.name, COLLECT(p.name) AS liked_products
// String aggregation
MATCH (t:Tag)
RETURN STRING_AGG(t.name, ', ') AS tag_list
String Functions
Text manipulation and analysis:
// UPPER/LOWER - case conversion
MATCH (u:User)
RETURN UPPER(u.name) AS name_upper,
LOWER(u.email) AS email_lower
// LENGTH - string length
MATCH (p:Post)
WHERE LENGTH(p.content) > 280
RETURN p
// SUBSTRING - extract substring
MATCH (u:User)
RETURN SUBSTRING(u.email, 0, POSITION('@', u.email)) AS username
// TRIM/LTRIM/RTRIM - remove whitespace
MATCH (t:Text)
RETURN TRIM(t.value) AS trimmed
// CONCAT - concatenate strings
MATCH (u:User)
RETURN CONCAT(u.first_name, ' ', u.last_name) AS full_name
// REPLACE - replace substring
MATCH (t:Text)
RETURN REPLACE(t.content, 'old', 'new') AS updated
// SPLIT - split string into list
MATCH (t:Tag)
RETURN SPLIT(t.path, '/') AS path_parts
// Regular expressions
MATCH (p:Product)
WHERE p.sku =~ '^PROD-[0-9]{4}$'
RETURN p
Mathematical Functions
Numeric computations:
// Basic math
RETURN ABS(-42) AS absolute,
CEIL(3.14) AS ceiling,
FLOOR(3.14) AS floor,
ROUND(3.14159, 2) AS rounded,
SIGN(-5) AS sign_value
// Power and roots
RETURN POWER(2, 10) AS power,
SQRT(16) AS square_root,
EXP(1) AS e,
LOG(100) AS natural_log,
LOG10(1000) AS log_base_10
// Trigonometry
RETURN SIN(3.14159) AS sine,
COS(0) AS cosine,
TAN(0.785) AS tangent,
ASIN(0.5) AS arcsine
// Random numbers
RETURN RAND() AS random_0_to_1,
FLOOR(RAND() * 100) AS random_0_to_99
Temporal Functions
Date and time operations:
// Current time functions
RETURN CURRENT_DATE AS today,
CURRENT_TIME AS now_time,
CURRENT_TIMESTAMP AS now_datetime,
NOW() AS timestamp
// Date construction
RETURN DATE('2024-06-15') AS specific_date,
DATETIME('2024-06-15T14:30:00') AS specific_datetime
// Date components
MATCH (e:Event)
RETURN e.date.year AS year,
e.date.month AS month,
e.date.day AS day,
e.date.dayOfWeek AS weekday,
e.date.weekOfYear AS week
// Duration arithmetic
MATCH (e:Event)
WHERE e.timestamp > NOW() - DURATION('P7D') // Last 7 days
RETURN e
// Date formatting
MATCH (e:Event)
RETURN FORMAT_DATETIME(e.timestamp, 'YYYY-MM-DD HH:mm:ss') AS formatted
List Functions
List manipulation and analysis:
// SIZE - list length
MATCH (u:User)
RETURN u.tags, SIZE(u.tags) AS tag_count
// HEAD/LAST - first and last elements
MATCH (u:User)
RETURN HEAD(u.tags) AS first_tag,
LAST(u.tags) AS last_tag
// TAIL - all but first element
MATCH (u:User)
RETURN TAIL(u.tags) AS remaining_tags
// REVERSE - reverse list order
MATCH (u:User)
RETURN REVERSE(u.scores) AS reversed
// RANGE - generate numeric range
RETURN RANGE(1, 10) AS one_to_ten,
RANGE(0, 100, 10) AS tens
// List comprehension
MATCH (u:User)
RETURN [tag IN u.tags WHERE LENGTH(tag) > 5] AS long_tags,
[x IN RANGE(1, 10) | x * x] AS squares
// REDUCE - fold/accumulate
MATCH (o:Order)-[:CONTAINS]->(i:Item)
WITH o, COLLECT(i.price * i.quantity) AS subtotals
RETURN o.id, REDUCE(total = 0, price IN subtotals | total + price) AS order_total
Map Functions
Work with map/object data:
// KEYS - get map keys
MATCH (u:User)
RETURN KEYS(u.settings) AS setting_keys
// VALUES - get map values
MATCH (u:User)
RETURN VALUES(u.settings) AS setting_values
// SIZE - count map entries
MATCH (u:User)
RETURN SIZE(u.profile) AS profile_field_count
// Map access
MATCH (u:User)
RETURN u.settings['theme'] AS theme,
u.settings.notifications AS notifications
Path Functions
Graph path manipulation:
// LENGTH - path length (number of relationships)
MATCH path = (a:User)-[:KNOWS*]->(b:User)
WHERE LENGTH(path) <= 3
RETURN path
// NODES - extract nodes from path
MATCH path = (a:User)-[:KNOWS*]->(b:User)
RETURN NODES(path) AS users_in_path
// RELATIONSHIPS - extract relationships
MATCH path = (a)-[r:KNOWS*]->(b)
RETURN RELATIONSHIPS(path) AS connections
// Path predicates
MATCH path = (a:User)-[:KNOWS*]->(b:User)
WHERE ALL(node IN NODES(path) WHERE node.active = true)
AND NONE(rel IN RELATIONSHIPS(path) WHERE rel.blocked = true)
RETURN path
Type Conversion Functions
Convert between data types:
// To string
RETURN TOSTRING(42) AS str,
TOSTRING(3.14) AS float_str,
TOSTRING(true) AS bool_str
// To integer
RETURN TOINTEGER('42') AS int,
TOINTEGER(3.14) AS truncated
// To float
RETURN TOFLOAT('3.14') AS float,
TOFLOAT(42) AS int_as_float
// To boolean
RETURN TOBOOLEAN('true') AS bool_true,
TOBOOLEAN(1) AS bool_from_int
Conditional Functions
Conditional logic and null handling:
// COALESCE - first non-null value
MATCH (u:User)
RETURN COALESCE(u.nickname, u.username, 'Anonymous') AS display_name
// CASE - conditional expressions
MATCH (u:User)
RETURN u.name,
CASE
WHEN u.age < 18 THEN 'minor'
WHEN u.age < 65 THEN 'adult'
ELSE 'senior'
END AS age_group
// NULLIF - return null if values equal
MATCH (u:User)
RETURN NULLIF(u.display_name, '') AS name // NULL if empty string
Scalar Functions
Single-value utilities:
// ID - get node/relationship ID
MATCH (n:User)
RETURN ID(n) AS node_id
// LABELS - get node labels
MATCH (n)
RETURN LABELS(n) AS node_labels
// TYPE - get relationship type
MATCH ()-[r]->()
RETURN TYPE(r) AS rel_type
// PROPERTIES - get all properties
MATCH (n:User {id: 123})
RETURN PROPERTIES(n) AS user_props
// EXISTS - check existence
MATCH (u:User)
WHERE EXISTS { (u)-[:PURCHASED]->(:Product) }
RETURN u
Hash and Encoding Functions
Cryptographic and encoding operations:
// Hash functions
RETURN SHA256('password') AS hash,
MD5('data') AS md5_hash
// Encoding
RETURN BASE64_ENCODE('Hello World') AS encoded,
BASE64_DECODE('SGVsbG8gV29ybGQ=') AS decoded,
URL_ENCODE('hello world') AS url_encoded
Advanced Function Usage
Window Functions
Compute over partitions:
// Row number within partition
MATCH (u:User)-[:PURCHASED]->(p:Product)
RETURN u.name,
p.name,
ROW_NUMBER() OVER (PARTITION BY u.id ORDER BY p.price DESC) AS rank
// Running totals
MATCH (t:Transaction)
RETURN t.date,
t.amount,
SUM(t.amount) OVER (ORDER BY t.date) AS running_total
// Moving averages
MATCH (m:Measurement)
RETURN m.timestamp,
m.value,
AVG(m.value) OVER (
ORDER BY m.timestamp
ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING
) AS moving_avg
User-Defined Functions
Extend with custom functions (Python example):
from geode_client import Client
client = Client(host="localhost", port=3141)
async with client.connection() as conn:
# Register custom function
await conn.execute("""
CREATE FUNCTION haversine_distance(
lat1 FLOAT, lon1 FLOAT,
lat2 FLOAT, lon2 FLOAT
) RETURNS FLOAT
LANGUAGE PYTHON
AS $$
import math
R = 6371 # Earth radius in km
dlat = math.radians(lat2 - lat1)
dlon = math.radians(lon2 - lon1)
a = (math.sin(dlat/2) ** 2 +
math.cos(math.radians(lat1)) *
math.cos(math.radians(lat2)) *
math.sin(dlon/2) ** 2)
c = 2 * math.asin(math.sqrt(a))
return R * c
$$
""")
# Use custom function
result, _ = await conn.query("""
MATCH (loc1:Location {name: 'NYC'}),
(loc2:Location {name: 'SF'})
RETURN haversine_distance(
loc1.lat, loc1.lon,
loc2.lat, loc2.lon
) AS distance_km
""")
Performance Considerations
Function Execution Cost
- Aggregations: Process all matching rows (O(n))
- String operations: Linear in string length
- Math functions: Constant time (O(1))
- Path functions: Linear in path length
Optimization Tips
// Expensive: function in WHERE with no index
MATCH (u:User)
WHERE UPPER(u.email) = 'ADMIN@EXAMPLE.COM'
RETURN u
// Better: index on computed column or filter differently
CREATE INDEX FOR (u:User) ON (u.email);
MATCH (u:User)
WHERE u.email = 'admin@example.com' // Case-insensitive index
RETURN u
// Avoid repeated function calls
MATCH (p:Product)
WITH p, LENGTH(p.description) AS desc_len // Compute once
WHERE desc_len > 100 AND desc_len < 500
RETURN p
Best Practices
- Use Indexes: Filter on indexed properties before applying functions
- Compute Once: Store function results in variables with WITH
- Null Safety: Always handle potential nulls with COALESCE or IS NULL
- Type Compatibility: Ensure function arguments match expected types
- Aggregation Scope: Use GROUP BY explicitly for clarity
- Function Composition: Chain functions for complex transformations
- Performance Testing: Profile expensive function calls on realistic data
Common Patterns
Calculate Derived Properties
MATCH (u:User)
SET u.full_name = CONCAT(u.first_name, ' ', u.last_name),
u.name_length = LENGTH(u.full_name)
Conditional Aggregation
MATCH (u:User)-[:PURCHASED]->(p:Product)
RETURN u.name,
COUNT(p) AS total_purchases,
SUM(CASE WHEN p.price > 100 THEN 1 ELSE 0 END) AS expensive_purchases,
AVG(CASE WHEN p.category = 'Electronics' THEN p.price ELSE NULL END) AS avg_electronics_price
Time-Based Analysis
MATCH (e:Event)
WHERE e.timestamp >= DATE_TRUNC('month', NOW())
WITH DATE_TRUNC('day', e.timestamp) AS day, COUNT(*) AS count
RETURN day, count
ORDER BY day
Graph-Specific Functions
Shortest Path Functions
-- Find shortest path between nodes
MATCH path = SHORTEST_PATH((a:User {id: 1})-[:KNOWS*]-(b:User {id: 100}))
RETURN path, length(path) AS hops;
-- All shortest paths (multiple paths with same length)
MATCH paths = ALL_SHORTEST_PATHS((a)-[:RELATED*]-(b))
RETURN paths;
-- Weighted shortest path
MATCH path = SHORTEST_PATH((a)-[r:ROAD*]-(b))
RETURN path, REDUCE(dist = 0, rel IN relationships(path) | dist + rel.distance) AS total_distance;
Centrality Functions
-- PageRank centrality
CALL graph.pagerank(nodes: [:User], relationships: [:FOLLOWS])
YIELD node, score
RETURN node.name, score
ORDER BY score DESC
LIMIT 10;
-- Betweenness centrality
CALL graph.betweenness([:User], [:KNOWS])
YIELD node, score
RETURN node.name, score;
-- Degree centrality
MATCH (n:User)
RETURN n.name,
SIZE([(n)-[:FOLLOWS]->() | 1]) AS out_degree,
SIZE([()<-[:FOLLOWS]-(n) | 1]) AS in_degree,
SIZE([(n)-[:FOLLOWS]-() | 1]) AS total_degree;
Community Detection
-- Louvain community detection
CALL graph.louvain([:User], [:KNOWS], {iterations: 10})
YIELD node, community
SET node.community = community;
-- Label propagation
CALL graph.label_propagation([:User], [:FOLLOWS])
YIELD node, community
RETURN community, COUNT(node) AS size
ORDER BY size DESC;
Geospatial Functions
Distance Calculations
-- Haversine distance (km)
MATCH (a:Location), (b:Location)
RETURN a.name, b.name,
geospatial.distance(
point({latitude: a.lat, longitude: a.lon}),
point({latitude: b.lat, longitude: b.lon})
) AS distance_km;
-- Find locations within radius
MATCH (center:Location {name: 'New York'})
MATCH (nearby:Location)
WHERE geospatial.distance(
point({latitude: center.lat, longitude: center.lon}),
point({latitude: nearby.lat, longitude: nearby.lon})
) < 100 -- 100 km radius
RETURN nearby.name, nearby.lat, nearby.lon;
Bounding Box Queries
-- Find all locations in bounding box
MATCH (loc:Location)
WHERE loc.lat >= $min_lat AND loc.lat <= $max_lat
AND loc.lon >= $min_lon AND loc.lon <= $max_lon
RETURN loc.name, loc.lat, loc.lon;
JSON Functions
JSON Processing
-- Parse JSON string
MATCH (d:Document)
WITH d, JSON_PARSE(d.content) AS json_obj
RETURN json_obj.title, json_obj.author;
-- Extract JSON field
MATCH (d:Document)
RETURN JSON_EXTRACT(d.metadata, '$.tags[0]') AS first_tag;
-- Build JSON object
MATCH (u:User)
RETURN JSON_OBJECT(
'name', u.name,
'email', u.email,
'age', u.age
) AS user_json;
-- Convert to JSON string
MATCH (u:User)
RETURN JSON_STRINGIFY(u) AS user_string;
Array Functions
-- Array contains
MATCH (u:User)
WHERE ARRAY_CONTAINS(u.tags, 'developer')
RETURN u;
-- Array length
MATCH (u:User)
RETURN u.name, ARRAY_LENGTH(u.interests) AS interest_count;
-- Array slice
MATCH (u:User)
RETURN u.name, ARRAY_SLICE(u.purchases, 0, 5) AS recent_purchases;
-- Array distinct
MATCH (u:User)
SET u.unique_tags = ARRAY_DISTINCT(u.tags);
Statistical Functions
Descriptive Statistics
-- Standard deviation
MATCH (u:User)
RETURN STDDEV(u.age) AS age_stddev,
STDDEV_POP(u.age) AS population_stddev;
-- Variance
MATCH (p:Product)
RETURN VARIANCE(p.price) AS price_variance;
-- Percentiles
MATCH (u:User)
RETURN PERCENTILE_CONT(u.age, 0.25) AS q1,
PERCENTILE_CONT(u.age, 0.50) AS median,
PERCENTILE_CONT(u.age, 0.75) AS q3;
-- Mode (most common value)
MATCH (u:User)
RETURN u.country, COUNT(*) AS frequency
ORDER BY frequency DESC
LIMIT 1;
Correlation
-- Correlation between two properties
MATCH (u:User)
RETURN CORR(u.age, u.purchase_count) AS age_purchase_correlation;
Text Search Functions
Full-Text Search
-- Full-text search
MATCH (p:Post)
WHERE FULLTEXT_SEARCH(p.content, 'graph database')
RETURN p.title, p.content;
-- Fuzzy search
MATCH (u:User)
WHERE FUZZY_MATCH(u.name, 'Alice', 2) -- Edit distance <= 2
RETURN u.name;
-- Soundex matching
MATCH (u:User)
WHERE SOUNDEX(u.last_name) = SOUNDEX('Smith')
RETURN u.name;
Text Analysis
-- Word count
MATCH (p:Post)
RETURN p.title, WORD_COUNT(p.content) AS words;
-- Tokenization
MATCH (t:Text)
RETURN TOKENIZE(t.content, ' ') AS tokens;
-- Stemming
MATCH (w:Word)
RETURN w.text, STEM(w.text) AS stemmed;
Vector Functions
Vector Operations
-- Cosine similarity
MATCH (a:Product {id: 1}), (b:Product)
WHERE b <> a
RETURN b.name,
COSINE_SIMILARITY(a.embedding, b.embedding) AS similarity
ORDER BY similarity DESC
LIMIT 10;
-- Euclidean distance
MATCH (a:Item), (b:Item)
RETURN a.id, b.id,
EUCLIDEAN_DISTANCE(a.features, b.features) AS distance;
-- Dot product
MATCH (v:Vector)
RETURN DOT_PRODUCT(v.vec1, v.vec2) AS dot_prod;
-- Vector magnitude
MATCH (v:Vector)
RETURN MAGNITUDE(v.embedding) AS mag;
Custom Function Development
Python UDFs
# Register custom function
await client.execute("""
CREATE FUNCTION sentiment_score(text STRING)
RETURNS FLOAT
LANGUAGE PYTHON
AS $$
from textblob import TextBlob
def sentiment_score(text):
blob = TextBlob(text)
return blob.sentiment.polarity
$$
""")
# Use custom function
result, _ = await client.query("""
MATCH (p:Post)
RETURN p.content,
sentiment_score(p.content) AS sentiment
ORDER BY sentiment DESC
""")
JavaScript UDFs
-- Create JavaScript function
CREATE FUNCTION calculate_discount(price FLOAT, customer_tier STRING)
RETURNS FLOAT
LANGUAGE JAVASCRIPT
AS $$
function calculate_discount(price, tier) {
const discounts = {
'gold': 0.20,
'silver': 0.10,
'bronze': 0.05
};
return price * (1 - (discounts[tier] || 0));
}
$$;
-- Use function
MATCH (p:Product)
RETURN p.name,
p.price,
calculate_discount(p.price, 'gold') AS discounted_price;
Performance Tips for Functions
Indexed Function Results
-- Create index on function result
CREATE INDEX user_name_upper ON User(UPPER(name));
-- Efficient case-insensitive search
MATCH (u:User)
WHERE UPPER(u.name) = UPPER($search_name)
RETURN u;
Materialized Function Results
-- Pre-compute expensive functions
MATCH (u:User)
SET u.full_name_normalized = UPPER(TRIM(CONCAT(u.first_name, ' ', u.last_name)));
-- Later queries are fast
MATCH (u:User {full_name_normalized: 'ALICE SMITH'})
RETURN u;
Avoid Repeated Function Calls
-- Inefficient: LENGTH called twice
MATCH (p:Product)
WHERE LENGTH(p.description) > 100 AND LENGTH(p.description) < 500
RETURN p;
-- Efficient: Compute once with WITH
MATCH (p:Product)
WITH p, LENGTH(p.description) AS desc_len
WHERE desc_len > 100 AND desc_len < 500
RETURN p;
Function Reference Summary
Aggregation
- COUNT, SUM, AVG, MIN, MAX, COLLECT, STRING_AGG
- STDDEV, VARIANCE, PERCENTILE_CONT
String
- UPPER, LOWER, LENGTH, SUBSTRING, TRIM, CONCAT, REPLACE, SPLIT
- FULLTEXT_SEARCH, FUZZY_MATCH, SOUNDEX
Math
- ABS, CEIL, FLOOR, ROUND, SIGN, POWER, SQRT, EXP, LOG
- SIN, COS, TAN, RAND
Temporal
- CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP, NOW
- DATE, DATETIME, DURATION, DATE_TRUNC
List
- SIZE, HEAD, LAST, TAIL, REVERSE, RANGE, REDUCE
Type Conversion
- TOSTRING, TOINTEGER, TOFLOAT, TOBOOLEAN
Conditional
- COALESCE, CASE, NULLIF
Graph
- SHORTEST_PATH, ALL_SHORTEST_PATHS, NODES, RELATIONSHIPS
- PageRank, Betweenness, Louvain
Geospatial
- geospatial.distance, point
Vector
- COSINE_SIMILARITY, EUCLIDEAN_DISTANCE, DOT_PRODUCT
Related Topics
- GQL Operators : Operators work alongside functions in expressions
- Data Types : Understand type compatibility for function arguments
- Query Optimization : Optimize function usage for performance
- Aggregation : Deep dive into aggregation patterns
- GQL Reference : Complete function documentation
- Stored Procedures : Creating custom functions
- Graph Algorithms : Algorithm implementations
Further Reading
- GQL Operators - Operators and expressions
- Data Types - Type system and conversions
- Query Performance - Optimize function calls
- Aggregation Patterns - Advanced aggregation techniques
- GQL Reference - Complete language documentation
- Function Performance Guide - Optimization strategies
- Custom Function Development - UDF best practices
Master Geode’s built-in functions to write expressive, powerful queries that transform and analyze your graph data efficiently.