Graph Databases 101

The Graph Databases 101 category provides foundational knowledge for understanding graph databases, their unique characteristics, and when to use them. Whether you’re new to graph databases or coming from relational databases, these resources help you build a solid conceptual foundation.

Introduction to Graph Databases

Graph databases represent and store data as networks of interconnected nodes and relationships. Unlike traditional relational databases that use tables and foreign keys, graph databases make relationships first-class citizens of the data model, enabling natural representation of connected data and efficient traversal queries.

The graph database paradigm has gained prominence as applications increasingly need to model and query complex, interconnected data: social networks, recommendation engines, fraud detection systems, knowledge graphs, and network topology analysis all benefit from graph database technology.

Geode implements the ISO/IEC 39075:2024 Graph Query Language (GQL) standard, providing a standardized, SQL-like approach to working with graph data. This makes graph databases more accessible to developers familiar with relational databases while preserving the power and flexibility of graph thinking.

What is a Graph Database?

The Graph Data Model

A graph consists of three fundamental elements:

Nodes (also called vertices):

Represent entities in your domain (people, products, places, concepts)
Can have one or more labels categorizing their type
Store properties as key-value pairs

(alice:Person {
    name: 'Alice Anderson',
    age: 30,
    email: 'alice@example.com'
})

Relationships (also called edges):

Connect two nodes with a specific, meaningful connection
Have a type that describes the nature of the connection
Are directed (have a start node and end node)
Can store properties describing the relationship

-[:KNOWS {
    since: 2020,
    relationship_type: 'colleague'
}]->

Properties:

Key-value pairs attached to nodes or relationships
Support various data types: strings, numbers, booleans, dates, lists, maps
Enable rich descriptions of entities and connections
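
To make these three elements concrete, here is a minimal in-memory sketch in Python. The `Node` and `Relationship` classes are hypothetical names chosen for illustration; they are not Geode's storage format or client API.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    """An entity with one or more labels and key-value properties."""
    labels: set[str]
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """A typed, directed connection from `start` to `end` with its own properties."""
    type: str
    start: Node
    end: Node
    properties: dict[str, Any] = field(default_factory=dict)


alice = Node({"Person"}, {"name": "Alice Anderson", "age": 30})
bob = Node({"Person"}, {"name": "Bob"})
knows = Relationship("KNOWS", alice, bob, {"since": 2020, "relationship_type": "colleague"})
```

The relationship holds direct references to both endpoint nodes, which is what makes traversal a matter of following references rather than matching keys.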

A Simple Example

Representing a small social network:

-- Create people
CREATE (alice:Person {name: 'Alice', city: 'New York'});
CREATE (bob:Person {name: 'Bob', city: 'San Francisco'});
CREATE (carol:Person {name: 'Carol', city: 'New York'});

-- Create relationships
MATCH (a:Person {name: 'Alice'}), (b:Person {name: 'Bob'})
CREATE (a)-[:KNOWS {since: 2020}]->(b);

MATCH (a:Person {name: 'Alice'}), (c:Person {name: 'Carol'})
CREATE (a)-[:KNOWS {since: 2019}]->(c);

-- Query: Who does Alice know?
MATCH (alice:Person {name: 'Alice'})-[:KNOWS]->(friend)
RETURN friend.name, friend.city;

Why Use a Graph Database?

Relationship-Centric Data

Problem: Traditional databases struggle with deeply connected data.

Relational approach (SQL):

-- Multiple self-joins get expensive quickly
SELECT DISTINCT p4.name
FROM persons p1
JOIN friendships f1 ON p1.id = f1.person1_id
JOIN persons p2 ON f1.person2_id = p2.id
JOIN friendships f2 ON p2.id = f2.person1_id
JOIN persons p3 ON f2.person2_id = p3.id
JOIN friendships f3 ON p3.id = f3.person1_id
JOIN persons p4 ON f3.person2_id = p4.id
WHERE p1.name = 'Alice';
-- Exactly three hops (friends of friends of friends) already takes six joins

Graph approach (GQL):

-- Natural, efficient pattern matching
MATCH (alice:Person {name: 'Alice'})-[:KNOWS*1..3]->(connection)
RETURN DISTINCT connection.name;
-- Friends within 1 to 3 hops: one pattern, no explicit joins
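
As a rough illustration of what a bounded variable-length pattern asks the engine to do, the following Python sketch performs a breadth-first traversal limited to a maximum number of hops. The `within_hops` function and the sample `knows` mapping are assumptions made for this example, and a real engine's evaluation strategy may differ. Unlike a pattern query, this sketch never returns the start node itself, even if a cycle leads back to it.

```python
from collections import deque


def within_hops(knows, start, max_hops):
    """Return everyone reachable from `start` in 1..max_hops directed KNOWS hops."""
    seen = {start}
    reached = set()
    frontier = deque([(start, 0)])
    while frontier:
        person, depth = frontier.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for friend in knows.get(person, []):
            if friend not in seen:
                seen.add(friend)
                reached.add(friend)
                frontier.append((friend, depth + 1))
    return reached


# Hypothetical sample data: outgoing KNOWS relationships per person
knows = {
    "Alice": ["Bob", "Carol"],
    "Bob": ["Dave"],
    "Dave": ["Erin"],
    "Erin": ["Frank"],
}
print(sorted(within_hops(knows, "Alice", 3)))  # ['Bob', 'Carol', 'Dave', 'Erin']
```

Frank is four hops away, so he falls outside the limit.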

Variable-Depth Queries

Graph databases excel at queries where the depth of traversal isn’t known in advance:

-- Find all reporting relationships up the management chain
MATCH (employee:Person {name: 'Alice'})-[:REPORTS_TO*]->(manager)
RETURN manager.name, manager.title;

-- Find shortest path between two people
MATCH path = shortestPath(
    (a:Person {name: 'Alice'})-[:KNOWS*]-(b:Person {name: 'Zach'})
)
RETURN path;

In relational databases, queries like these need recursive common table expressions or application-side loops, and they tend to become expensive as the traversal depth grows.
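
For intuition about the shortest-path query, here is a minimal Python sketch of breadth-first search over undirected relationships. It is a textbook technique used for illustration, not a description of how Geode computes paths; the `shortest_path` function and the sample `edges` are assumptions.

```python
from collections import deque


def shortest_path(edges, source, target):
    """Return one shortest path (list of names) ignoring direction, or None."""
    # Build an undirected adjacency map from the edge list.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    parent = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        current = queue.popleft()
        if current == target:
            # Walk parent links back to the source, then reverse.
            path = []
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        for nxt in sorted(neighbors.get(current, ())):
            if nxt not in parent:
                parent[nxt] = current
                queue.append(nxt)
    return None


# Hypothetical KNOWS relationships
edges = [("Alice", "Bob"), ("Bob", "Zach"), ("Alice", "Carol"),
         ("Carol", "Dana"), ("Dana", "Zach")]
print(shortest_path(edges, "Alice", "Zach"))  # ['Alice', 'Bob', 'Zach']
```

Because breadth-first search explores the graph level by level, the first time it reaches the target it has found a path with the fewest hops.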

Schema Flexibility

Graph databases offer flexible schemas that can evolve:

-- Different types of entities coexist naturally
CREATE (p:Person {name: 'Alice'});
CREATE (c:Company {name: 'Acme Corp'});
CREATE (pr:Product {name: 'Widget'});

-- Relationships can connect any nodes
MATCH (p:Person {name: 'Alice'}), (c:Company {name: 'Acme Corp'})
CREATE (p)-[:WORKS_AT]->(c);

MATCH (c:Company {name: 'Acme Corp'}), (pr:Product {name: 'Widget'})
CREATE (c)-[:PRODUCES]->(pr);

-- Add new properties without migrations
MATCH (p:Person {name: 'Alice'})
SET p.department = 'Engineering', p.start_date = date('2024-01-15');

Graph Databases vs. Other Database Types

vs. Relational Databases (SQL)

Relational Databases (PostgreSQL, MySQL, Oracle):

Data Model: Tables with rows and columns
Relationships: Foreign keys and JOINs
Schema: Fixed schema with migrations
Best For: Structured data, reporting, transactions on single entities
Weak At: Many-to-many relationships, variable-depth queries

Graph Databases (Geode, Neo4j):

Data Model: Nodes and relationships
Relationships: First-class, directly traversable
Schema: Flexible, evolves with your application
Best For: Connected data, pattern matching, relationship analysis
Weak At: Aggregate calculations across entire dataset

When to Choose Graph:

Relationship queries are primary workload
Need to traverse variable depths
Data model evolves frequently
Connections between entities are as important as the entities themselves

vs. Document Databases (NoSQL)

Document Databases (MongoDB, CouchDB):

Data Model: JSON/BSON documents
Relationships: Embedded documents or references
Query Style: Key-value lookups, nested object queries
Best For: Hierarchical data, flexible schemas, read-heavy workloads

Graph Databases:

Data Model: Nodes and relationships
Relationships: Directed, but traversable in either direction and across multiple hops
Query Style: Pattern matching, path finding
Best For: Interconnected data, relationship analysis

Example: Representing a blog with comments

Document Database:

{
  "post_id": "post1",
  "title": "Graph Databases are Great",
  "author": "Alice",
  "comments": [
    {"user": "Bob", "text": "Great article!"},
    {"user": "Carol", "text": "Very informative"}
  ]
}

Simple to query a post’s comments
Finding all comments by a user means searching inside every post, or maintaining a separate index
No explicit representation of relationships between users

Graph Database:

-- One statement, so the variables stay in scope for the relationship patterns
CREATE (alice:User {name: 'Alice'}),
       (bob:User {name: 'Bob'}),
       (post:Post {title: 'Graph Databases are Great'}),
       (comment:Comment {text: 'Great article!'}),
       (alice)-[:AUTHORED]->(post),
       (bob)-[:COMMENTED]->(comment),
       (comment)-[:ON_POST]->(post);

Easy to find all posts by a user
Easy to find all comments by a user
Can explore user relationships through interactions
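
The difference in access paths can be sketched in Python. This is an illustration under simplifying assumptions: the document side has no secondary index on nested comment authors, the second post is hypothetical sample data, and a plain dictionary stands in for the graph's per-user relationships.

```python
# Document model: comments are embedded inside each post.
posts = [
    {"post_id": "post1", "author": "Alice", "comments": [
        {"user": "Bob", "text": "Great article!"},
        {"user": "Carol", "text": "Very informative"},
    ]},
    {"post_id": "post2", "author": "Carol", "comments": [  # hypothetical second post
        {"user": "Bob", "text": "Thanks for sharing"},
    ]},
]


def comments_by_user_documents(posts, user):
    """Scan every post and every embedded comment."""
    return [c["text"] for p in posts for c in p["comments"] if c["user"] == user]


# Graph model: each user is connected directly to the comments they wrote.
commented = {}
for post in posts:
    for comment in post["comments"]:
        commented.setdefault(comment["user"], []).append(comment["text"])


def comments_by_user_graph(commented, user):
    """Follow the user's own COMMENTED relationships only."""
    return commented.get(user, [])


print(comments_by_user_documents(posts, "Bob"))
print(comments_by_user_graph(commented, "Bob"))
```

Both calls return the same comments; the difference is that the first touches every post, while the second touches only what is attached to the user.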

vs. Key-Value Stores

Key-Value Stores (Redis, DynamoDB):

Data Model: Simple key-value pairs
Access Pattern: Direct key lookup, extremely fast
Best For: Caching, session storage, simple data structures

Graph Databases:

Data Model: Complex connected structures
Access Pattern: Pattern matching and traversals
Best For: Relationship queries and analytics

Use Together: Key-value stores for caching, graphs for relationship modeling.

Common Graph Database Use Cases

Social Networks

Represent users, posts, likes, comments, and friendships:

-- Find mutual friends
MATCH (me:User {id: 'user1'})-[:FRIENDS_WITH]-(mutual)-[:FRIENDS_WITH]-(them:User {id: 'user2'})
RETURN mutual.name;

-- Suggest friends (friends of friends not yet connected)
MATCH (me:User {id: 'user1'})-[:FRIENDS_WITH*2]-(suggested)
WHERE NOT (me)-[:FRIENDS_WITH]-(suggested) AND suggested.id <> 'user1'
RETURN suggested.name, count(*) as mutual_friends
ORDER BY mutual_friends DESC;
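
The friend-suggestion logic can be sketched outside the database as well. In this Python example, `suggest_friends` and the sample `friends` mapping are assumptions for illustration; friendships are stored symmetrically, so each person's set lists everyone they are friends with.

```python
from collections import Counter


def suggest_friends(friends, me):
    """Rank friends-of-friends who are not already friends, by mutual friend count."""
    mine = friends.get(me, set())
    counts = Counter()
    for friend in mine:
        for candidate in friends.get(friend, set()):
            if candidate != me and candidate not in mine:
                counts[candidate] += 1  # one more mutual friend
    return counts.most_common()


# Hypothetical symmetric friendship data
friends = {
    "user1": {"a", "b"},
    "a": {"user1", "c", "d"},
    "b": {"user1", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
print(suggest_friends(friends, "user1"))  # [('c', 2), ('d', 1)]
```

Each increment corresponds to one two-hop path from the user to the candidate, which is what `count(*)` tallies in the query.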

Recommendation Engines

Collaborative filtering and content-based recommendations:

-- Product recommendations based on similar users
MATCH (me:User {id: 'user1'})-[:PURCHASED]->(p:Product)
MATCH (p)<-[:PURCHASED]-(similar:User)-[:PURCHASED]->(rec:Product)
WHERE NOT (me)-[:PURCHASED]->(rec)
RETURN rec.name, count(*) as score
ORDER BY score DESC
LIMIT 10;
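
The scoring in this query can be reproduced with a short Python sketch. The `recommend` function and the sample purchase data are assumptions for illustration. A candidate product gains one point for every (shared product, similar user) combination that leads to it, which mirrors how `count(*)` counts matching paths.

```python
from collections import Counter


def recommend(purchases, me, limit=10):
    """Score products bought by users who share purchases with `me`."""
    mine = purchases.get(me, set())
    scores = Counter()
    for user, items in purchases.items():
        if user == me:
            continue
        shared = len(mine & items)  # products this user has in common with me
        if shared == 0:
            continue
        for item in items - mine:   # only products I have not bought yet
            scores[item] += shared
    return scores.most_common(limit)


# Hypothetical purchase history per user
purchases = {
    "user1": {"laptop", "mouse"},
    "user2": {"laptop", "mouse", "dock"},
    "user3": {"mouse", "keyboard"},
    "user4": {"monitor"},
}
print(recommend(purchases, "user1"))  # [('dock', 2), ('keyboard', 1)]
```

The monitor is never recommended because its only buyer shares no purchases with `user1`.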

Fraud Detection

Identify suspicious patterns in transactions and relationships:

-- Find accounts sharing suspicious attributes
MATCH (a1:Account)-[:HAS_PHONE]->(phone:Phone)<-[:HAS_PHONE]-(a2:Account)
MATCH (a1)-[:HAS_EMAIL]->(email:Email)<-[:HAS_EMAIL]-(a2)
WHERE a1.id < a2.id  -- report each pair once rather than in both orders
RETURN a1.id, a2.id, phone.number, email.address;

-- Detect circular transaction patterns
MATCH path = (a:Account)-[:TRANSFERRED_TO*3..5]->(a)
WHERE all(r IN relationships(path) WHERE r.amount > 1000)
RETURN path;
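
To show what the circular-pattern query is looking for, here is a minimal Python sketch that searches for transfer cycles of three to five hops in which every transfer exceeds a threshold. The `find_cycles` function and the sample data are assumptions for illustration. The sketch only reports simple cycles (no account repeats along the way), which may be stricter than a database engine's path semantics, and like the query it reports a cycle once per starting account.

```python
def find_cycles(transfers, min_len=3, max_len=5, min_amount=1000):
    """transfers: iterable of (source, target, amount). Returns cycles as account lists."""
    # Keep only transfers above the threshold, as the WHERE clause does.
    graph = {}
    for src, dst, amount in transfers:
        if amount > min_amount:
            graph.setdefault(src, []).append(dst)

    cycles = []

    def walk(start, node, path):
        # len(path) equals the hop count a cycle would have if closed from `node`.
        for nxt in graph.get(node, []):
            if nxt == start and min_len <= len(path) <= max_len:
                cycles.append(path + [start])
            elif nxt not in path and len(path) < max_len:
                walk(start, nxt, path + [nxt])

    for account in graph:
        walk(account, account, [account])
    return cycles


# Hypothetical transfers: (from, to, amount)
transfers = [
    ("A", "B", 5000), ("B", "C", 4000), ("C", "A", 3000),
    ("C", "D", 500), ("D", "A", 9000),
]
for cycle in find_cycles(transfers):
    print(" -> ".join(cycle))
```

The transfer from C to D is below the threshold, so no cycle through D is reported.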

Knowledge Graphs

Represent complex domain knowledge and relationships:

-- Medical knowledge graph
-- One statement, so the variables stay in scope for the relationship patterns
CREATE (disease:Disease {name: 'Type 2 Diabetes'}),
       (symptom:Symptom {name: 'Increased thirst'}),
       (treatment:Treatment {name: 'Metformin'}),
       (gene:Gene {name: 'TCF7L2'}),
       (disease)-[:HAS_SYMPTOM]->(symptom),
       (disease)-[:TREATED_WITH]->(treatment),
       (disease)-[:ASSOCIATED_WITH]->(gene);

-- Query: What are treatment options for diseases associated with a gene?
MATCH (gene:Gene {name: 'TCF7L2'})<-[:ASSOCIATED_WITH]-(disease)-[:TREATED_WITH]->(treatment)
RETURN disease.name, collect(treatment.name) as treatments;

Network and IT Operations

Model network topology and dependencies:

-- Infrastructure dependencies
-- One statement, so the variables stay in scope for the relationship patterns
CREATE (web:Server {name: 'web-server-1', type: 'nginx'}),
       (app:Server {name: 'app-server-1', type: 'nodejs'}),
       (db:Server {name: 'db-server-1', type: 'geode'}),
       (web)-[:DEPENDS_ON]->(app),
       (app)-[:DEPENDS_ON]->(db);

-- Find all services impacted if a server goes down
MATCH (failed:Server {name: 'db-server-1'})<-[:DEPENDS_ON*]-(impacted)
RETURN impacted.name, impacted.type;
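
The impact query walks `DEPENDS_ON` relationships backwards, from the failed server to everything that relies on it. A small Python sketch of that idea follows; the `impacted_by` function is an assumption for illustration, and the data mirrors the three servers above.

```python
from collections import deque


def impacted_by(depends_on, failed):
    """Return services that directly or transitively depend on `failed`."""
    # Invert the dependency map so we can walk from a server to its dependents.
    dependents = {}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(service)

    impacted = []
    seen = {failed}
    queue = deque([failed])
    while queue:
        for service in dependents.get(queue.popleft(), []):
            if service not in seen:
                seen.add(service)
                impacted.append(service)
                queue.append(service)
    return impacted


depends_on = {
    "web-server-1": ["app-server-1"],
    "app-server-1": ["db-server-1"],
}
print(impacted_by(depends_on, "db-server-1"))  # ['app-server-1', 'web-server-1']
```

In a graph database no separate inverted map is needed, because a relationship can be followed from either end.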

Core Graph Concepts

Directed vs. Undirected Relationships

Directed (most common in graph databases):

(alice)-[:FOLLOWS]->(bob)  -- Alice follows Bob; this alone says nothing about Bob following Alice

Undirected (represented as bidirectional):

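-- Create both directions for undirected relationships
MATCH (a:Person {name: 'Alice'}), (b:Person {name: 'Bob'})
CREATE (a)-[:FRIENDS_WITH]->(b), (b)-[:FRIENDS_WITH]->(a);

-- Or store one direction and query ignoring direction
MATCH (a:Person)-[:FRIENDS_WITH]-(b:Person)
RETURN a.name, b.name;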

Relationship Properties

Relationships can carry data:

MATCH (alice:Person {name: 'Alice'}), (bob:Person {name: 'Bob'})
CREATE (alice)-[:KNOWS {
    since: 2020,
    strength: 'strong',
    met_at: 'University',
    interaction_count: 150
}]->(bob);

-- Query using relationship properties
MATCH (p:Person)-[k:KNOWS]->(f:Person)
WHERE k.since >= 2020 AND k.strength = 'strong'
RETURN p.name, f.name, k.met_at;

Multiple Labels

Nodes can have multiple labels:

CREATE (alice:Person:Employee:Manager {
    name: 'Alice',
    employee_id: 'E123',
    department: 'Engineering'
});

-- Query using specific labels
MATCH (m:Manager)
RETURN m.name;

MATCH (e:Employee)
WHERE e.department = 'Engineering'
RETURN count(e);

Path Queries

Find paths through the graph:

-- Shortest path
MATCH path = shortestPath((a:Person {name: 'Alice'})-[:KNOWS*]-(z:Person {name: 'Zach'}))
RETURN path;

-- All paths up to length 5
MATCH path = (a:Person {name: 'Alice'})-[:KNOWS*1..5]-(connection)
RETURN path;

-- Paths matching complex patterns
MATCH path = (p:Person)-[:WORKS_AT]->(c:Company)-[:PRODUCES]->(pr:Product)
RETURN path;

The Graph Query Language (GQL)

ISO Standard

Geode implements ISO/IEC 39075:2024, the international standard for the GQL graph query language. This provides:

Standardization: A role comparable to the one SQL plays for relational databases
Portability: Skills and queries transfer between compliant databases
Maturity: Draws on experience with earlier property graph query languages
Future-Proof: A published, industry-backed standard is less likely to be abandoned than a vendor-specific language

Key GQL Concepts

Pattern Matching:

-- The heart of GQL: matching patterns in the graph
MATCH (p:Person)-[:WORKS_AT]->(c:Company)
WHERE c.name = 'Acme Corp'
RETURN p.name;

Property Access:

-- Access node and relationship properties
MATCH (p:Person)
RETURN p.name, p.age, p.email;

Filtering:

-- WHERE clause for conditions
MATCH (p:Person)
WHERE p.age > 30 AND p.city = 'New York'
RETURN p;

Aggregation:

-- COUNT, SUM, AVG, MIN, MAX
MATCH (c:Company)<-[:WORKS_AT]-(e:Employee)
RETURN c.name, count(e) as employee_count
ORDER BY employee_count DESC;

Graph Database Benefits

Performance

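Following a relationship is O(1) per hop:

Following a relationship is a pointer lookup, not a table scan
Traversal cost depends on the part of the graph a query touches, not on the total size of the data
Variable-depth traversals scale with the number of relationships visited, which can grow quickly with depth in dense graphs

Index-free adjacency:

Each node directly references its relationships
No index lookups needed to move from a node to its neighbors
Constant-time relationship following once the starting node has been located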
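
The contrast between scanning for relationships and following direct references can be sketched in Python. This is an analogy only: the dictionary of adjacency lists stands in for per-node references, and none of the names below describe Geode's internal storage.

```python
# Hypothetical relationships as (start, type, end) rows, like a join table
edges = [
    ("Alice", "KNOWS", "Bob"),
    ("Alice", "KNOWS", "Carol"),
    ("Bob", "KNOWS", "Dave"),
]


def neighbors_by_scan(edges, node):
    """Table-style lookup: work grows with the total number of relationships."""
    return [end for start, _, end in edges if start == node]


# Adjacency-style storage: each node keeps its own list of outgoing relationships.
adjacency = {}
for start, _, end in edges:
    adjacency.setdefault(start, []).append(end)


def neighbors_by_adjacency(adjacency, node):
    """Adjacency lookup: work grows only with this node's own relationship count."""
    return adjacency.get(node, [])


print(neighbors_by_scan(edges, "Alice"))
print(neighbors_by_adjacency(adjacency, "Alice"))
```

Both functions return the same neighbors. Adding a million unrelated relationships would slow the scan but leave the adjacency lookup for Alice unchanged.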

Developer Productivity

Natural modeling:

Graph structure mirrors how we think about connected data
Less impedance mismatch between the conceptual model and the stored data
Visual representation aids understanding

Expressive queries:

-- Complex relationships expressed simply
MATCH (user:User {id: 'user1'})-[:PURCHASED]->(product:Product)
MATCH (product)<-[:PURCHASED]-(other:User)-[:PURCHASED]->(recommendation:Product)
WHERE NOT (user)-[:PURCHASED]->(recommendation)
RETURN DISTINCT recommendation
LIMIT 10;

Flexibility

Schema evolution:

Add new node types without altering existing data
Add new relationship types as requirements evolve
Add properties to nodes and relationships dynamically

Polyglot persistence:

Use graphs for connected data alongside other database types
Complement relational databases for specific use cases
Integrate with document stores, caches, and data warehouses

Graph Database Challenges

When Not to Use Graphs

High-volume aggregate operations:

Summing all sales across millions of records
Statistical analysis on entire datasets
Better served by columnar or relational databases

Simple key-value lookups:

Session storage, caching
Better served by key-value stores like Redis

Hierarchical data without relationships:

Simple document structures
Better served by document databases

Time-series data:

IoT sensor readings, logs
Better served by time-series databases

Learning Curve

Different thinking:

Requires shift from tabular thinking to graph thinking
Pattern matching vs. SQL joins
Understanding graph algorithms and traversals

Solution: Start with simple models, gradually add complexity as understanding grows.

Getting Started with Graph Thinking

Step 1: Identify Entities and Relationships

Example: Modeling an e-commerce system

Entities (nodes):

Customers
Products
Orders
Categories
Vendors

Relationships:

Customer PLACED Order
Order CONTAINS Product
Product IN_CATEGORY Category
Vendor SUPPLIES Product

Step 2: Add Properties

Nodes:

(customer:Customer {
    id: 'C123',
    name: 'Alice',
    email: 'alice@example.com',
    join_date: date('2024-01-15')
})

(product:Product {
    id: 'P456',
    name: 'Laptop',
    price: 1299.99,
    stock: 50
})

Relationships:

-[:PLACED {
    order_date: datetime('2024-03-15T10:30:00'),
    total_amount: 1299.99
}]->

-[:CONTAINS {
    quantity: 1,
    unit_price: 1299.99
}]->

Step 3: Define Queries

What questions do you need to answer?

-- Customer purchase history
MATCH (c:Customer {id: 'C123'})-[:PLACED]->(o:Order)-[:CONTAINS]->(p:Product)
RETURN p.name, o.order_date;

-- Products frequently bought together
MATCH (p1:Product)<-[:CONTAINS]-(o:Order)-[:CONTAINS]->(p2:Product)
WHERE p1.id = 'P456' AND p1.id <> p2.id
RETURN p2.name, count(*) as frequency
ORDER BY frequency DESC;

-- Customer lifetime value
MATCH (c:Customer {id: 'C123'})-[placed:PLACED]->(o:Order)
RETURN sum(placed.total_amount) as lifetime_value;

Best Practices for Beginners

1. Start Simple

Begin with a small subset of your domain:

-- Phase 1: Basic entities
CREATE (alice:User {name: 'Alice'})-[:KNOWS]->(bob:User {name: 'Bob'});

-- Phase 2: Add properties
MATCH (alice:User {name: 'Alice'})
SET alice.email = 'alice@example.com', alice.joined = date('2024-01-15');

-- Phase 3: Add more relationship types
MATCH (alice:User {name: 'Alice'}), (bob:User {name: 'Bob'})
CREATE (alice)-[:FOLLOWS]->(bob);

-- Phase 4: Expand the model
MATCH (alice:User {name: 'Alice'})
CREATE (alice)-[:POSTED]->(post:Post {content: 'Hello world!'});

2. Use Meaningful Names

-- Good: Clear, descriptive names
CREATE (user:User {name: 'Alice'})-[:PURCHASED {date: date('2024-03-15')}]->(product:Product {name: 'Laptop'});

-- Bad: Unclear abbreviations
CREATE (u:U {n: 'Alice'})-[:P {d: '2024-03-15'}]->(pr:Pr {n: 'Laptop'});

3. Create Indexes Early

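-- Create indexes on frequently queried properties
-- (index and constraint syntax is engine-specific; confirm the exact form in the Geode documentation)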
CREATE INDEX user_email ON :User(email);
CREATE INDEX product_sku ON :Product(sku);
CREATE CONSTRAINT unique_user_id ON :User(id);

4. Visualize Your Model

Draw your graph model before implementing:

[User] -PURCHASED-> [Order] -CONTAINS-> [Product]
  |                                        |
  +-REVIEWED----------------------------->+

Related Topics

Getting Started: Practical guides to begin using Geode
Graph Modeling: Design patterns and modeling techniques
Query Language: Deep dive into GQL syntax and features
Examples: Practical code samples and applications
Use Cases: Industry-specific graph database applications
