Property Graph Model

The property graph model is the most widely adopted graph database model, used by Geode, Neo4j, Amazon Neptune, TigerGraph, and many others. It models complex, interconnected data by representing entities as nodes and the connections between them as relationships, both of which can carry properties.

Core Concepts

Nodes (Vertices)

Nodes represent entities in your domain model: people, products, locations, events, documents, or any discrete object.

Characteristics:

  • Unique identity (typically an ID)
  • Zero or more labels for categorization
  • Zero or more properties (key-value pairs)

-- Simple node
(p:Person)

-- Node with properties
(p:Person {
  id: 'user_12345',
  name: 'Alice Johnson',
  email: 'alice@example.com',
  age: 32,
  city: 'San Francisco',
  joined: DATE '2024-01-15',
  verified: true
})

-- Node with multiple labels
(e:Person:Employee:Manager {
  id: 'emp_789',
  name: 'Bob Smith',
  department: 'Engineering'
})

Relationships (Edges)

Relationships connect nodes and represent how entities relate to each other.

Characteristics:

  • Always directed (source → target)
  • Always have a type/label
  • Connect exactly two nodes (source and target)
  • Can have properties

-- Simple relationship
(alice:Person)-[:KNOWS]->(bob:Person)

-- Relationship with properties
(alice)-[:KNOWS {
  since: DATE '2020-05-15',
  context: 'college',
  strength: 0.85
}]->(bob)

-- Multiple relationships between same nodes
(alice)-[:FOLLOWS]->(bob)
(alice)-[:WORKS_WITH]->(bob)
(alice)-[:KNOWS]->(bob)
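
To make the node and relationship characteristics above concrete, here is a minimal in-memory sketch in Python. It only illustrates the data model and says nothing about how Geode or any other engine stores data; the class names `Node` and `Relationship` and their fields are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    """A node: unique identity, zero or more labels, zero or more properties."""
    id: str
    labels: set[str] = field(default_factory=set)
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """A relationship: directed, typed, connecting exactly two nodes."""
    type: str
    source: Node
    target: Node
    properties: dict[str, Any] = field(default_factory=dict)


alice = Node("user_12345", {"Person"}, {"name": "Alice Johnson", "age": 32})
bob = Node("emp_789", {"Person", "Employee", "Manager"}, {"name": "Bob Smith"})

# Several relationships of different types may connect the same pair of nodes.
relationships = [
    Relationship("KNOWS", alice, bob, {"context": "college", "strength": 0.85}),
    Relationship("FOLLOWS", alice, bob),
    Relationship("WORKS_WITH", alice, bob),
]

# Collect the types of all relationships whose source is Alice.
types_from_alice = sorted(r.type for r in relationships if r.source is alice)
print(types_from_alice)
```

Keeping direction as an explicit `source`/`target` pair, rather than an unordered pair, is what lets a query choose whether to respect or ignore it.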

Properties

Properties are key-value pairs that store data on nodes and relationships.

Supported Data Types (in Geode):

  • Primitives: INTEGER, FLOAT, STRING, BOOLEAN
  • Temporal: DATE, TIME, TIMESTAMP, DURATION
  • Complex: LIST, MAP
  • Special: NULL

-- Various property types
INSERT (p:Product {
  id: 'prod_123',                          -- STRING
  name: 'Wireless Headphones',             -- STRING
  price: 99.99,                            -- FLOAT
  stock: 47,                               -- INTEGER
  available: true,                         -- BOOLEAN
  released: DATE '2024-03-15',             -- DATE
  last_updated: CURRENT_TIMESTAMP,         -- TIMESTAMP
  tags: ['audio', 'wireless', 'premium'],  -- LIST
  specs: {
    battery: '30 hours',
    weight: '250g',
    bluetooth: '5.3'
  }                                        -- MAP
});

Labels

Labels categorize nodes into types or groups, enabling efficient filtering and schema organization.

Single Label:

INSERT (u:User {name: 'Alice'});
MATCH (u:User) RETURN u.name;

Multiple Labels:

-- Node can have multiple labels
INSERT (e:Person:Employee:Manager {name: 'Bob'});

-- Query by any label
MATCH (p:Person) WHERE p.name = 'Bob' RETURN p;
MATCH (m:Manager) WHERE m.name = 'Bob' RETURN m;

Directed vs. Undirected Relationships

The property graph model uses directed relationships by default, meaning every relationship has a source and target node.

Directed Relationships

-- Alice follows Bob (unidirectional)
(alice:Person)-[:FOLLOWS]->(bob:Person)

-- Query respects direction
MATCH (a:Person {name: 'Alice'})-[:FOLLOWS]->(followed)
RETURN followed.name;  -- Returns 'Bob'

MATCH (a:Person {name: 'Alice'})<-[:FOLLOWS]-(follower)
RETURN follower.name;  -- Returns nothing (no one follows Alice in this example)

Modeling Bidirectional Relationships

Option 1: Create two relationships

-- Mutual friendship
INSERT (alice)-[:FRIENDS_WITH]->(bob);
INSERT (bob)-[:FRIENDS_WITH]->(alice);

Option 2: Query without direction

-- Query both directions with undirected pattern
MATCH (alice:Person {name: 'Alice'})-[:FRIENDS_WITH]-(friend)
RETURN friend.name;
-- Returns friends regardless of direction

Option 3: Treat the relationship type as symmetric by convention

-- Relationships are still stored with a direction; create one and always query it undirected
INSERT (alice)-[:MARRIED_TO]->(bob);

-- Query either direction
MATCH (a:Person {name: 'Alice'})-[:MARRIED_TO]-(spouse)
RETURN spouse.name;  -- Returns 'Bob'
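
The difference between a directed pattern (`-[:T]->`) and an undirected pattern (`-[:T]-`) can be sketched outside the database. The Python below stores each relationship once as a `(source, type, target)` tuple and matches it in either direction at query time. The helper names `outgoing` and `either_direction` and the sample data are hypothetical.

```python
# Each relationship is stored once, as (source, type, target).
edges = [
    ("Alice", "FOLLOWS", "Bob"),
    ("Alice", "MARRIED_TO", "Bob"),
    ("Carol", "FRIENDS_WITH", "Alice"),
]


def outgoing(name: str, rel_type: str) -> list[str]:
    """Directed match: (name)-[:rel_type]->(other)."""
    return [t for s, r, t in edges if s == name and r == rel_type]


def either_direction(name: str, rel_type: str) -> list[str]:
    """Undirected match: (name)-[:rel_type]-(other)."""
    found = []
    for s, r, t in edges:
        if r != rel_type:
            continue
        if s == name:
            found.append(t)
        elif t == name:
            found.append(s)
    return found


print(outgoing("Bob", "MARRIED_TO"))          # directed, starting from Bob
print(either_direction("Bob", "MARRIED_TO"))  # undirected, starting from Bob
```

Storing one relationship and ignoring direction when reading avoids having to keep two mirrored relationships in sync.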

Schema Design Patterns

Entity Modeling

Rule of Thumb: If it’s a “thing” in your domain, it’s probably a node.

-- E-commerce domain model
INSERT (customer:Customer {id: 'c123', name: 'Alice'});
INSERT (product:Product {id: 'p456', name: 'Laptop', price: 1299});
INSERT (order:Order {id: 'o789', date: CURRENT_DATE, total: 1299});
INSERT (category:Category {name: 'Electronics'});

-- Relationships define the structure
INSERT (customer)-[:PLACED]->(order);
INSERT (order)-[:CONTAINS {quantity: 1}]->(product);
INSERT (product)-[:IN_CATEGORY]->(category);

Relationship Types

Choose descriptive, action-oriented relationship types:

Good Examples:

  • :KNOWS, :FOLLOWS, :LIKES
  • :WORKS_AT, :REPORTS_TO, :MANAGES
  • :PURCHASED, :CONTAINS, :SHIPPED_TO
  • :DEPENDS_ON, :IMPLEMENTS, :EXTENDS

Avoid:

  • Generic types like :RELATED_TO, :CONNECTED_TO
  • Nouns instead of verbs: :FRIEND (use :FRIENDS_WITH)

When to Use Nodes vs. Properties

Use a Node When:

  • Entity is referenced by multiple other entities
  • Entity has its own relationships
  • Entity has complex attributes
  • You need to query by this entity

Use a Property When:

  • Simple attribute value
  • Not queried independently
  • Not shared across entities
  • No relationships of its own

-- GOOD: Address as node (shared, has relationships)
INSERT (alice:Person {name: 'Alice'});
INSERT (addr:Address {street: '123 Main St', city: 'SF', zip: '94102'});
INSERT (alice)-[:LIVES_AT]->(addr);
-- Multiple people can live at same address

-- GOOD: Email as property (simple, unique to person)
INSERT (bob:Person {name: 'Bob', email: 'bob@example.com'});

-- BAD: Address as property (can't be shared)
INSERT (charlie:Person {
  name: 'Charlie',
  address: '123 Main St, SF 94102'  -- Can't share or query efficiently
});

Hierarchical Data

Tree Structures:

-- Organization hierarchy
INSERT (ceo:Employee {name: 'CEO', title: 'Chief Executive'});
INSERT (cto:Employee {name: 'CTO', title: 'Chief Technology Officer'});
INSERT (dev1:Employee {name: 'Alice', title: 'Senior Developer'});
INSERT (dev2:Employee {name: 'Bob', title: 'Developer'});

INSERT (cto)-[:REPORTS_TO]->(ceo);
INSERT (dev1)-[:REPORTS_TO]->(cto);
INSERT (dev2)-[:REPORTS_TO]->(dev1);

-- Query hierarchy
MATCH path = (e:Employee)-[:REPORTS_TO*]->(ceo:Employee {title: 'Chief Executive'})
RETURN e.name, length(path) AS levels_from_ceo
ORDER BY levels_from_ceo;
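
As a rough sketch of what the hierarchy query computes, the Python below walks the same reporting chain upward and counts hops to the CEO. The dictionary `reports_to` and the function `levels_from` are hypothetical stand-ins, and the sketch assumes each employee has at most one manager.

```python
from typing import Optional

# employee -> manager, mirroring the REPORTS_TO relationships above
reports_to = {"CTO": "CEO", "Alice": "CTO", "Bob": "Alice"}


def levels_from(employee: str, top: str) -> Optional[int]:
    """Count REPORTS_TO hops from employee up to top; None if unreachable."""
    hops = 0
    current = employee
    seen = {current}
    while current != top:
        manager = reports_to.get(current)
        if manager is None or manager in seen:  # no path, or a cycle
            return None
        seen.add(manager)
        current = manager
        hops += 1
    return hops


# Compute the depth of every employee who has a manager, shallowest first.
levels = {e: levels_from(e, "CEO") for e in reports_to}
for name, depth in sorted(levels.items(), key=lambda item: item[1]):
    print(name, depth)
```

The `seen` set guards against cycles, which a tree-shaped hierarchy should not contain but which nothing in the data model itself prevents.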

Category Hierarchies:

-- Product categories
INSERT (electronics:Category {name: 'Electronics'});
INSERT (computers:Category {name: 'Computers'});
INSERT (laptops:Category {name: 'Laptops'});

INSERT (laptops)-[:SUBCATEGORY_OF]->(computers);
INSERT (computers)-[:SUBCATEGORY_OF]->(electronics);

-- Find all products in 'Laptops' or any of its subcategories
MATCH (product:Product)-[:IN_CATEGORY]->(category:Category)
WHERE category.name = 'Laptops'
   OR EXISTS { MATCH (category)-[:SUBCATEGORY_OF*]->(:Category {name: 'Laptops'}) }
RETURN product.name;

Many-to-Many Relationships

Graph models naturally represent many-to-many relationships without junction tables.

-- Students and courses (many-to-many)
INSERT (alice:Student {name: 'Alice'});
INSERT (bob:Student {name: 'Bob'});
INSERT (math:Course {name: 'Mathematics 101'});
INSERT (cs:Course {name: 'Computer Science 101'});

-- Direct relationships
INSERT (alice)-[:ENROLLED_IN {semester: 'Fall 2024', grade: NULL}]->(math);
INSERT (alice)-[:ENROLLED_IN {semester: 'Fall 2024', grade: NULL}]->(cs);
INSERT (bob)-[:ENROLLED_IN {semester: 'Fall 2024', grade: NULL}]->(math);

-- Query: students in a course
MATCH (s:Student)-[:ENROLLED_IN]->(c:Course {name: 'Mathematics 101'})
RETURN s.name;

-- Query: courses for a student
MATCH (s:Student {name: 'Alice'})-[:ENROLLED_IN]->(c:Course)
RETURN c.name;

Reified Relationships (Hyperedges)

When a relationship needs relationships of its own, or has to connect more than two entities, convert it to a node.

-- BAD: Can't attach relationships to relationships directly
(alice)-[:WORKED_ON {role: 'Developer'}]->(project)
-- How to represent: manager approved this assignment?

-- GOOD: Reify relationship as node
INSERT (assignment:Assignment {
  role: 'Developer',
  start_date: DATE '2024-01-15',
  status: 'active'
});
INSERT (alice:Person)-[:ASSIGNED]->(assignment);
INSERT (assignment)-[:FOR_PROJECT]->(project:Project);
INSERT (manager:Person)-[:APPROVED]->(assignment);
-- No inverse :APPROVED_BY is needed; traverse :APPROVED in reverse to find the approver

Temporal Modeling

Model data that changes over time.

Version 1: Properties on Relationships:

-- Simple: dates on relationship
INSERT (alice)-[:EMPLOYED_BY {
  start_date: DATE '2024-01-15',
  end_date: NULL,
  position: 'Engineer'
}]->(company);

Version 2: Separate Time Periods:

-- Complex: multiple employment periods
INSERT (employment:Employment {
  position: 'Junior Engineer',
  start_date: DATE '2024-01-15',
  end_date: DATE '2024-06-30'
});
INSERT (alice:Person)-[:HAD_EMPLOYMENT]->(employment);
INSERT (employment)-[:AT_COMPANY]->(company:Company);

INSERT (employment2:Employment {
  position: 'Senior Engineer',
  start_date: DATE '2024-07-01',
  end_date: NULL  -- Current
});
INSERT (alice)-[:HAD_EMPLOYMENT]->(employment2);
INSERT (employment2)-[:AT_COMPANY]->(company);
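
With the separate-period model, a point-in-time question such as "what position did Alice hold on a given date?" becomes a range check over the Employment nodes, where a missing end_date means the period is still open. A minimal Python sketch of that check follows. The function name `position_on` and the list-of-dicts representation are assumptions made for illustration.

```python
from datetime import date
from typing import Optional

# The two Employment periods from the example above; end_date None = current.
employments = [
    {"position": "Junior Engineer",
     "start_date": date(2024, 1, 15), "end_date": date(2024, 6, 30)},
    {"position": "Senior Engineer",
     "start_date": date(2024, 7, 1), "end_date": None},
]


def position_on(day: date) -> Optional[str]:
    """Return the position held on `day`, or None if not employed then."""
    for period in employments:
        started = period["start_date"] <= day
        not_ended = period["end_date"] is None or day <= period["end_date"]
        if started and not_ended:
            return period["position"]
    return None


print(position_on(date(2024, 3, 1)))
print(position_on(date(2025, 1, 1)))
```

Treating both boundaries as inclusive is a choice of this sketch; whichever convention is used, adjacent periods should not overlap.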

Provenance and Metadata

Track data lineage and metadata.

-- Provenance pattern
INSERT (data:DataSet {name: 'Customer Analytics Q1 2024'});
INSERT (source1:DataSource {name: 'CRM Export', type: 'CSV'});
INSERT (source2:DataSource {name: 'Web Analytics', type: 'API'});
INSERT (transform:Transformation {
  script: 'etl_pipeline_v2.py',
  executed_at: CURRENT_TIMESTAMP,
  executed_by: 'etl_service'
});

INSERT (data)-[:DERIVED_FROM {timestamp: CURRENT_TIMESTAMP}]->(transform);
INSERT (transform)-[:USED_SOURCE]->(source1);
INSERT (transform)-[:USED_SOURCE]->(source2);

Advanced Modeling Patterns

Super Nodes (Hub Nodes)

Avoid nodes with millions of relationships. Any traversal that passes through such a node has to consider all of its relationships, which degrades query performance.

Problem:

-- BAD: Country node connected to millions of users
INSERT (user1)-[:LIVES_IN]->(usa:Country);
INSERT (user2)-[:LIVES_IN]->(usa:Country);
-- ... millions more

Solution 1: Use property instead

-- Store country as property, index it
INSERT (user:User {name: 'Alice', country: 'USA'});
CREATE INDEX user_country ON User(country);

Solution 2: Intermediate grouping nodes

-- Add state/region level
INSERT (user)-[:LIVES_IN]->(state:State {name: 'California'});
INSERT (state)-[:IN_COUNTRY]->(country:Country {name: 'USA'});
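
Before restructuring, it helps to know which nodes are actually hubs. One way is to count relationships per node and flag those above a threshold. The Python sketch below does this over a plain edge list. The threshold value and the names `degree_counts` and `find_super_nodes` are arbitrary choices for illustration, and a real deployment would compute degrees inside the database.

```python
from collections import Counter

# (source, type, target) triples; "USA" is deliberately over-connected.
edges = [(f"user{i}", "LIVES_IN", "USA") for i in range(1000)]
edges += [("user1", "FOLLOWS", "user2"), ("user2", "FOLLOWS", "user3")]


def degree_counts(edge_list):
    """Count incoming plus outgoing relationships for every node."""
    degrees = Counter()
    for source, _type, target in edge_list:
        degrees[source] += 1
        degrees[target] += 1
    return degrees


def find_super_nodes(edge_list, threshold=500):
    """Return nodes whose total degree exceeds the threshold."""
    return [n for n, d in degree_counts(edge_list).items() if d > threshold]


print(find_super_nodes(edges))
```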

Qualified Relationships

Add context to relationships through properties.

-- Social network with relationship context
INSERT (alice)-[:KNOWS {
  context: 'colleague',
  strength: 0.8,
  since: DATE '2022-03-15',
  last_interaction: CURRENT_TIMESTAMP
}]->(bob);

-- Query by relationship qualities
MATCH (a:Person {name: 'Alice'})-[k:KNOWS]->(friend)
WHERE k.context = 'colleague' AND k.strength > 0.5
RETURN friend.name, k.since;

Multi-Graph Modeling

Keep logically separate graphs, such as one per tenant, within a single database.

-- Use properties to segment graphs
INSERT (alice:Person {tenant_id: 'company_a', name: 'Alice'});
INSERT (bob:Person {tenant_id: 'company_b', name: 'Bob'});

-- Query single tenant
MATCH (p:Person {tenant_id: 'company_a'})
RETURN p.name;

-- Or use row-level security
CREATE POLICY tenant_isolation ON ALL
USING (tenant_id = current_tenant());

Schema Evolution

The property graph model supports flexible schema evolution: new properties, labels, and relationship types can be added without rewriting existing data.

Adding New Properties

-- Existing nodes
INSERT (user:User {name: 'Alice', email: 'alice@example.com'});

-- Later: add new property to specific nodes
MATCH (u:User {name: 'Alice'})
SET u.phone = '+1-555-0100';

-- New nodes include new property
INSERT (user2:User {
  name: 'Bob',
  email: 'bob@example.com',
  phone: '+1-555-0200'
});

Adding New Labels

-- Add label to existing node
MATCH (u:User {email: 'alice@example.com'})
SET u:PremiumUser;

-- Now node has both labels
MATCH (u:User:PremiumUser)
RETURN u.name;

Adding New Relationship Types

-- Existing relationships
INSERT (alice)-[:FOLLOWS]->(bob);

-- Add new relationship type
INSERT (alice)-[:BLOCKS]->(spammer:User {name: 'Spammer'});

-- Both relationship types coexist
MATCH (alice:User {name: 'Alice'})-[r]-(other)
RETURN type(r), other.name;

Best Practices

1. Model for Your Queries

Design your graph based on how you’ll query it, not just how you think about the domain.

-- If you frequently need "products a user might like"
-- Model user preferences and product attributes to support this query
MATCH (u:User {id: $user_id})-[:LIKES]->(pref:Preference)
MATCH (p:Product)-[:HAS_ATTRIBUTE]->(pref)
WHERE NOT EXISTS { MATCH (u)-[:PURCHASED]->(p) }
RETURN p.name, count(pref) AS match_score
ORDER BY match_score DESC;
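
The scoring logic in this query counts, for each product the user has not bought, how many of its attributes match the user's liked preferences. The same computation is sketched below in Python over small hand-made sets. All data and the function name `recommend` are made up for the example.

```python
from collections import Counter

likes = {"wireless", "premium", "audio"}   # (u)-[:LIKES]->(pref)
purchased = {"Headphones"}                 # (u)-[:PURCHASED]->(p)
has_attribute = {                          # (p)-[:HAS_ATTRIBUTE]->(pref)
    "Headphones": {"wireless", "audio", "premium"},
    "Speaker": {"wireless", "audio"},
    "Cable": {"audio"},
    "Desk": {"furniture"},
}


def recommend(likes, purchased, has_attribute):
    """Rank unpurchased products by number of shared preferences."""
    scores = Counter()
    for product, attrs in has_attribute.items():
        if product in purchased:
            continue  # corresponds to the NOT EXISTS filter
        shared = len(attrs & likes)
        if shared:  # products sharing no preference never match
            scores[product] = shared
    return scores.most_common()  # highest match_score first


print(recommend(likes, purchased, has_attribute))
```

Because preferences and product attributes are modeled as shared nodes, the match score is just the size of an overlap, which is what makes this query cheap to express.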

2. Use Meaningful Names

-- GOOD: Clear, specific names
(person:Person)-[:WORKS_AT]->(company:Company)
(user:User)-[:PURCHASED]->(product:Product)

-- BAD: Generic, unclear names
(node1:Entity)-[:RELATED_TO]->(node2:Entity)

3. Normalize Carefully

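Unlike in a strictly normalized relational schema, some denormalization is acceptable in a graph and can be beneficial when it saves a traversal on a frequently run query. Keep the relationship as well, so the duplicated value can always be re-derived.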

-- OK to duplicate data for query performance
INSERT (order:Order {
  id: 'o123',
  customer_id: 'c456',
  customer_name: 'Alice',  -- Denormalized for quick access
  total: 99.99
});
INSERT (order)-[:PLACED_BY]->(customer:Customer {id: 'c456', name: 'Alice'});

4. Index Strategically

-- Index properties used for lookups
CREATE INDEX user_email ON User(email);
CREATE INDEX product_sku ON Product(sku);
CREATE INDEX order_date ON Order(order_date);

5. Avoid Over-Connecting

Don’t create relationships just because data is related conceptually.

-- BAD: Redundant relationship
INSERT (user)-[:LIVES_IN]->(city:City);
INSERT (city)-[:IN_STATE]->(state:State);
INSERT (user)-[:LIVES_IN_STATE]->(state);  -- Redundant! Can be derived

-- GOOD: Derive through traversal
MATCH (user:User)-[:LIVES_IN]->(city)-[:IN_STATE]->(state)
RETURN state.name;

Common Modeling Mistakes

1. Storing Lists in Properties Instead of Relationships

-- BAD: List property
INSERT (user:User {
  name: 'Alice',
  friend_ids: ['user_2', 'user_3', 'user_4']  -- Hard to query
});

-- GOOD: Relationships
INSERT (alice:User {name: 'Alice'});
INSERT (bob:User {name: 'Bob'});
INSERT (charlie:User {name: 'Charlie'});
INSERT (alice)-[:FRIENDS_WITH]->(bob);
INSERT (alice)-[:FRIENDS_WITH]->(charlie);

2. Using Node Properties Instead of Relationships

-- BAD: Foreign key style
INSERT (order:Order {
  id: 'o123',
  customer_id: 'c456'  -- Relational thinking
});

-- GOOD: Relationship
INSERT (order:Order {id: 'o123'});
INSERT (customer:Customer {id: 'c456'});
INSERT (order)-[:PLACED_BY]->(customer);

3. Single Super Label

-- BAD: Generic label with type property
INSERT (entity:Entity {type: 'person', name: 'Alice'});
INSERT (entity:Entity {type: 'company', name: 'Acme Corp'});

-- GOOD: Specific labels
INSERT (person:Person {name: 'Alice'});
INSERT (company:Company {name: 'Acme Corp'});

Model Validation

Geode supports schema constraints to ensure data quality:

-- Unique constraint
CREATE CONSTRAINT user_email_unique ON User(email) UNIQUE;

-- Existence constraint
CREATE CONSTRAINT user_email_required ON User REQUIRE email;

-- Type constraint (implicit in GQL)
-- Properties automatically validated against declared types
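
Conceptually, a unique constraint rejects a second node with the same value, and an existence constraint rejects a node that lacks the property. The Python sketch below mimics both checks for User.email. It illustrates the expected behavior only and is not Geode's implementation; `UserStore` and `ConstraintError` are hypothetical names.

```python
class ConstraintError(ValueError):
    """Raised when an insert would violate a constraint."""


class UserStore:
    """Toy store enforcing existence and uniqueness of User.email."""

    def __init__(self):
        self._by_email = {}

    def insert(self, properties: dict) -> None:
        email = properties.get("email")
        if email is None:  # existence constraint
            raise ConstraintError("User.email is required")
        if email in self._by_email:  # unique constraint
            raise ConstraintError(f"User.email {email!r} already exists")
        self._by_email[email] = dict(properties)

    def __len__(self) -> int:
        return len(self._by_email)


store = UserStore()
store.insert({"name": "Alice", "email": "alice@example.com"})

# Both of these inserts violate a constraint and are rejected.
for candidate in ({"name": "Bob"},
                  {"name": "Alice 2", "email": "alice@example.com"}):
    try:
        store.insert(candidate)
    except ConstraintError as err:
        print("rejected:", err)
```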

Conclusion

The property graph model provides a natural, intuitive way to model complex, interconnected data. By understanding nodes, relationships, properties, and common design patterns, you can create efficient, queryable graph schemas that closely mirror your domain and support your application’s query requirements.

Explore the documentation below for detailed examples of modeling specific domains, advanced patterns, and optimization techniques.