Horizontal Scalability
Horizontal scalability is the ability to increase system capacity by adding more nodes to a distributed system rather than upgrading individual machines. Geode’s architecture is purpose-built for horizontal scaling, enabling organizations to grow from a single-node deployment to a distributed cluster; load-test models target 100M+ graph nodes and 1B+ relationships. This guide explores scaling strategies, architectural patterns, configuration options, and operational best practices for building large-scale graph databases.
Understanding Horizontal Scalability
Horizontal scalability (scale-out) differs fundamentally from vertical scalability (scale-up):
Vertical Scaling: Increase capacity by adding CPU, RAM, or storage to a single machine. Limited by hardware constraints and expensive at high end.
Horizontal Scaling: Increase capacity by adding more machines to the cluster. Theoretically unlimited and more cost-effective using commodity hardware.
For graph databases, horizontal scaling presents unique challenges due to data relationships that may span multiple nodes. Geode addresses these challenges through intelligent partitioning, distributed query execution, and optimized inter-node communication.
Scaling Dimensions
Throughput Scaling
Increase query processing capacity by distributing workload:
```yaml
scalability:
  throughput:
    # Add nodes to increase query capacity
    target_qps: 100000

    # Distribute queries across nodes
    query_distribution:
      strategy: "least_loaded"
      health_aware: true

    # Connection pooling per node
    connection_pool:
      size_per_node: 1000
      overflow: 500

    # Parallel query execution
    parallelism:
      max_workers_per_query: 16
      query_queue_size: 10000
```
Metrics:
- Queries per second (QPS)
- Transactions per second (TPS)
- Concurrent connections
- Request latency (p50, p95, p99)
Storage Scaling
Expand data capacity by adding storage nodes:
```yaml
scalability:
  storage:
    # Partition data across nodes
    partitioning:
      strategy: "consistent_hash"
      partitions: 64
      rebalance_threshold: 0.2

    # Storage tiers for hot/warm/cold data
    tiers:
      hot:
        node_count: 8
        storage_type: "nvme_ssd"
        data_age: "7d"
      warm:
        node_count: 16
        storage_type: "ssd"
        data_age: "30d"
      cold:
        node_count: 32
        storage_type: "hdd"
        data_age: "365d"
```
Metrics:
- Total storage capacity
- Data distribution balance
- Storage utilization per node
- IOPS per tier
Compute Scaling
Add processing power for complex analytics:
```yaml
scalability:
  compute:
    # Dedicated nodes for graph algorithms
    analytics_nodes: 8

    # CPU allocation
    cpu_allocation:
      oltp_nodes: 16        # cores per OLTP node
      analytics_nodes: 64   # cores per analytics node

    # Memory allocation
    memory_allocation:
      cache_per_node_gb: 128
      query_buffer_gb: 32
```
Metrics:
- CPU utilization
- Memory utilization
- Query complexity (operators, traversals)
- Algorithm execution time
Scaling Architectures
Shared-Nothing Architecture
Each node operates independently with local storage:
```yaml
architecture:
  type: "shared_nothing"
  nodes:
    # Each node is self-sufficient
    - id: "node1"
      cpu_cores: 32
      memory_gb: 256
      storage_tb: 10
      role: "data"
    - id: "node2"
      cpu_cores: 32
      memory_gb: 256
      storage_tb: 10
      role: "data"

  # Data partitioning
  partitioning:
    enabled: true
    strategy: "hash"

  # No shared storage
  shared_storage: false
```
Advantages:
- Linear scalability
- No single point of contention
- Independent failure domains
- Cost-effective with commodity hardware
Challenges:
- Cross-partition queries require coordination
- Data rebalancing on node addition/removal
- Distributed transaction complexity
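The `strategy: "hash"` setting above boils down to a stable hash of each entity key. A minimal sketch in Python (illustrative only; `partition_for` and `node_for` are hypothetical helpers, not Geode's API):

```python
# Hash-based partition assignment in a shared-nothing cluster (sketch).
import hashlib

def partition_for(key: str, num_partitions: int) -> int:
    """Map an entity key to a partition via a stable hash."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

def node_for(key: str, num_partitions: int, nodes: list) -> str:
    """Assume partitions are spread round-robin over the node list."""
    return nodes[partition_for(key, num_partitions) % len(nodes)]

# Same key always lands on the same node; distinct keys spread out.
nodes = ["node1", "node2"]
assert node_for("user:42", 64, nodes) == node_for("user:42", 64, nodes)
```

Because the hash is deterministic, any coordinator can route a single-key operation without consulting a directory; the cost appears when a query touches keys on many partitions.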
Separation of Compute and Storage
Decouple processing from storage for independent scaling:
```yaml
architecture:
  type: "disaggregated"

  # Compute layer (stateless)
  compute_layer:
    nodes: 20
    auto_scale:
      enabled: true
      min_nodes: 5
      max_nodes: 100
      target_cpu: 70

  # Storage layer (stateful)
  storage_layer:
    nodes: 50
    replication_factor: 3
    storage_type: "distributed_object_store"

  # Shared storage backend
  shared_storage:
    type: "s3_compatible"
    cache_tier: "local_nvme"
```
Advantages:
- Independent compute and storage scaling
- Stateless compute nodes (easy to add/remove)
- Elastic scaling based on workload
- Cost optimization (scale what you need)
Challenges:
- Network bandwidth becomes critical
- Higher latency for storage access
- Cache management complexity
Tiered Architecture
Organize nodes into specialized tiers:
```yaml
architecture:
  type: "tiered"
  tiers:
    # Tier 1: Query routers (stateless)
    query_tier:
      nodes: 10
      role: "coordinator"
      capabilities:
        - query_parsing
        - query_routing
        - result_aggregation

    # Tier 2: Caching layer
    cache_tier:
      nodes: 20
      role: "cache"
      memory_gb: 512
      cache_strategy: "lru"

    # Tier 3: Primary data nodes
    data_tier:
      nodes: 50
      role: "data"
      storage_tb: 10

    # Tier 4: Analytics nodes
    analytics_tier:
      nodes: 10
      role: "analytics"
      cpu_cores: 128
      memory_gb: 1024
```
Advantages:
- Workload isolation
- Optimized resource allocation per tier
- Independent tier scaling
- Specialized hardware per workload
Challenges:
- Operational complexity
- Increased latency (multiple hops)
- Cost of additional infrastructure
Elastic Scaling
Auto-Scaling Policies
Configure automatic cluster scaling:
```yaml
auto_scaling:
  enabled: true

  # Scale triggers
  triggers:
    # CPU-based scaling
    - metric: "cpu_utilization"
      threshold_high: 75
      threshold_low: 25
      action_scale_out: "add_2_nodes"
      action_scale_in: "remove_1_node"
      cooldown_minutes: 10

    # Memory-based scaling
    - metric: "memory_utilization"
      threshold_high: 80
      action_scale_out: "add_4_nodes"
      cooldown_minutes: 15

    # Query queue depth
    - metric: "query_queue_depth"
      threshold_high: 5000
      action_scale_out: "add_5_nodes"
      cooldown_minutes: 5

    # Storage capacity
    - metric: "storage_utilization"
      threshold_high: 70
      action_scale_out: "add_storage_nodes"
      node_count: 3

  # Scaling limits
  limits:
    min_nodes: 3
    max_nodes: 200
    max_scale_out_per_hour: 20
    max_scale_in_per_hour: 10

  # Node provisioning
  provisioning:
    cloud_provider: "aws"
    instance_type: "r6i.4xlarge"
    availability_zones: ["us-east-1a", "us-east-1b", "us-east-1c"]
```
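Each trigger above reduces to threshold-plus-cooldown logic: fire when the metric crosses the high threshold, then suppress further actions until the cooldown elapses. A minimal sketch (the `ScaleTrigger` class is hypothetical, not Geode's scheduler):

```python
# Threshold-plus-cooldown scale-out decision (sketch).
import time

class ScaleTrigger:
    def __init__(self, threshold_high: float, cooldown_s: float):
        self.threshold_high = threshold_high
        self.cooldown_s = cooldown_s
        self.last_fired = float("-inf")

    def should_scale_out(self, value: float, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        # Fire only above the threshold AND outside the cooldown window.
        if value > self.threshold_high and now - self.last_fired >= self.cooldown_s:
            self.last_fired = now
            return True
        return False

cpu = ScaleTrigger(threshold_high=75, cooldown_s=600)  # 10-minute cooldown
assert cpu.should_scale_out(80, now=0.0)        # above threshold: fire
assert not cpu.should_scale_out(90, now=60.0)   # suppressed by cooldown
assert cpu.should_scale_out(90, now=700.0)      # cooldown elapsed: fire again
```

The cooldown is what prevents oscillation: without it, a burst of load could add nodes faster than they can warm up, then trigger scale-in the moment they absorb the load.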
Dynamic Resource Allocation
Adjust resources based on workload patterns:
```yaml
dynamic_allocation:
  # Workload classification
  workload_detection:
    enabled: true
    classification_interval_seconds: 60

  # Resource profiles
  profiles:
    oltp_heavy:
      memory_allocation: 60
      cpu_allocation: 70
      connection_pool_size: 2000
    analytics_heavy:
      memory_allocation: 80
      cpu_allocation: 90
      connection_pool_size: 500
    balanced:
      memory_allocation: 70
      cpu_allocation: 80
      connection_pool_size: 1000

  # Automatic profile selection
  auto_profile:
    enabled: true
    switch_threshold: 0.3
    switch_cooldown_minutes: 5
```
Load Balancing
Query Load Balancing
Distribute queries across cluster nodes:
```yaml
load_balancing:
  query_distribution:
    # Load balancing algorithm
    algorithm: "weighted_least_connections"

    # Health-aware routing
    health_checks:
      enabled: true
      interval_seconds: 5
      unhealthy_threshold: 3

    # Node weights
    weights:
      node1: 100
      node2: 100
      node3: 50   # Reduced capacity

    # Sticky sessions
    sticky_sessions:
      enabled: true
      duration_minutes: 30

    # Circuit breaker
    circuit_breaker:
      enabled: true
      error_threshold: 50   # percent
      timeout_seconds: 60
```
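The `weighted_least_connections` choice above picks the healthy node with the lowest ratio of active connections to weight, so `node3` (weight 50) receives roughly half the traffic of a full-weight node. A sketch (illustrative; `pick_node` is a hypothetical helper):

```python
# Weighted-least-connections node selection (sketch).
# Assumes at least one healthy node with positive weight.

def pick_node(nodes):
    """nodes: list of dicts with name, weight, active_conns, healthy."""
    healthy = [n for n in nodes if n["healthy"] and n["weight"] > 0]
    # Lowest connections-per-unit-of-weight wins the next query.
    return min(healthy, key=lambda n: n["active_conns"] / n["weight"])["name"]

cluster = [
    {"name": "node1", "weight": 100, "active_conns": 80, "healthy": True},
    {"name": "node2", "weight": 100, "active_conns": 40, "healthy": True},
    {"name": "node3", "weight": 50,  "active_conns": 10, "healthy": True},
]
# Ratios are 0.8, 0.4, and 0.2, so node3 receives the next query.
assert pick_node(cluster) == "node3"
```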
Data Distribution
Balance data across nodes:
```yaml
load_balancing:
  data_distribution:
    # Automatic rebalancing
    rebalancing:
      enabled: true
      trigger_imbalance: 0.2    # 20% variance
      schedule: "0 2 * * SUN"   # Weekly at 2 AM

    # Hotspot detection
    hotspot_detection:
      enabled: true
      threshold_qps: 10000
      mitigation: "replicate_hot_data"

    # Partition migration
    migration:
      max_concurrent: 2
      bandwidth_limit_mbps: 1000
      verification: true
```
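The `trigger_imbalance: 0.2` check above can be read as: flag a rebalance when any partition's size deviates from the cluster mean by more than 20%. A sketch of that test (hypothetical helper, not Geode's rebalancer):

```python
# 20%-deviation rebalance trigger (sketch).
from statistics import mean

def needs_rebalance(partition_sizes, trigger=0.2):
    """True if any partition deviates from the mean by more than `trigger`."""
    avg = mean(partition_sizes)
    return any(abs(size - avg) / avg > trigger for size in partition_sizes)

assert not needs_rebalance([100, 105, 95, 110])  # all within 20% of mean
assert needs_rebalance([100, 100, 100, 180])     # one partition ~50% over mean
```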
Capacity Planning
Growth Modeling
Project future capacity requirements (replace values with your own benchmarks; numbers below are illustrative):
```yaml
# Capacity planning model
capacity_planning:
  # Current metrics
  current_state:
    nodes: 20
    total_storage_tb: 100
    peak_qps: 50000
    data_growth_rate_monthly: 0.15   # 15%

  # Projections
  projections:
    # 6-month forecast
    6_months:
      estimated_nodes: 32
      estimated_storage_tb: 195
      estimated_peak_qps: 80000
    # 12-month forecast
    12_months:
      estimated_nodes: 50
      estimated_storage_tb: 378
      estimated_peak_qps: 100000

  # Scaling thresholds
  thresholds:
    storage_trigger: 0.7   # Add nodes at 70% capacity
    qps_trigger: 0.8       # Scale out at 80% capacity
    safety_margin: 0.3     # 30% overhead
```
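The projected figures above are illustrative and real growth models fold in retention, compaction, and workload assumptions. The simplest building block is month-over-month compounding, shown here as a sketch (the `project_storage` helper is hypothetical):

```python
# Naive compounding growth projection (sketch).

def project_storage(current_tb: float, monthly_rate: float, months: int) -> float:
    """Storage after `months` of compounding growth at `monthly_rate`."""
    return current_tb * (1 + monthly_rate) ** months

# 100 TB growing 15% per month for 6 months:
assert round(project_storage(100, 0.15, 6), 1) == 231.3
```

Pure compounding tends to overshoot at longer horizons, which is one reason production models also subtract expired or compacted data before sizing hardware.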
Benchmark-Based Sizing
Determine cluster size from performance requirements (use measured per-node capacity from your benchmarks):
```yaml
sizing:
  requirements:
    # Performance targets
    target_qps: 100000
    target_p95_latency_ms: 50
    target_storage_tb: 500

  # Benchmark data (per node)
  node_capacity:
    qps: 5000
    p95_latency_ms: 20
    storage_tb: 10

  # Calculated cluster size
  calculated:
    min_nodes_for_qps: 20       # 100000 / 5000
    min_nodes_for_storage: 50   # 500 / 10
    recommended_nodes: 60       # max(20, 50) * 1.2 safety factor
```
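The sizing arithmetic above is: compute the node count demanded by each dimension, take the larger, then apply the safety factor. As a sketch (hypothetical `cluster_size` helper):

```python
# Benchmark-based cluster sizing (sketch).
import math

def cluster_size(target_qps, target_tb, qps_per_node, tb_per_node, safety=1.2):
    """Size for the more demanding of QPS or storage, plus a safety factor."""
    nodes_for_qps = math.ceil(target_qps / qps_per_node)
    nodes_for_storage = math.ceil(target_tb / tb_per_node)
    return math.ceil(max(nodes_for_qps, nodes_for_storage) * safety)

# 100k QPS at 5k QPS/node, 500 TB at 10 TB/node, 1.2x safety factor:
assert cluster_size(100_000, 500, 5_000, 10) == 60
```

Sizing on the max of the two dimensions (rather than the sum) assumes the same nodes serve both queries and storage; a disaggregated architecture would size each layer separately.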
Performance Optimization for Scale
Distributed Query Optimization
Optimize queries for distributed execution:
```
-- Poor: Requires full cluster scan
MATCH (u:User)
WHERE u.age > 25
RETURN u.name
LIMIT 100;

-- Better: Partition-aware with index
MATCH (u:User)
WHERE u.country = 'USA' AND u.age > 25
RETURN u.name
LIMIT 100;

-- Best: Localized to single partition
MATCH (u:User {id: $user_id})-[:FRIENDS_WITH]->(f:User)
RETURN f.name
LIMIT 100;
```
Batch Processing
Process multiple operations efficiently:
```yaml
batch_processing:
  # Batch writes
  write_batching:
    enabled: true
    batch_size: 1000
    max_delay_ms: 100

  # Batch reads
  read_batching:
    enabled: true
    batch_size: 500
    max_delay_ms: 50

  # Parallel batch execution
  parallelism:
    max_concurrent_batches: 16
    workers_per_batch: 4
```
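Size-or-delay batching as configured above means: flush when the buffer reaches `batch_size` or when `max_delay_ms` has passed since the first buffered write, whichever comes first. A sketch (the `WriteBatcher` class is hypothetical; a real implementation would also flush on a background timer rather than only checking at write time):

```python
# Size-or-delay write batching (sketch).
import time

class WriteBatcher:
    def __init__(self, batch_size, max_delay_s, flush):
        self.batch_size = batch_size
        self.max_delay_s = max_delay_s
        self.flush = flush              # callback receiving a list of ops
        self.buffer = []
        self.first_write_at = None

    def add(self, op, now=None):
        now = time.monotonic() if now is None else now
        if not self.buffer:
            self.first_write_at = now
        self.buffer.append(op)
        # Flush on whichever limit is hit first: size or elapsed delay.
        if (len(self.buffer) >= self.batch_size
                or now - self.first_write_at >= self.max_delay_s):
            batch, self.buffer = self.buffer, []
            self.flush(batch)

flushed = []
b = WriteBatcher(batch_size=3, max_delay_s=0.1, flush=flushed.append)
b.add("w1", now=0.00)
b.add("w2", now=0.01)
b.add("w3", now=0.02)   # hits batch_size: all three flush as one batch
assert flushed == [["w1", "w2", "w3"]]
```

The `max_delay_ms` bound caps the latency cost of batching: a lone write still commits within one delay window instead of waiting for a full batch.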
Caching Strategy
Implement multi-tier caching:
```yaml
caching:
  # L1: Local node cache
  l1_cache:
    enabled: true
    size_mb: 4096
    eviction: "lru"
    ttl_seconds: 300

  # L2: Distributed cache
  l2_cache:
    enabled: true
    type: "redis_cluster"
    nodes: 10
    size_per_node_gb: 64
    ttl_seconds: 3600

  # Query result cache
  result_cache:
    enabled: true
    size_mb: 2048
    cache_key: "query_hash"
    ttl_seconds: 600
```
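The L1 cache above combines LRU eviction with a TTL. A minimal sketch of that policy (illustrative `TTLLRUCache`, not Geode's implementation; it bounds entry count rather than bytes):

```python
# LRU cache with TTL expiry (sketch).
import time
from collections import OrderedDict

class TTLLRUCache:
    def __init__(self, max_items, ttl_s):
        self.max_items, self.ttl_s = max_items, ttl_s
        self.data = OrderedDict()   # key -> (value, inserted_at)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        if key not in self.data:
            return None
        value, inserted_at = self.data[key]
        if now - inserted_at > self.ttl_s:
            del self.data[key]       # expired: drop and miss
            return None
        self.data.move_to_end(key)   # mark as recently used
        return value

    def put(self, key, value, now=None):
        now = time.monotonic() if now is None else now
        self.data[key] = (value, now)
        self.data.move_to_end(key)
        if len(self.data) > self.max_items:
            self.data.popitem(last=False)   # evict least recently used

cache = TTLLRUCache(max_items=2, ttl_s=300)
cache.put("a", 1, now=0)
cache.put("b", 2, now=1)
cache.put("c", 3, now=2)                 # evicts "a" (least recently used)
assert cache.get("a", now=3) is None
assert cache.get("b", now=400) is None   # expired: past the 300 s TTL
assert cache.get("c", now=4) == 3
```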
Monitoring Scalability
Scalability Metrics
Track metrics that indicate scaling needs:
```yaml
monitoring:
  scalability_metrics:
    # Resource utilization
    - name: "node_cpu_utilization"
      threshold_warning: 70
      threshold_critical: 85
    - name: "node_memory_utilization"
      threshold_warning: 75
      threshold_critical: 90
    - name: "storage_utilization"
      threshold_warning: 70
      threshold_critical: 85

    # Performance metrics
    - name: "query_latency_p95_ms"
      threshold_warning: 100
      threshold_critical: 500
    - name: "queries_per_second"
      type: "gauge"

    # Scalability indicators
    - name: "cluster_efficiency"
      formula: "actual_qps / (node_count * ideal_qps_per_node)"
      threshold_warning: 0.7
    - name: "data_balance_coefficient"
      formula: "stddev(partition_sizes) / mean(partition_sizes)"
      threshold_warning: 0.3
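The `data_balance_coefficient` formula above is the coefficient of variation of partition sizes: 0 means perfectly balanced, and values above the 0.3 warning threshold indicate skew. Computed directly:

```python
# Coefficient of variation of partition sizes (the formula above).
from statistics import mean, pstdev

def balance_coefficient(partition_sizes):
    return pstdev(partition_sizes) / mean(partition_sizes)

assert balance_coefficient([100, 100, 100, 100]) == 0.0   # perfectly balanced
assert balance_coefficient([50, 100, 150, 100]) > 0.3     # skewed: would warn
```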
Capacity Alerts
Configure alerts for scaling decisions:
```yaml
alerts:
  - name: "NeedScaleOut"
    condition: "cpu_utilization > 75 AND qps > 80% of capacity"
    action: "trigger_auto_scale_out"
  - name: "NeedScaleIn"
    condition: "cpu_utilization < 25 AND qps < 40% of capacity FOR 30m"
    action: "trigger_auto_scale_in"
  - name: "StorageCapacity"
    condition: "storage_utilization > 70%"
    action: "add_storage_nodes"
  - name: "PerformanceDegradation"
    condition: "p95_latency > 2x baseline FOR 10m"
    action: "investigate_and_scale"
```
Best Practices
Scaling Strategy
- Start Small, Scale Gradually: Begin with minimal cluster size and add nodes based on actual demand
- Monitor Continuously: Track utilization, latency, and throughput to identify scaling triggers
- Plan for Peaks: Size the cluster for peak load plus 30% headroom
- Test at Scale: Conduct load testing at target scale before production
- Automate Scaling: Implement auto-scaling policies to respond to demand changes
Data Distribution
- Use consistent hashing for uniform distribution
- Implement partition rebalancing during low-traffic periods
- Monitor partition imbalance and rebalance at 20% variance
- Co-locate frequently accessed data
- Use read replicas for read-heavy workloads
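Consistent hashing, recommended above, limits how much data moves when the cluster grows: adding a node remaps only the keys on the ring segments it takes over, instead of the near-total reshuffle a naive `hash mod N` scheme causes. A sketch with virtual nodes (illustrative `HashRing`, not Geode's implementation):

```python
# Consistent-hash ring with virtual nodes (sketch).
import bisect
import hashlib

class HashRing:
    def __init__(self, nodes, vnodes=100):
        # Each physical node owns many ring points to smooth the key split.
        self.ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self.points = [point for point, _ in self.ring]

    @staticmethod
    def _hash(key):
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def node_for(self, key):
        # The owner is the first ring point clockwise from the key's hash.
        idx = bisect.bisect(self.points, self._hash(key)) % len(self.ring)
        return self.ring[idx][1]

before = HashRing(["node1", "node2", "node3"])
after = HashRing(["node1", "node2", "node3", "node4"])
# Count how many of 1000 keys change owner when node4 joins.
moved = sum(before.node_for(f"k{i}") != after.node_for(f"k{i}")
            for i in range(1000))
assert moved < 500  # roughly a quarter move; mod-N would remap most keys
```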
Operational Guidelines
- Document scaling procedures and runbooks
- Implement gradual rollout for cluster changes
- Test failover at scale regularly
- Maintain node homogeneity for predictable performance
- Use infrastructure-as-code for reproducible deployments
Troubleshooting
Common Scalability Issues
Linear Scaling Failure: Adding nodes doesn’t improve performance proportionally.
Solution: Identify bottlenecks (network, coordination overhead, hot partitions), optimize cross-partition queries, rebalance data.
Resource Imbalance: Some nodes overloaded while others idle.
Solution: Rebalance partitions, adjust load balancing weights, identify and redistribute hot data.
Coordination Overhead: Too much inter-node communication.
Solution: Optimize partitioning strategy, increase partition locality, cache cross-partition data.
Network Saturation: Network becomes bottleneck.
Solution: Upgrade network infrastructure, optimize data serialization, reduce replication traffic, compress inter-node communication.
Performance Analysis
```bash
# Analyze cluster balance
geode cluster analyze-balance \
  --show-distribution \
  --show-hotspots

# Measure scaling efficiency
geode cluster scaling-efficiency \
  --baseline-nodes=10 \
  --current-nodes=50

# Identify bottlenecks
geode cluster bottleneck-analysis \
  --duration=1h \
  --report=bottlenecks.json

# Simulate scaling
geode cluster simulate-scaling \
  --target-nodes=100 \
  --workload=production_sample.log
```
Related Topics
- Clustering - Database clustering strategies
- Sharding - Data sharding and distribution
- Replication - Data replication for availability
- Performance - Query performance optimization
- Distributed Systems - Distributed systems architecture
Further Reading
- Performance and Scaling - Scalability architecture guide
- Distributed Architecture - Distributed systems overview
- Server Configuration - Configuration options
- Production Deployment - Production best practices
- Query Performance Tuning - Optimization techniques