Implementing robust automated backup strategies is critical for production Geode deployments. This guide covers automated backup configuration, scheduling, testing, and disaster recovery procedures.
Overview
Geode supports multiple backup strategies:
- Full Backups: Complete database snapshot
- Incremental Backups: Changes since last full backup
- Point-in-Time Recovery (PITR): Restore to specific timestamp
- Cloud Storage: S3-compatible object storage integration
- Automated Scheduling: Cron-based backup automation
Recovery Time Objective (RTO): < 5 minutes for full restore
Recovery Point Objective (RPO): < 15 minutes with incremental backups
S3 Cloud Backup Configuration
Prerequisites
S3-Compatible Storage:
- Amazon S3
- DigitalOcean Spaces
- MinIO
- Wasabi
- Backblaze B2
Required Credentials:
- Access Key ID
- Secret Access Key
- Region (for AWS)
- Endpoint URL (for non-AWS providers)
Environment Setup
# AWS S3
export AWS_ACCESS_KEY_ID='AKIAIOSFODNN7EXAMPLE'
export AWS_SECRET_ACCESS_KEY='wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY'
export AWS_REGION='us-east-1'
# DigitalOcean Spaces
export AWS_ACCESS_KEY_ID='DO_SPACES_KEY'
export AWS_SECRET_ACCESS_KEY='DO_SPACES_SECRET'
export AWS_ENDPOINT_URL='https://nyc3.digitaloceanspaces.com'
export AWS_REGION='us-east-1' # Required but not used by DO
# MinIO
export AWS_ACCESS_KEY_ID='minio_access_key'
export AWS_SECRET_ACCESS_KEY='minio_secret_key'
export AWS_ENDPOINT_URL='https://minio.example.com:9000'
export AWS_REGION='us-east-1'
Server Configuration
# geode.yaml
backup:
  s3:
    enabled: true
    bucket: 'geode-production-backups'
    prefix: 'prod' # Optional: organize by environment
    region: 'us-east-1'
    endpoint: '' # Leave empty for AWS S3
    access_key_id: '${AWS_ACCESS_KEY_ID}'
    secret_access_key: '${AWS_SECRET_ACCESS_KEY}'
    compression: true # gzip compression
    encryption: true # Server-side encryption
    retention_days: 90 # Auto-delete old backups
  schedule:
    enabled: true
    full_backup: '0 2 * * 0' # Weekly on Sunday at 2 AM
    incremental_backup: '0 2 * * 1-6' # Daily at 2 AM, Monday-Saturday
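Because the config interpolates credentials from the environment, a missing variable surfaces only when the first backup fails. A small preflight check catches this earlier; this helper is illustrative and not part of Geode itself, and the variable names simply follow the config above:

```shell
# Preflight check: report any credential variable the backup config
# interpolates that is missing or empty.
check_backup_env() {
    missing=0
    for var in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_REGION; do
        # Indirect expansion: read the value of the variable named in $var
        eval "val=\${$var:-}"
        if [ -z "$val" ]; then
            echo "MISSING: $var" >&2
            missing=1
        fi
    done
    if [ "$missing" -eq 0 ]; then
        echo "OK: all backup credentials present"
    fi
    return "$missing"
}
```

Running this at the top of a backup script (and before `geode serve` starts with scheduled backups enabled) turns a silent misconfiguration into an immediate, readable failure.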
Manual Backup Operations
Full Backup
# Create full backup
geode backup \
--dest s3://geode-backups/production \
--mode full \
--compression gzip
# Output:
# Backup started: backup-20260123-020000
# Backup ID: 1738012345
# Compressing data...
# Uploading to S3: s3://geode-backups/production/backup-1738012345.tar.gz
# Backup completed successfully
# Size: 2.3 GB (compressed from 5.1 GB)
# Duration: 45s
Incremental Backup
# Create incremental backup (delta since last full)
geode backup \
--dest s3://geode-backups/production \
--mode incremental \
--parent 1738012345
# Output:
# Incremental backup started
# Parent backup: 1738012345 (2026-01-23 02:00:00)
# Backup ID: 1738098745
# Changes: 156 MB
# Duration: 8s
List Backups
# List all backups
geode backup \
--dest s3://geode-backups/production \
--list
# Output:
# Backup ID Type Size Timestamp Status
# 1738012345 full 2.3 GB 2026-01-23 02:00:00 complete
# 1738098745 incremental 156 MB 2026-01-24 02:00:00 complete
# 1738185145 incremental 89 MB 2026-01-25 02:00:00 complete
# 1738271545 incremental 134 MB 2026-01-26 02:00:00 complete
Verify Backup Integrity
# Verify backup without restoring
geode backup \
--dest s3://geode-backups/production \
--verify \
--backup-id 1738012345
# Output:
# Verifying backup 1738012345...
# Downloading metadata...
# Checking file integrity (SHA256)...
# ✓ data/nodes.db: OK
# ✓ data/edges.db: OK
# ✓ data/indexes/: OK
# ✓ wal/: OK
# Backup integrity: VALID
Automated Backup Scripts
Backup Script
#!/bin/bash
# /usr/local/bin/geode-backup.sh
set -euo pipefail
# Configuration
BUCKET="s3://geode-production-backups"
RETENTION_DAYS=90
LOG_FILE="/var/log/geode/backup.log"
ALERT_EMAIL="[email protected]"
# Logging function
log() {
echo "[$(date +'%Y-%m-%d %H:%M:%S')] $*" | tee -a "$LOG_FILE"
}
# Error handler
handle_error() {
log "ERROR: Backup failed at line $1"
echo "Geode backup failed. Check $LOG_FILE for details." | \
mail -s "Geode Backup Failure" "$ALERT_EMAIL"
exit 1
}
trap 'handle_error $LINENO' ERR
log "Starting backup"
# Determine backup type (full on Sunday, incremental otherwise)
DOW=$(date +%u)
if [ "$DOW" -eq 7 ]; then
log "Performing full backup (Sunday)"
BACKUP_ID=$(geode backup \
--dest "$BUCKET" \
--mode full \
--compression gzip | \
grep "Backup ID" | \
awk '{print $3}')
echo "$BACKUP_ID" > /var/lib/geode/last-full-backup
log "Full backup completed: $BACKUP_ID"
else
    log "Performing incremental backup"
    # An incremental backup needs a parent full backup on record; fail
    # loudly (via the ERR trap) if none has been taken yet
    [ -f /var/lib/geode/last-full-backup ]
    PARENT=$(cat /var/lib/geode/last-full-backup)
BACKUP_ID=$(geode backup \
--dest "$BUCKET" \
--mode incremental \
--parent "$PARENT" \
--compression gzip | \
grep "Backup ID" | \
awk '{print $3}')
log "Incremental backup completed: $BACKUP_ID"
fi
# Verify backup integrity
log "Verifying backup integrity"
geode backup \
--dest "$BUCKET" \
--verify \
--backup-id "$BACKUP_ID" >> "$LOG_FILE" 2>&1
# Prune old backups
log "Pruning backups older than $RETENTION_DAYS days"
geode backup \
--dest "$BUCKET" \
--prune \
--older-than-days "$RETENTION_DAYS" >> "$LOG_FILE" 2>&1
log "Backup completed successfully"
# Send success notification
echo "Backup completed successfully. ID: $BACKUP_ID" | \
mail -s "Geode Backup Success" "$ALERT_EMAIL"
Crontab Configuration
# Install backup script
sudo cp geode-backup.sh /usr/local/bin/
sudo chmod +x /usr/local/bin/geode-backup.sh
# Add to crontab
sudo crontab -e
# Daily backups at 2 AM
0 2 * * * /usr/local/bin/geode-backup.sh >> /var/log/geode/backup-cron.log 2>&1
Systemd Timer (Alternative to Cron)
# /etc/systemd/system/geode-backup.service
[Unit]
Description=Geode Automated Backup
[Service]
Type=oneshot
User=geode
Group=geode
ExecStart=/usr/local/bin/geode-backup.sh
StandardOutput=append:/var/log/geode/backup.log
StandardError=append:/var/log/geode/backup.log
# /etc/systemd/system/geode-backup.timer
[Unit]
Description=Geode Backup Timer
[Timer]
# Daily at 2 AM
OnCalendar=*-*-* 02:00:00
Persistent=true
[Install]
WantedBy=timers.target
# Enable timer
sudo systemctl daemon-reload
sudo systemctl enable geode-backup.timer
sudo systemctl start geode-backup.timer
# Check status
sudo systemctl status geode-backup.timer
sudo systemctl list-timers geode-backup.timer
Restore Procedures
Full Restore
# Stop Geode server
sudo systemctl stop geode
# Backup current data (safety)
sudo mv /var/lib/geode/data /var/lib/geode/data.backup-$(date +%Y%m%d)
# Restore from backup
geode restore \
--source s3://geode-backups/production \
--backup-id 1738012345 \
--target /var/lib/geode/data
# Verify data integrity
geode verify --data-dir /var/lib/geode/data
# Start server
sudo systemctl start geode
# Verify server health
geode query "RETURN 1 AS health_check"
Point-in-Time Recovery (PITR)
# Restore to specific timestamp
geode restore \
--source s3://geode-backups/production \
--backup-id 1738012345 \
--target /var/lib/geode/data \
--pitr-timestamp "2026-01-23 10:30:00"
# Process:
# 1. Restore base backup (full backup 1738012345)
# 2. Apply WAL segments up to specified timestamp
# 3. Stop at exact recovery point
#
# Output:
# Restoring base backup...
# Applying WAL segments...
# - wal/segment-001.log (2026-01-23 02:05:00 - 03:00:00)
# - wal/segment-002.log (2026-01-23 03:00:00 - 04:00:00)
# ...
# - wal/segment-009.log (2026-01-23 10:00:00 - 10:35:12)
# Stopping at 2026-01-23 10:30:00
# Recovery complete
Automated Restore Script
#!/bin/bash
# /usr/local/bin/geode-restore.sh
set -euo pipefail
BUCKET="$1"
BACKUP_ID="$2"
TARGET="${3:-/var/lib/geode/data}"
PITR_TIMESTAMP="${4:-}"
echo "Stopping Geode server..."
sudo systemctl stop geode
echo "Creating safety backup of current data..."
if [ -d "$TARGET" ]; then
sudo mv "$TARGET" "${TARGET}.before-restore-$(date +%Y%m%d-%H%M%S)"
fi
echo "Restoring from backup $BACKUP_ID..."
if [ -n "$PITR_TIMESTAMP" ]; then
geode restore \
--source "$BUCKET" \
--backup-id "$BACKUP_ID" \
--target "$TARGET" \
--pitr-timestamp "$PITR_TIMESTAMP"
else
geode restore \
--source "$BUCKET" \
--backup-id "$BACKUP_ID" \
--target "$TARGET"
fi
echo "Verifying data integrity..."
geode verify --data-dir "$TARGET"
echo "Starting Geode server..."
sudo systemctl start geode
# Wait for server to start
sleep 5
echo "Verifying server health..."
geode query "RETURN 1 AS health_check"
echo "Restore completed successfully"
# Usage:
# ./geode-restore.sh s3://geode-backups/production 1738012345
# ./geode-restore.sh s3://geode-backups/production 1738012345 /var/lib/geode/data "2026-01-23 10:30:00"
Disaster Recovery Testing
Monthly DR Test
#!/bin/bash
# /usr/local/bin/geode-dr-test.sh
set -euo pipefail
BUCKET="s3://geode-backups/production"
TEST_DIR="/tmp/geode-dr-test-$(date +%Y%m%d)"
REPORT_FILE="/var/log/geode/dr-test-$(date +%Y%m%d).log"
log() {
echo "[$(date +'%Y-%m-%d %H:%M:%S')] $*" | tee -a "$REPORT_FILE"
}
log "=== Disaster Recovery Test Started ==="
# Get most recent full backup (the listing is oldest-first)
LATEST_BACKUP=$(geode backup --dest "$BUCKET" --list | \
    grep "full" | \
    tail -1 | \
    awk '{print $1}')
log "Testing restore of backup: $LATEST_BACKUP"
# Create test directory
mkdir -p "$TEST_DIR"
# Restore to test directory
log "Restoring backup..."
START_TIME=$(date +%s)
geode restore \
--source "$BUCKET" \
--backup-id "$LATEST_BACKUP" \
--target "$TEST_DIR" >> "$REPORT_FILE" 2>&1
END_TIME=$(date +%s)
RESTORE_DURATION=$((END_TIME - START_TIME))
log "Restore completed in ${RESTORE_DURATION}s (RTO target: 300s)"
# Verify data integrity
log "Verifying data integrity..."
geode verify --data-dir "$TEST_DIR" >> "$REPORT_FILE" 2>&1
# Start test server
log "Starting test server..."
geode serve \
--data-dir "$TEST_DIR" \
--listen 127.0.0.1:3142 &
SERVER_PID=$!
sleep 5
# Run validation queries
log "Running validation queries..."
QUERY_COUNT=$(geode query "MATCH (n) RETURN count(n)" --server 127.0.0.1:3142 | \
jq -r '.result.rows[0].count')
log "Node count: $QUERY_COUNT"
# Stop test server
kill $SERVER_PID
# Cleanup
rm -rf "$TEST_DIR"
# Generate report
log "=== Disaster Recovery Test Summary ==="
log "Backup ID: $LATEST_BACKUP"
log "Restore Duration: ${RESTORE_DURATION}s (RTO: 300s)"
log "RTO Status: $([ $RESTORE_DURATION -lt 300 ] && echo 'PASS' || echo 'FAIL')"
log "Data Integrity: VERIFIED"
log "Node Count: $QUERY_COUNT"
log "Test Status: SUCCESS"
# Schedule monthly
# 0 3 1 * * /usr/local/bin/geode-dr-test.sh
Monitoring and Alerting
Backup Monitoring Script
#!/bin/bash
# /usr/local/bin/geode-backup-monitor.sh
BUCKET="s3://geode-backups/production"
ALERT_EMAIL="[email protected]"
MAX_AGE_HOURS=26 # Alert if no backup in 26 hours
# Get latest backup timestamp (listing is oldest-first; the size column
# spans two fields, so the timestamp is fields 5 and 6)
LATEST=$(geode backup --dest "$BUCKET" --list | \
    tail -1 | \
    awk '{print $5" "$6}')
LATEST_EPOCH=$(date -d "$LATEST" +%s)
NOW_EPOCH=$(date +%s)
AGE_HOURS=$(( (NOW_EPOCH - LATEST_EPOCH) / 3600 ))
if [ $AGE_HOURS -gt $MAX_AGE_HOURS ]; then
echo "WARNING: Last backup is ${AGE_HOURS} hours old (max: ${MAX_AGE_HOURS})" | \
mail -s "Geode Backup Alert" "$ALERT_EMAIL"
exit 1
fi
echo "OK: Last backup is ${AGE_HOURS} hours old"
exit 0
# Run hourly
# 0 * * * * /usr/local/bin/geode-backup-monitor.sh
Prometheus Metrics
# prometheus.yml
scrape_configs:
  - job_name: 'geode-backups'
    static_configs:
      - targets: ['localhost:9090']
    metrics_path: '/metrics'
# Exposed metrics:
# geode_backup_last_success_timestamp
# geode_backup_duration_seconds
# geode_backup_size_bytes
# geode_backup_age_hours
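If a deployment cannot scrape these metrics directly, the backup script itself can publish them through node_exporter's textfile collector. This is a sketch under two assumptions not stated elsewhere in this guide: that node_exporter runs with `--collector.textfile.directory`, and that the backup script is extended to call this helper. The metric names match the list above:

```shell
# Publish backup metrics for node_exporter's textfile collector.
# Writes to a temp file first, then renames atomically, so a scrape
# never observes a half-written file.
publish_backup_metrics() {
    dir="$1"          # textfile collector directory, e.g. /var/lib/node_exporter/textfile
    duration="$2"     # backup duration in seconds
    size_bytes="$3"   # backup size in bytes
    now=$(date +%s)
    tmp="$dir/geode_backup.prom.$$"
    {
        echo "geode_backup_last_success_timestamp $now"
        echo "geode_backup_duration_seconds $duration"
        echo "geode_backup_size_bytes $size_bytes"
    } > "$tmp"
    mv "$tmp" "$dir/geode_backup.prom"
}
```

Calling `publish_backup_metrics /var/lib/node_exporter/textfile "$RESTORE_DURATION" "$SIZE"` at the end of a successful run keeps the `BackupTooOld` alert below meaningful even when the server's own `/metrics` endpoint is unavailable.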
Alert Rules
# alerts.yml
groups:
  - name: geode_backups
    rules:
      - alert: BackupTooOld
        expr: geode_backup_age_hours > 26
        for: 1h
        labels:
          severity: critical
        annotations:
          summary: "Geode backup is too old"
          description: "Last backup is {{ $value }} hours old (max: 26)"
      - alert: BackupFailed
        expr: increase(geode_backup_failures_total[1h]) > 0
        labels:
          severity: critical
        annotations:
          summary: "Geode backup failed"
          description: "{{ $value }} backup failures in last hour"
      - alert: BackupSlowDuration
        expr: geode_backup_duration_seconds > 600
        labels:
          severity: warning
        annotations:
          summary: "Geode backup duration exceeded threshold"
          description: "Backup took {{ $value }}s (max: 600s)"
Best Practices
Backup Strategy
3-2-1 Rule:
- 3 copies of data (production + 2 backups)
- 2 different storage types (local + cloud)
- 1 offsite copy (different region/provider)
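The mirroring step behind the 3-2-1 rule is simple: copy the finished artifact to a second location and verify it byte-for-byte before counting it as an extra copy. The sketch below uses local directories as the stand-in target; in production the target would be a second bucket in another region (e.g. synced with `aws s3 sync` between buckets):

```shell
# Mirror a finished backup artifact to a second location and verify it
# byte-for-byte before trusting it as an additional copy.
mirror_backup() {
    src="$1"
    mirror="$2"
    mkdir -p "$mirror"
    cp "$src" "$mirror/"
    # cmp -s returns non-zero if the copy differs from the original
    cmp -s "$src" "$mirror/$(basename "$src")"
}
```

The verification matters more than the copy: an unverified mirror silently reduces three copies back to two.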
Backup Schedule:
- Full backup: Weekly (Sunday 2 AM)
- Incremental backups: Daily (Monday-Saturday 2 AM)
- WAL archival: Continuous (every 5 minutes)
- Retention: 90 days
Testing Requirements
Monthly DR Tests:
- Full restore to test environment
- Verify data integrity
- Measure RTO (should be < 5 minutes)
- Test PITR accuracy
Quarterly Full DR Drills:
- Complete failover simulation
- Document recovery procedures
- Update runbooks
- Train operations team
Security
Encryption:
backup:
s3:
encryption: true # Server-side encryption (AES-256)
sse_kms_key_id: 'arn:aws:kms:us-east-1:123456789:key/...' # Optional KMS
Access Control:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {"AWS": "arn:aws:iam::123456789:user/geode-backup"},
"Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
"Resource": [
"arn:aws:s3:::geode-backups/*",
"arn:aws:s3:::geode-backups"
]
}
]
}
Performance Optimization
Compression:
- Use gzip for general workloads (good compression, fast)
- Use zstd for better compression (requires more CPU)
- Disable for pre-compressed data
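The right choice depends on the data, so it is worth measuring on a sample before committing. A rough comparison, assuming `gzip` is available and measuring `zstd` only if installed (substitute a slice of real backup data for a meaningful ratio):

```shell
# Compare compression choices on a sample file.
sample=$(mktemp)
head -c 1048576 /dev/zero > "$sample"   # 1 MiB of highly compressible data
orig=$(wc -c < "$sample")
gzip -c "$sample" > "$sample.gz"
gz=$(wc -c < "$sample.gz")
echo "original: $orig bytes, gzip: $gz bytes"
# zstd usually compresses better at similar or lower CPU cost, but may
# not be installed everywhere
if command -v zstd >/dev/null 2>&1; then
    zstd -q -c "$sample" > "$sample.zst"
    echo "zstd: $(wc -c < "$sample.zst") bytes"
fi
rm -f "$sample" "$sample.gz" "$sample.zst"
```

Run the same comparison with `time` in front of each compressor to weigh CPU cost against the ratio for your workload.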
Parallel Uploads:
# Configure parallel uploads
export AWS_MAX_CONCURRENT_REQUESTS=10
export AWS_MAX_BANDWIDTH=100MB/s
Incremental Strategy:
- Reduces backup time by 80-90%
- Lower network bandwidth usage
- Faster recovery for recent data
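The full-plus-delta idea behind incremental backups can be reproduced locally with GNU tar's snapshot files, which is a convenient way to sanity-check how small the deltas are. This is illustrative only; Geode's own incremental format is managed by `geode backup --mode incremental` and requires GNU tar here, not BSD tar:

```shell
# Full-plus-incremental sketch using GNU tar's --listed-incremental.
work=$(mktemp -d)
mkdir -p "$work/data"
head -c 200000 /dev/urandom > "$work/data/nodes.db"
head -c 200000 /dev/urandom > "$work/data/edges.db"

# Level 0 (full): archives everything and records file state in "snap"
tar --listed-incremental="$work/snap" -cf "$work/full.tar" -C "$work" data

# Change one file, then take a level 1 (incremental) against a copy of
# the level-0 snapshot; only the changed file is re-archived
head -c 5000 /dev/urandom >> "$work/data/edges.db"
cp "$work/snap" "$work/snap.1"
tar --listed-incremental="$work/snap.1" -cf "$work/incr.tar" -C "$work" data

full=$(wc -c < "$work/full.tar")
incr=$(wc -c < "$work/incr.tar")
echo "full: $full bytes, incremental: $incr bytes"
```

Note that tar's incrementals are file-granular (a one-byte change re-archives the whole file), whereas a database engine can ship block- or WAL-level deltas, which is where the 80-90% savings cited above come from.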
Related Documentation
- Migration Guide - Backup and restore procedures
- Server Configuration - Backup configuration options
- Troubleshooting Guide - Backup troubleshooting
- Disaster Recovery - Complete DR procedures
Summary
Automated backup implementation checklist:
✅ Configure S3 credentials - Set up cloud storage access
✅ Enable automated backups - Schedule full + incremental backups
✅ Test restore procedures - Verify backups work (monthly)
✅ Implement monitoring - Alert on backup failures
✅ Document procedures - Maintain DR runbooks
✅ Practice DR drills - Regular failover testing
Key Metrics:
- RTO: < 5 minutes (full restore)
- RPO: < 15 minutes (with incremental backups)
- Backup frequency: Daily
- Retention: 90 days
- Test frequency: Monthly
Automated backups ensure business continuity and data protection for production Geode deployments.