Fraud Detection | Tags | Geode Database

Fraud Detection with Geode

Fraud detection leverages graph relationships to uncover hidden connections between accounts, devices, transactions, and identities that traditional systems miss. Geode’s native graph processing excels at identifying fraud rings, money laundering networks, identity theft patterns, and coordinated attacks in real-time.

Why Graphs for Fraud Detection

Fraudsters operate through networks of relationships that are invisible in relational databases but obvious in graphs. A single suspicious account becomes a fraud ring when you discover it shares devices, IP addresses, payment methods, and phone numbers with dozens of other accounts.

Core Concepts

First-Party Fraud: Individuals committing fraud directly (identity theft, application fraud, synthetic identities).

Third-Party Fraud: Organized fraud rings coordinating attacks across multiple accounts and identities.

Fraud Rings: Networks of connected fraudulent accounts identified through shared attributes and behavioral patterns.

Velocity Checks: Detecting anomalous rates of activity (rapid account creation, transaction bursts, geographic impossibility).

Detecting Fraud Rings

Fraud rings are groups of accounts controlled by the same entity or organization, often sharing common attributes.

Shared Device Detection

-- Find accounts sharing devices (potential fraud ring)
MATCH (device:Device)<-[:USES]-(account1:Account),
      (device)<-[:USES]-(account2:Account)
WHERE account1 <> account2
  AND account1.created_date > CURRENT_DATE - INTERVAL '30' DAY
  AND account2.created_date > CURRENT_DATE - INTERVAL '30' DAY
WITH device,
     COLLECT(DISTINCT account1) + COLLECT(DISTINCT account2) AS accounts
WHERE SIZE(accounts) >= 5
RETURN device.fingerprint AS device_id,
       SIZE(accounts) AS account_count,
       [a IN accounts | a.email][..10] AS sample_accounts,
       MAX([a IN accounts | a.created_date]) AS most_recent_account
ORDER BY account_count DESC;

Multi-Hop Fraud Network

-- Expand fraud network through multiple relationship types
MATCH (seed:Account {flagged: true})
MATCH path = (seed)-[:USES|SHARES_IP|SAME_PHONE|PAYMENT_METHOD*1..3]-(connected:Account)
WHERE ALL(rel IN RELATIONSHIPS(path) WHERE TYPE(rel) IN ['USES', 'SHARES_IP', 'SAME_PHONE', 'PAYMENT_METHOD'])
WITH connected,
     COUNT(DISTINCT seed) AS flagged_connections,
     COLLECT(DISTINCT TYPE(RELATIONSHIPS(path)[0])) AS connection_types
WHERE flagged_connections >= 2
RETURN connected.id AS suspicious_account,
       connected.email AS email,
       flagged_connections AS risk_score,
       connection_types AS shared_attributes
ORDER BY risk_score DESC;

Synthetic Identity Detection

-- Detect synthetic identities (fabricated personas)
MATCH (account:Account)
WHERE account.created_date > CURRENT_DATE - INTERVAL '90' DAY
WITH account,
     COUNT { (account)-[:USES]->(device:Device) } AS device_count,
     COUNT { (account)-[:FROM_IP]->(ip:IPAddress) } AS ip_count,
     COUNT { (account)-[:HAS_PHONE]->(:Phone) } AS phone_count,
     COUNT { (account)-[:PAYMENT_METHOD]->(:CreditCard) } AS card_count
WHERE device_count > 10
   OR ip_count > 20
   OR phone_count > 3
   OR card_count > 5
MATCH (account)-[:SHARES_ATTRIBUTE]-(other:Account)
WHERE other.created_date > CURRENT_DATE - INTERVAL '90' DAY
WITH account,
     COUNT(DISTINCT other) AS linked_accounts,
     device_count,
     ip_count
WHERE linked_accounts > 15
RETURN account.id,
       account.email,
       linked_accounts,
       device_count,
       ip_count,
       (linked_accounts * 10 + device_count * 5 + ip_count * 2) AS synthetic_score
ORDER BY synthetic_score DESC;

Transaction Pattern Analysis

Rapid Fire Transactions

-- Detect suspiciously rapid transaction sequences
MATCH (account:Account)-[t:TRANSACTION]->(merchant:Merchant)
WHERE t.timestamp > CURRENT_TIMESTAMP - INTERVAL '1' HOUR
WITH account,
     COUNT(t) AS transaction_count,
     SUM(t.amount) AS total_amount,
     COLLECT(t.timestamp) AS timestamps
WHERE transaction_count > 20
WITH account,
     transaction_count,
     total_amount,
     timestamps,
     [i IN RANGE(1, SIZE(timestamps) - 1) |
      DURATION.BETWEEN(timestamps[i-1], timestamps[i]).seconds] AS intervals
WHERE ANY(interval IN intervals WHERE interval < 5)
RETURN account.id,
       transaction_count,
       total_amount,
       MIN(intervals) AS min_seconds_between,
       'RAPID_FIRE' AS fraud_type;

Geographic Impossibility

-- Detect transactions from impossible geographic locations
MATCH (account:Account)-[t1:TRANSACTION]->(m1:Merchant),
      (account)-[t2:TRANSACTION]->(m2:Merchant)
WHERE t2.timestamp > t1.timestamp
  AND t2.timestamp < t1.timestamp + INTERVAL '2' HOUR
  AND m1.location <> m2.location
WITH account,
     t1,
     t2,
     m1.location AS loc1,
     m2.location AS loc2,
     DURATION.BETWEEN(t1.timestamp, t2.timestamp).minutes AS time_diff,
     DISTANCE(m1.coordinates, m2.coordinates) AS distance_km
WHERE distance_km > 500
  AND distance_km / time_diff > 200  -- Speed > 200 km/h
RETURN account.id,
       t1.timestamp AS first_transaction,
       t2.timestamp AS second_transaction,
       loc1,
       loc2,
       distance_km,
       distance_km / time_diff AS implied_speed,
       'GEOGRAPHIC_IMPOSSIBILITY' AS fraud_indicator;

Money Laundering Patterns

-- Detect layering patterns (rapid movement through accounts)
MATCH path = (origin:Account)-[:TRANSFER*3..6]->(destination:Account)
WHERE origin <> destination
  AND ALL(rel IN RELATIONSHIPS(path)
          WHERE rel.timestamp > CURRENT_TIMESTAMP - INTERVAL '24' HOUR)
WITH origin,
     destination,
     path,
     [rel IN RELATIONSHIPS(path) | rel.amount] AS amounts,
     LENGTH(path) AS hops
WHERE ALL(amount IN amounts WHERE amount > 1000)
  AND hops >= 3
  AND REDUCE(total = 0, amount IN amounts | total + amount) > 10000
RETURN origin.id AS source_account,
       destination.id AS final_account,
       hops,
       REDUCE(total = 0, amount IN amounts | total + amount) AS total_transferred,
       [n IN NODES(path)[1..-1] | n.id] AS intermediary_accounts,
       'LAYERING' AS pattern_type
ORDER BY total_transferred DESC;

Real-Time Fraud Scoring

Composite Risk Score

-- Calculate real-time fraud risk score
MATCH (account:Account {id: $account_id})

// Device sharing risk
WITH account,
     COUNT { (account)-[:USES]->(d:Device)<-[:USES]-(:Account {flagged: true}) } AS flagged_device_connections

// IP sharing risk
WITH account, flagged_device_connections,
     COUNT { (account)-[:FROM_IP]->(ip:IPAddress)<-[:FROM_IP]-(:Account {flagged: true}) } AS flagged_ip_connections

// Velocity risk
WITH account, flagged_device_connections, flagged_ip_connections,
     COUNT {
         (account)-[t:TRANSACTION]->()
         WHERE t.timestamp > CURRENT_TIMESTAMP - INTERVAL '1' HOUR
     } AS recent_transactions

// Account age risk
WITH account, flagged_device_connections, flagged_ip_connections, recent_transactions,
     DURATION.BETWEEN(account.created_date, CURRENT_DATE).days AS account_age_days

// Network size risk
WITH account, flagged_device_connections, flagged_ip_connections,
     recent_transactions, account_age_days,
     COUNT { (account)-[:SHARES_ATTRIBUTE]-() } AS network_size

RETURN account.id,
       account.email,
       flagged_device_connections * 20 AS device_risk,
       flagged_ip_connections * 15 AS ip_risk,
       CASE
           WHEN recent_transactions > 50 THEN 50
           WHEN recent_transactions > 20 THEN 30
           WHEN recent_transactions > 10 THEN 15
           ELSE 0
       END AS velocity_risk,
       CASE
           WHEN account_age_days < 7 THEN 25
           WHEN account_age_days < 30 THEN 15
           WHEN account_age_days < 90 THEN 5
           ELSE 0
       END AS new_account_risk,
       CASE
           WHEN network_size > 100 THEN 30
           WHEN network_size > 50 THEN 20
           WHEN network_size > 20 THEN 10
           ELSE 0
       END AS network_risk,
       (flagged_device_connections * 20 +
        flagged_ip_connections * 15 +
        velocity_risk +
        new_account_risk +
        network_risk) AS total_fraud_score;

Behavioral Anomaly Detection

-- Detect deviations from normal behavior patterns
MATCH (account:Account {id: $account_id})

// Calculate historical baseline
WITH account,
     AVG([t IN [(account)-[trans:TRANSACTION]->() WHERE
                trans.timestamp > CURRENT_TIMESTAMP - INTERVAL '90' DAY
                AND trans.timestamp < CURRENT_TIMESTAMP - INTERVAL '7' DAY
                | trans.amount]]) AS avg_historical_amount,
     STDEV([t IN [(account)-[trans:TRANSACTION]->() WHERE
                  trans.timestamp > CURRENT_TIMESTAMP - INTERVAL '90' DAY
                  AND trans.timestamp < CURRENT_TIMESTAMP - INTERVAL '7' DAY
                  | trans.amount]]) AS stddev_historical_amount

// Compare recent transactions
MATCH (account)-[recent:TRANSACTION]->()
WHERE recent.timestamp > CURRENT_TIMESTAMP - INTERVAL '1' DAY
WITH account,
     avg_historical_amount,
     stddev_historical_amount,
     COLLECT(recent.amount) AS recent_amounts
WHERE SIZE(recent_amounts) > 0

WITH account,
     avg_historical_amount,
     stddev_historical_amount,
     recent_amounts,
     [amount IN recent_amounts WHERE
      ABS(amount - avg_historical_amount) > 3 * stddev_historical_amount
     ] AS anomalous_amounts

WHERE SIZE(anomalous_amounts) > 0

RETURN account.id,
       avg_historical_amount,
       anomalous_amounts,
       SIZE(anomalous_amounts) AS anomaly_count,
       'BEHAVIORAL_ANOMALY' AS alert_type;

Application Fraud Prevention

Account Takeover Detection

-- Detect potential account takeovers
MATCH (account:Account)-[login:LOGIN]->(device:Device)
WHERE login.timestamp > CURRENT_TIMESTAMP - INTERVAL '7' DAY
WITH account,
     COLLECT(DISTINCT device.fingerprint) AS recent_devices,
     COUNT { (account)-[:LOGIN {successful: true}]->() WHERE
             login.timestamp > CURRENT_TIMESTAMP - INTERVAL '90' DAY
             AND login.timestamp < CURRENT_TIMESTAMP - INTERVAL '7' DAY
     } AS historical_logins,
     COUNT { (account)-[:LOGIN {successful: true}]->() WHERE
             login.timestamp > CURRENT_TIMESTAMP - INTERVAL '7' DAY
     } AS recent_logins

// Check for new devices with high activity
WITH account, recent_devices, historical_logins, recent_logins
WHERE recent_logins > historical_logins * 3
   OR SIZE(recent_devices) > 5

MATCH (account)-[:LOGIN]->(new_device:Device)
WHERE NOT EXISTS {
    (account)-[:LOGIN]->new_device)
    WHERE login.timestamp < CURRENT_TIMESTAMP - INTERVAL '7' DAY
}
WITH account,
     COUNT(DISTINCT new_device) AS new_device_count,
     recent_logins

WHERE new_device_count >= 3

RETURN account.id,
       account.email,
       new_device_count,
       recent_logins,
       'ACCOUNT_TAKEOVER_SUSPECTED' AS alert;

Duplicate Account Detection

-- Find duplicate or related accounts
MATCH (a1:Account), (a2:Account)
WHERE ID(a1) < ID(a2)
  AND (
      a1.email = a2.email
      OR a1.phone = a2.phone
      OR LEVENSHTEIN(a1.name, a2.name) < 3
      OR a1.ssn_last4 = a2.ssn_last4
  )
WITH a1, a2,
     COUNT { (a1)-[:USES]->(:Device)<-[:USES]-(a2) } AS shared_devices,
     COUNT { (a1)-[:FROM_IP]->(:IPAddress)<-[:FROM_IP]-(a2) } AS shared_ips

WHERE shared_devices > 0 OR shared_ips > 0

RETURN a1.id AS account1,
       a2.id AS account2,
       a1.email AS email1,
       a2.email AS email2,
       shared_devices,
       shared_ips,
       'DUPLICATE_ACCOUNT' AS fraud_type;

Best Practices

Graph Model Design

Entity Resolution: Create nodes for shared attributes (devices, IPs, phones, payment methods) to make connections explicit and queryable.

Temporal Properties: Store timestamps on all relationships to enable time-based analysis and velocity checks.

Fraud Flags: Add boolean flags and timestamps when accounts/transactions are flagged for efficient filtering in subsequent queries.

Attribute Hashing: Hash sensitive identifiers (SSN, credit cards) while preserving ability to detect matches.

Performance Optimization

-- Index high-cardinality lookup fields
CREATE INDEX account_email_idx FOR (a:Account) ON (a.email);
CREATE INDEX device_fingerprint_idx FOR (d:Device) ON (d.fingerprint);
CREATE INDEX transaction_timestamp_idx FOR ()-[t:TRANSACTION]-() ON (t.timestamp);

-- Use query hints for large fraud ring detection
MATCH (device:Device)<-[:USES]-(account:Account)
USING INDEX account:Account(created_date)
WHERE account.created_date > CURRENT_DATE - INTERVAL '30' DAY
WITH device, COLLECT(account) AS accounts
WHERE SIZE(accounts) >= 5
RETURN device, accounts;

Real-Time Detection

Stream Processing: Process transactions as they occur, computing fraud scores before authorization.

Incremental Updates: Update fraud scores incrementally rather than recomputing entire graphs.

Caching: Cache frequently accessed fraud indicators (device reputations, IP risk scores).

Threshold Tuning: Adjust fraud score thresholds based on false positive rates and business impact.

Integration Examples

Python Client - Real-Time Scoring

from geode_client import Client

async def check_transaction_fraud(client, account_id, amount, device_id, ip_address):
    result, _ = await client.query("""
        MATCH (account:Account {id: $account_id})
        OPTIONAL MATCH (account)-[:USES]->(device:Device {id: $device_id})
        OPTIONAL MATCH (account)-[:FROM_IP]->(ip:IPAddress {address: $ip_address})

        WITH account,
             device,
             ip,
             COUNT { (device)<-[:USES]-(:Account {flagged: true}) } AS device_risk,
             COUNT { (ip)<-[:FROM_IP]-(:Account {flagged: true}) } AS ip_risk,
             COUNT {
                 (account)-[t:TRANSACTION]->()
                 WHERE t.timestamp > CURRENT_TIMESTAMP - INTERVAL '1' HOUR
             } AS recent_txn_count

        RETURN device_risk * 20 +
               ip_risk * 15 +
               CASE WHEN recent_txn_count > 10 THEN 30 ELSE 0 END +
               CASE WHEN $amount > account.avg_transaction * 5 THEN 25 ELSE 0 END
               AS fraud_score
    """, {
        'account_id': account_id,
        'device_id': device_id,
        'ip_address': ip_address,
        'amount': amount
    })

    fraud_score = result.rows[0][0]

    if fraud_score > 70:
        return {'decision': 'BLOCK', 'score': fraud_score}
    elif fraud_score > 40:
        return {'decision': 'REVIEW', 'score': fraud_score}
    else:
        return {'decision': 'APPROVE', 'score': fraud_score}

Rust Client - Fraud Ring Detection

use geode_client::Client;

async fn detect_fraud_ring(client: &Client, min_accounts: i32) -> Result<Vec<FraudRing>> {
    let results = client.execute(
        "MATCH (device:Device)<-[:USES]-(account:Account) \
         WHERE account.created_date > CURRENT_DATE - INTERVAL '30' DAY \
         WITH device, COLLECT(account) AS accounts \
         WHERE SIZE(accounts) >= $min_accounts \
         RETURN device.fingerprint AS device_id, \
                SIZE(accounts) AS account_count, \
                [a IN accounts | a.email] AS emails",
        &[("min_accounts", min_accounts.into())]
    ).await?;

    let mut fraud_rings = Vec::new();
    for row in results.rows() {
        fraud_rings.push(FraudRing {
            device_id: row.get_string(0)?,
            account_count: row.get_int(1)? as usize,
            emails: row.get_string_list(2)?,
        });
    }

    Ok(fraud_rings)
}

Anomaly Detection: See anomaly-detection tag for general anomaly detection patterns
Pattern Matching: Check patterns tag for complex graph pattern queries
Real-Time Analytics: Explore real-time-analytics for streaming fraud detection
Security: Review security tag for authentication and access control integration

Advanced Fraud Detection Patterns

Money Laundering Detection (Layering & Integration)

Detect complex money laundering schemes:

-- Detect smurfing (structuring to avoid reporting thresholds)
MATCH (a:Account)-[t:TRANSACTION]->()
WHERE t.timestamp > datetime().minusDays(7)
  AND t.amount BETWEEN 9000 AND 9999  // Just below $10k reporting threshold
WITH a,
     COUNT(t) AS near_threshold_count,
     SUM(t.amount) AS total_amount,
     COUNT(DISTINCT date.truncate('day', t.timestamp)) AS days_active
WHERE near_threshold_count >= 5
  AND total_amount > 40000
RETURN a.account_id,
       near_threshold_count,
       total_amount,
       days_active,
       'STRUCTURING' AS fraud_pattern;

-- Fan-out / Fan-in patterns (integration phase)
MATCH (origin:Account)-[t1:TRANSFER]->(layer1:Account)
      -[t2:TRANSFER]->(layer2:Account)
WHERE t1.timestamp > datetime().minusDays(3)
  AND t2.timestamp > t1.timestamp
  AND t2.timestamp < t1.timestamp + duration('PT2H')
WITH origin, layer2, COUNT(DISTINCT layer1) AS intermediaries
WHERE intermediaries >= 3
MATCH (layer2)-[t3:TRANSFER]->(destination:Account)
WHERE t3.timestamp > datetime().minusDays(3)
WITH origin, destination,
     COUNT(DISTINCT layer2) AS layering_nodes,
     SUM(t3.amount) AS final_amount
WHERE layering_nodes >= 2 AND final_amount > 50000
RETURN origin.account_id AS source,
       destination.account_id AS sink,
       layering_nodes,
       final_amount,
       'LAYERING_NETWORK' AS fraud_type;

Identity Verification Fraud

Detect fake identities and account takeovers:

-- Synthetic identity detection
MATCH (a:Account)
WHERE a.created_date > datetime().minusDays(180)
WITH a,
     COUNT { (a)-[:HAS_PHONE]->(:Phone) } AS phone_count,
     COUNT { (a)-[:HAS_EMAIL]->(:Email) } AS email_count,
     COUNT { (a)-[:USES]->(:Device) } AS device_count,
     COUNT { (a)-[:FROM_IP]->(:IPAddress) } AS ip_count,
     SIZE((a)-[:SHARES_SSN]->()) AS ssn_matches
WHERE phone_count > 3
   OR device_count > 15
   OR ip_count > 25
   OR ssn_matches > 5
WITH a, phone_count, device_count, ip_count, ssn_matches,
     phone_count * 2 + device_count * 3 + ip_count + ssn_matches * 10 AS synthetic_score
WHERE synthetic_score > 30
RETURN a.account_id,
       a.name,
       phone_count,
       device_count,
       ip_count,
       ssn_matches,
       synthetic_score
ORDER BY synthetic_score DESC;

-- Account takeover detection
MATCH (a:Account)-[login:LOGIN]->(device:Device)
WHERE login.timestamp > datetime().minusDays(30)
WITH a,
     COLLECT(DISTINCT device.fingerprint) AS recent_devices,
     COUNT { (a)-[:LOGIN]->() WHERE login.timestamp < datetime().minusDays(30) } AS historical_logins
WITH a, recent_devices,
     SIZE([d IN recent_devices WHERE NOT EXISTS {
         (a)-[:LOGIN]->(:Device {fingerprint: d})
         WHERE login.timestamp < datetime().minusDays(30)
     }]) AS new_device_count
WHERE new_device_count >= 3
MATCH (a)-[recent_login:LOGIN]->(:Device)
WHERE recent_login.timestamp > datetime().minusDays(7)
  AND recent_login.failed_attempts > 3
WITH a, new_device_count, COUNT(recent_login) AS failed_login_attempts
WHERE failed_login_attempts > 10
RETURN a.account_id,
       new_device_count,
       failed_login_attempts,
       'ACCOUNT_TAKEOVER_SUSPECTED' AS alert;

Card-Not-Present (CNP) Fraud

Detect online payment fraud:

-- Card testing patterns (small transactions before large purchase)
MATCH (card:CreditCard)<-[t:TRANSACTION]-(merchant:Merchant)
WHERE t.timestamp > datetime().minusHours(24)
WITH card,
     COUNT(t) AS tx_count,
     [t IN COLLECT(t) ORDER BY t.timestamp | t.amount] AS amounts
WITH card, tx_count, amounts,
     [a IN amounts WHERE a < 5.0] AS small_amounts,
     [a IN amounts WHERE a > 500.0] AS large_amounts
WHERE SIZE(small_amounts) >= 5
  AND SIZE(large_amounts) >= 1
  AND amounts[-1] > 500  // Latest transaction is large
RETURN card.card_number_last4,
       tx_count,
       SIZE(small_amounts) AS test_transactions,
       amounts[-1] AS large_purchase_amount,
       'CARD_TESTING' AS fraud_indicator;

-- Cross-border fraud velocity
MATCH (card:CreditCard)<-[t:TRANSACTION]-(merchant:Merchant)
WHERE t.timestamp > datetime().minusDays(7)
WITH card, merchant.country AS country, COUNT(t) AS tx_in_country
WITH card, COUNT(DISTINCT country) AS countries_used, SUM(tx_in_country) AS total_tx
WHERE countries_used >= 5 AND total_tx >= 10
MATCH (card)<-[recent:TRANSACTION]-(m:Merchant)
WHERE recent.timestamp > datetime().minusHours(12)
WITH card, countries_used,
     COLLECT(DISTINCT m.country) AS recent_countries
WHERE SIZE(recent_countries) >= 3
RETURN card.card_number_last4,
       countries_used,
       recent_countries,
       'GEOGRAPHIC_VELOCITY' AS fraud_type;

Machine Learning-Based Fraud Detection

Feature Engineering for ML Models

-- Extract fraud detection features
MATCH (a:Account)
OPTIONAL MATCH (a)-[t:TRANSACTION]->()
WHERE t.timestamp > datetime().minusDays(90)
WITH a,
     COUNT(t) AS tx_count_90d,
     AVG(t.amount) AS avg_tx_amount,
     STDDEV(t.amount) AS stddev_tx_amount,
     MAX(t.amount) AS max_tx_amount,
     COUNT(DISTINCT date.truncate('day', t.timestamp)) AS active_days,
     COUNT(DISTINCT t.merchant_id) AS unique_merchants
WITH a, tx_count_90d, avg_tx_amount, stddev_tx_amount, max_tx_amount, active_days, unique_merchants,
     SIZE((a)-[:USES]->(:Device)) AS device_count,
     SIZE((a)-[:FROM_IP]->(:IPAddress)) AS ip_count
OPTIONAL MATCH (a)-[:TRANSACTED_WITH]-(other:Account {flagged: true})
WITH a, tx_count_90d, avg_tx_amount, stddev_tx_amount, max_tx_amount,
     active_days, unique_merchants, device_count, ip_count,
     COUNT(DISTINCT other) AS flagged_connections
RETURN a.account_id,
       tx_count_90d,
       avg_tx_amount,
       stddev_tx_amount,
       max_tx_amount,
       active_days,
       unique_merchants,
       device_count,
       ip_count,
       flagged_connections,
       tx_count_90d * 1.0 / NULLIF(active_days, 0) AS tx_per_active_day,
       max_tx_amount / NULLIF(avg_tx_amount, 0) AS amount_volatility,
       flagged_connections * 100.0 / NULLIF(device_count + ip_count, 0) AS network_risk_ratio;

Graph Neural Network (GNN) Fraud Detection

-- Store GNN embeddings computed externally
MATCH (a:Account {account_id: $account_id})
SET a.gnn_embedding = $embedding_vector;

-- Query using learned representations
MATCH (suspicious:Account)
WHERE suspicious.gnn_fraud_score > 0.8  // Score from external GNN model
MATCH (suspicious)-[:TRANSACTED_WITH*1..2]-(connected:Account)
WITH connected,
     AVG(suspicious.gnn_fraud_score) AS avg_neighbor_fraud_score,
     COUNT(DISTINCT suspicious) AS suspicious_connections
WHERE avg_neighbor_fraud_score > 0.6
  AND suspicious_connections >= 3
RETURN connected.account_id,
       avg_neighbor_fraud_score,
       suspicious_connections,
       'GUILT_BY_ASSOCIATION' AS detection_method;

Rule-Based Expert Systems

Configurable Fraud Rules Engine

-- Define fraud rules as data
CREATE (rule:FraudRule {
    rule_id: 'RULE_001',
    name: 'High Velocity Small Transactions',
    condition: 'tx_count > 20 AND max_amount < 100 AND time_window_hours < 1',
    score: 0.7,
    action: 'BLOCK',
    enabled: true
});

CREATE (rule:FraudRule {
    rule_id: 'RULE_002',
    name: 'Geographic Impossibility',
    condition: 'distance_km > 500 AND time_diff_minutes < 60',
    score: 0.9,
    action: 'BLOCK',
    enabled: true
});

-- Execute rules engine
MATCH (tx:Transaction {transaction_id: $tx_id})
MATCH (rule:FraudRule {enabled: true})
CALL fraud.evaluate_rule(tx, rule)
YIELD matched, score
WHERE matched = true
WITH tx, SUM(score) AS total_fraud_score,
     COLLECT(rule.name) AS triggered_rules
WITH tx, total_fraud_score, triggered_rules,
     CASE
         WHEN total_fraud_score > 0.9 THEN 'BLOCK'
         WHEN total_fraud_score > 0.6 THEN 'REVIEW'
         WHEN total_fraud_score > 0.3 THEN 'FLAG'
         ELSE 'APPROVE'
     END AS decision
RETURN tx.transaction_id,
       decision,
       total_fraud_score,
       triggered_rules;

Production Deployment Patterns

Real-Time Fraud Scoring Pipeline

-- Inline transaction scoring (< 50ms latency requirement)
MATCH (tx:PendingTransaction {transaction_id: $tx_id})
MATCH (account:Account {account_id: tx.account_id})

// Parallel risk checks
CALL {
    WITH account
    // Check 1: Account velocity
    MATCH (account)-[recent:TRANSACTION]->()
    WHERE recent.timestamp > datetime().minusHours(1)
    WITH COUNT(recent) AS recent_count
    RETURN CASE WHEN recent_count > 10 THEN 0.3 ELSE 0.0 END AS velocity_risk
}
CALL {
    WITH account
    // Check 2: Device reputation
    MATCH (account)-[:USES]->(device:Device)
    WITH device.reputation_score AS device_rep
    RETURN CASE WHEN device_rep < 0.5 THEN 0.4 ELSE 0.0 END AS device_risk
}
CALL {
    WITH account, tx
    // Check 3: Amount anomaly
    MATCH (account)-[hist:TRANSACTION]->()
    WHERE hist.timestamp > datetime().minusDays(30)
    WITH AVG(hist.amount) AS avg_amount, STDDEV(hist.amount) AS stddev_amount
    WITH ABS(tx.amount - avg_amount) / NULLIF(stddev_amount, 0) AS z_score
    RETURN CASE WHEN z_score > 3 THEN 0.3 ELSE 0.0 END AS amount_risk
}

WITH tx,
     velocity_risk + device_risk + amount_risk AS fraud_score
SET tx.fraud_score = fraud_score,
    tx.decision = CASE
        WHEN fraud_score > 0.8 THEN 'BLOCK'
        WHEN fraud_score > 0.5 THEN 'REVIEW'
        ELSE 'APPROVE'
    END,
    tx.scored_at = datetime()
RETURN tx.transaction_id, tx.decision, tx.fraud_score;

Batch Fraud Sweep (Daily/Weekly)

-- Comprehensive fraud analysis for all accounts
MATCH (a:Account)
WHERE a.last_fraud_check < datetime().minusDays(1)
CALL {
    WITH a
    // Run full suite of checks
    CALL fraud.check_network_anomalies(a) YIELD network_score
    CALL fraud.check_behavioral_patterns(a) YIELD behavior_score
    CALL fraud.check_transaction_patterns(a) YIELD transaction_score
    
    WITH network_score + behavior_score + transaction_score AS total_score
    WHERE total_score > 0.6
    
    SET a.fraud_risk_score = total_score,
        a.last_fraud_check = datetime(),
        a.requires_review = true
    
    RETURN a.account_id, total_score
} IN TRANSACTIONS OF 10000 ROWS

RETURN COUNT(*) AS accounts_flagged;

Feedback Loop and Model Retraining

-- Collect feedback from fraud analysts
MATCH (tx:Transaction {flagged: true})
MATCH (review:FraudReview {transaction_id: tx.transaction_id})
WHERE review.analyst_decision IS NOT NULL
WITH tx,
     review.analyst_decision AS true_label,  // 'FRAUD' or 'LEGITIMATE'
     tx.fraud_score AS predicted_score
WITH COLLECT({
    transaction_id: tx.transaction_id,
    features: tx.features,
    predicted_score: predicted_score,
    true_label: true_label
}) AS training_data
// Export for model retraining
RETURN training_data;

-- Track model performance metrics
WITH [review IN fraud_reviews WHERE review.created_at > datetime().minusDays(30)] AS reviews
WITH SIZE([r IN reviews WHERE r.predicted = 'FRAUD' AND r.actual = 'FRAUD']) AS true_positives,
     SIZE([r IN reviews WHERE r.predicted = 'FRAUD' AND r.actual = 'LEGITIMATE']) AS false_positives,
     SIZE([r IN reviews WHERE r.predicted = 'LEGITIMATE' AND r.actual = 'FRAUD']) AS false_negatives,
     SIZE([r IN reviews WHERE r.predicted = 'LEGITIMATE' AND r.actual = 'LEGITIMATE']) AS true_negatives
RETURN true_positives * 1.0 / (true_positives + false_positives) AS precision,
       true_positives * 1.0 / (true_positives + false_negatives) AS recall,
       2 * precision * recall / (precision + recall) AS f1_score;

Compliance and Regulatory Requirements

AML/KYC Transaction Monitoring

-- Monitor for suspicious activity reports (SARs)
MATCH (a:Account)-[t:TRANSACTION]->()
WHERE t.timestamp > datetime().minusDays(30)
WITH a,
     SUM(CASE WHEN t.amount >= 10000 THEN 1 ELSE 0 END) AS large_tx_count,
     SUM(CASE WHEN t.amount BETWEEN 9000 AND 9999 THEN 1 ELSE 0 END) AS structuring_count,
     COUNT(DISTINCT t.destination_country) AS international_countries,
     SUM(t.amount) AS total_volume
WHERE large_tx_count >= 5
   OR structuring_count >= 3
   OR international_countries >= 10
   OR total_volume >= 100000
WITH a, large_tx_count, structuring_count, international_countries, total_volume,
     CASE
         WHEN structuring_count >= 3 THEN 'STRUCTURING'
         WHEN large_tx_count >= 10 THEN 'HIGH_VOLUME'
         WHEN international_countries >= 15 THEN 'INTERNATIONAL_SUSPICIOUS'
         ELSE 'REVIEW_REQUIRED'
     END AS sar_reason
CREATE (sar:SuspiciousActivityReport {
    sar_id: uuid(),
    account_id: a.account_id,
    reason: sar_reason,
    large_tx_count: large_tx_count,
    structuring_count: structuring_count,
    international_countries: international_countries,
    total_volume: total_volume,
    created_at: datetime(),
    status: 'PENDING_REVIEW'
});

Best Practices

Multi-Layered Defense: Combine rule-based, statistical, and ML approaches
Real-Time + Batch: Inline scoring for authorization, deep analysis offline
Low Latency: Target < 50ms for real-time fraud checks
Explainability: Always provide reason codes for fraud decisions
Feedback Loops: Collect analyst decisions to improve models
False Positive Management: Balance security with user experience
Privacy: Anonymize data in analytics, log access to sensitive information
Monitoring: Track precision, recall, F1-score, and false positive rates
Adaptive Thresholds: Adjust sensitivity based on risk appetite and trends
Cross-Channel: Detect fraud across web, mobile, API, and physical channels

Fraud Detection with Geode Share link

Why Graphs for Fraud Detection Share link

Core Concepts Share link

Detecting Fraud Rings Share link

Shared Device Detection Share link

Multi-Hop Fraud Network Share link

Synthetic Identity Detection Share link

Transaction Pattern Analysis Share link

Rapid Fire Transactions Share link

Geographic Impossibility Share link

Money Laundering Patterns Share link

Real-Time Fraud Scoring Share link

Composite Risk Score Share link

Behavioral Anomaly Detection Share link

Application Fraud Prevention Share link

Account Takeover Detection Share link

Duplicate Account Detection Share link

Best Practices Share link

Graph Model Design Share link

Performance Optimization Share link

Real-Time Detection Share link

Integration Examples Share link

Python Client - Real-Time Scoring Share link

Rust Client - Fraud Ring Detection Share link

Related Topics Share link

Advanced Fraud Detection Patterns Share link

Money Laundering Detection (Layering & Integration) Share link

Identity Verification Fraud Share link

Card-Not-Present (CNP) Fraud Share link

Machine Learning-Based Fraud Detection Share link

Feature Engineering for ML Models Share link

Graph Neural Network (GNN) Fraud Detection Share link

Rule-Based Expert Systems Share link

Configurable Fraud Rules Engine Share link

Production Deployment Patterns Share link

Real-Time Fraud Scoring Pipeline Share link

Batch Fraud Sweep (Daily/Weekly) Share link

Feedback Loop and Model Retraining Share link

Compliance and Regulatory Requirements Share link

AML/KYC Transaction Monitoring Share link

Best Practices Share link

Further Reading Share link

Related Articles

Fraud and Anomaly Detection

Fraud Detection Guide

Fraud Detection with Geode

Why Graphs for Fraud Detection

Core Concepts

Detecting Fraud Rings

Shared Device Detection

Multi-Hop Fraud Network

Synthetic Identity Detection

Transaction Pattern Analysis

Rapid Fire Transactions

Geographic Impossibility

Money Laundering Patterns

Real-Time Fraud Scoring

Composite Risk Score

Behavioral Anomaly Detection

Application Fraud Prevention

Account Takeover Detection

Duplicate Account Detection

Best Practices

Graph Model Design

Performance Optimization

Real-Time Detection

Integration Examples

Python Client - Real-Time Scoring

Rust Client - Fraud Ring Detection

Related Topics

Advanced Fraud Detection Patterns

Money Laundering Detection (Layering & Integration)

Identity Verification Fraud

Card-Not-Present (CNP) Fraud

Machine Learning-Based Fraud Detection

Feature Engineering for ML Models

Graph Neural Network (GNN) Fraud Detection

Rule-Based Expert Systems

Configurable Fraud Rules Engine

Production Deployment Patterns

Real-Time Fraud Scoring Pipeline

Batch Fraud Sweep (Daily/Weekly)

Feedback Loop and Model Retraining

Compliance and Regulatory Requirements

AML/KYC Transaction Monitoring

Best Practices

Further Reading