<!-- CANARY: REQ=REQ-DOCS-001; FEATURE="Docs"; ASPECT=Documentation; STATUS=TESTED; OWNER=docs; UPDATED=2026-01-28 --> <h2 id="disaster-recovery" class="position-relative d-flex align-items-center group"> <span>Disaster Recovery</span> </h2><p>This guide covers disaster recovery (DR) planning and procedures for Geode, including RTO/RPO objectives, failover strategies, and business continuity planning.</p>
<h3 id="overview" class="position-relative d-flex align-items-center group"> <span>Overview</span> </h3><p>Disaster recovery ensures business continuity when failures occur:</p>
<table> <thead> <tr> <th>Scenario</th> <th>Impact</th> <th>Recovery Strategy</th> </tr> </thead> <tbody> <tr> <td>Server crash</td> <td>Single node unavailable</td> <td>Automatic restart, replica failover</td> </tr> <tr> <td>Data center outage</td> <td>Full DC unavailable</td> <td>Cross-DC failover</td> </tr> <tr> <td>Data corruption</td> <td>Data integrity compromised</td> <td>Point-in-time recovery</td> </tr> <tr> <td>Ransomware</td> <td>Data encrypted/lost</td> <td>Offline backup restore</td> </tr> <tr> <td>Region failure</td> <td>Cloud region unavailable</td> <td>Multi-region failover</td> </tr> </tbody> </table>
<h3 id="recovery-objectives" class="position-relative d-flex align-items-center group">
<span>Recovery Objectives</span> </h3>
<h4 id="rto-recovery-time-objective" class="position-relative d-flex align-items-center group"> <span>RTO (Recovery Time Objective)</span> </h4><p>Maximum acceptable downtime:</p>
<table> <thead> <tr> <th>Tier</th> <th>RTO</th> <th>Use Case</th> </tr> </thead> <tbody> <tr> <td>Tier 1</td> <td>&lt; 1 minute</td> <td>Real-time, financial</td> </tr> <tr> <td>Tier 2</td> <td>&lt; 15 minutes</td> <td>Production critical</td> </tr> <tr> <td>Tier 3</td> <td>&lt; 4 hours</td> <td>Standard production</td> </tr> <tr> <td>Tier 4</td> <td>&lt; 24 hours</td> <td>Non-critical</td> </tr> </tbody> </table>
<h4 id="rpo-recovery-point-objective" class="position-relative d-flex align-items-center group"> <span>RPO (Recovery Point Objective)</span> </h4><p>Maximum acceptable data loss:</p>
<table> <thead> <tr> <th>Tier</th> <th>RPO</th> <th>Method</th> </tr> </thead> <tbody> <tr> <td>Zero</td> <td>0</td> <td>Synchronous replication</td> </tr> <tr> <td>Near-zero</td> <td>&lt; 1 minute</td> <td>Async replication + WAL</td> </tr> <tr> <td>Standard</td> <td>&lt; 15 minutes</td> <td>Incremental backups</td> </tr> <tr> <td>Extended</td> <td>&lt; 24 hours</td> <td>Daily backups</td> </tr> </tbody> </table>
<h4 id="geode-capabilities" class="position-relative d-flex align-items-center group"> <span>Geode Capabilities</span> </h4><table> <thead> <tr> <th>Feature</th> <th>RTO</th> <th>RPO</th> </tr> </thead> <tbody> <tr> <td>Automatic restart</td> <td>&lt; 30s</td> <td>0</td> </tr> <tr> <td>Replica failover</td> <td>&lt; 1 min</td> <td>&lt; 1s</td> </tr> <tr> <td>PITR (Point-in-Time)</td> <td>&lt; 5 min</td> <td>&lt; 5 min</td> </tr> <tr> <td>Backup restore</td> <td>&lt; 30 min</td> <td>&lt; 24h</td> </tr> </tbody> </table>
<h3 id="dr-architecture-patterns" class="position-relative d-flex align-items-center group"> <span>DR Architecture Patterns</span> </h3>
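<p>Before picking one of the patterns in this section, it can help to check which entries in the Geode Capabilities table fit a given target. The following Python sketch is illustrative only and is not part of Geode: the <code>Capability</code> class and the <code>candidates</code> function are hypothetical names, the figures are the upper bounds from that table converted to seconds, and each bound is treated as inclusive for simplicity.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    """One row of the Geode Capabilities table, with bounds in seconds."""
    name: str
    rto_seconds: int  # upper bound on recovery time
    rpo_seconds: int  # upper bound on data loss


# Upper bounds copied from the Geode Capabilities table (hypothetical encoding).
CAPABILITIES = [
    Capability("Automatic restart", 30, 0),
    Capability("Replica failover", 60, 1),
    Capability("PITR", 5 * 60, 5 * 60),
    Capability("Backup restore", 30 * 60, 24 * 60 * 60),
]


def candidates(target_rto_seconds, target_rpo_seconds):
    """Return the capabilities whose RTO and RPO bounds both fit the targets."""
    return [
        cap.name
        for cap in CAPABILITIES
        if target_rto_seconds >= cap.rto_seconds
        and target_rpo_seconds >= cap.rpo_seconds
    ]


# Example: Tier 2 RTO (15 minutes) combined with a near-zero RPO (1 minute).
print(candidates(15 * 60, 60))
</code></pre></div>
<p>With these inputs the sketch prints <code>['Automatic restart', 'Replica failover']</code>. Each capability addresses different failure scenarios, so a filter like this only narrows the options; the scenario table in the Overview still decides which one applies.</p>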
<h4 id="single-region-ha" class="position-relative d-flex align-items-center group"> <span>Single-Region HA</span> </h4><p>High availability within a single region:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-fallback" data-lang="fallback"><span class="line"><span class="cl">                    ┌─────────────────┐
</span></span><span class="line"><span class="cl">                    │  Load Balancer  │
</span></span><span class="line"><span class="cl">                    │    (Active)     │
</span></span><span class="line"><span class="cl">                    └────────┬────────┘
</span></span><span class="line"><span class="cl">                             │
</span></span><span class="line"><span class="cl">        ┌────────────────────┼────────────────────┐
</span></span><span class="line"><span class="cl">        │                    │                    │
</span></span><span class="line"><span class="cl">   ┌────▼────┐          ┌────▼────┐          ┌────▼────┐
</span></span><span class="line"><span class="cl">   │ Geode 1 │◄────────►│ Geode 2 │◄────────►│ Geode 3 │
</span></span><span class="line"><span class="cl">   │(Primary)│   Sync   │(Replica)│   Sync   │(Replica)│
</span></span><span class="line"><span class="cl">   └────┬────┘   Repl   └────┬────┘   Repl   └────┬────┘
</span></span><span class="line"><span class="cl">        │                    │                    │
</span></span><span class="line"><span class="cl">   ┌────▼────┐          ┌────▼────┐          ┌────▼────┐
</span></span><span class="line"><span class="cl">   │ Zone A  │          │ Zone B  │          │ Zone C  │
</span></span><span class="line"><span class="cl">   └─────────┘          └─────────┘          └─────────┘
</span></span></code></pre></div><p><strong>Characteristics</strong>:</p>
<ul> <li>RTO: &lt; 1 minute</li> <li>RPO: &lt; 1 second (sync replication)</li> <li>Protects against: Server failure, zone failure</li> </ul>
<h4 id="multi-region-active-passive" class="position-relative d-flex align-items-center group"> <span>Multi-Region Active-Passive</span> </h4><p>Cross-region disaster recovery:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-fallback" data-lang="fallback"><span class="line"><span class="cl">┌─────────────────────────────────────┐
</span></span><span class="line"><span class="cl">│           Primary Region            │
</span></span><span class="line"><span class="cl">│                                     │
</span></span><span class="line"><span class="cl">│  ┌─────────┐  ┌─────────┐           │
</span></span><span class="line"><span class="cl">│  │ Geode 1 │──│ Geode 2 │──┐        │
</span></span><span class="line"><span class="cl">│  │(Primary)│  │(Replica)│  │        │
</span></span><span class="line"><span class="cl">│  └────┬────┘  └─────────┘  │        │
</span></span><span class="line"><span class="cl">│       │                    │        │
</span></span><span class="line"><span class="cl">│       ▼                    │        │
</span></span><span class="line"><span class="cl">│  ┌─────────────────────────┤        │
</span></span><span class="line"><span class="cl">│  │   Async Replication     │        │
</span></span><span class="line"><span class="cl">│  └─────────────────────────┘        │
</span></span><span class="line"><span class="cl">└─────────────────┬───────────────────┘
</span></span><span class="line"><span class="cl">                  │ Async
</span></span><span class="line"><span class="cl">                  ▼
</span></span><span class="line"><span class="cl">┌─────────────────────────────────────┐
</span></span><span class="line"><span class="cl">│              DR Region              │
</span></span><span class="line"><span class="cl">│                                     │
</span></span><span class="line"><span class="cl">│  ┌─────────┐  ┌─────────┐           │
</span></span><span class="line"><span class="cl">│  │ Geode 1 │──│ Geode 2 │           │
</span></span><span class="line"><span class="cl">│  │(Standby)│  │(Standby)│           │
</span></span><span class="line"><span class="cl">│  └─────────┘  └─────────┘           │
</span></span><span class="line"><span class="cl">│                                     │
</span></span><span class="line"><span class="cl">└─────────────────────────────────────┘
</span></span></code></pre></div><p><strong>Characteristics</strong>:</p>
<ul> <li>RTO: 15-60 minutes (manual failover)</li> <li>RPO: &lt; 5 minutes (async replication)</li> <li>Protects against: Region failure, DC failure</li> </ul>
<h4 id="multi-region-active-active" class="position-relative d-flex align-items-center group"> <span>Multi-Region Active-Active</span> </h4><p>Global deployment with bidirectional replication:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-fallback" data-lang="fallback"><span class="line"><span class="cl">┌─────────────────────────┐     ┌─────────────────────────┐
</span></span><span class="line"><span class="cl">│     Region US-East      │     │     Region EU-West      │
</span></span><span class="line"><span class="cl">│                         │     │                         │
</span></span><span class="line"><span class="cl">│  ┌─────────┐            │     │            ┌─────────┐  │
</span></span><span class="line"><span class="cl">│  │ Geode   │◄─────────────────────────────►│ Geode   │  │
</span></span><span class="line"><span class="cl">│  │ Cluster │   Bidirectional Replication   │ Cluster │  │
</span></span><span class="line"><span class="cl">│  └────┬────┘            │     │            └────┬────┘  │
</span></span><span class="line"><span class="cl">│       │                 │     │                 │       │
</span></span><span class="line"><span class="cl">│  ┌────▼────┐            │     │            ┌────▼────┐  │
</span></span><span class="line"><span class="cl">│  │ Users   │            │     │            │ Users   │  │
</span></span><span class="line"><span class="cl">│  │ US/LATAM│            │     │            │ EMEA    │  │
</span></span><span class="line"><span class="cl">│  └─────────┘            │     │            └─────────┘  │
</span></span><span class="line"><span class="cl">└─────────────────────────┘     └─────────────────────────┘
</span></span></code></pre></div><p><strong>Characteristics</strong>:</p>
<ul> <li>RTO: 0 (automatic)</li> <li>RPO: Depends on the conflict resolution strategy</li> <li>Protects against: Regional failures</li> <li>Note: Requires a conflict resolution strategy</li> </ul>
<h3 id="configuration" class="position-relative d-flex align-items-center group"> <span>Configuration</span> </h3>
<h4 id="replication-setup" class="position-relative d-flex align-items-center group"> <span>Replication Setup</span> </h4><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml"
data-lang="yaml"><span class="line"><span class="cl"><span class="c"># geode.yaml - Primary</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">replication</span><span class="p">:</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">mode</span><span class="p">:</span><span class="w"> </span><span class="l">primary</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">sync_replicas</span><span class="p">:</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span>- <span class="nt">host</span><span class="p">:</span><span class="w"> </span><span class="l">geode-replica-1.example.com</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="m">3141</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span>- <span class="nt">host</span><span class="p">:</span><span class="w"> </span><span class="l">geode-replica-2.example.com</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="m">3141</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">async_replicas</span><span class="p">:</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span>- <span class="nt">host</span><span class="p">:</span><span class="w"> </span><span class="l">geode-dr.us-west.example.com</span><span class="w"> </span></span></span><span class="line"><span 
class="cl"><span class="w"> </span><span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="m">3141</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">lag_threshold</span><span class="p">:</span><span class="w"> </span><span class="l">5m </span><span class="w"> </span><span class="c"># Alert if lag &gt; 5 minutes</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">settings</span><span class="p">:</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">sync_commit</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w"> </span><span class="c"># Wait for sync replicas</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">max_lag_bytes</span><span class="p">:</span><span class="w"> </span><span class="l">100MB </span><span class="w"> </span><span class="c"># Max replication lag</span><span class="w"> </span></span></span></code></pre></div><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># geode.yaml - DR Site</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">replication</span><span class="p">:</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">mode</span><span class="p">:</span><span class="w"> </span><span class="l">standby</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">upstream</span><span class="p">:</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">host</span><span class="p">:</span><span class="w"> </span><span class="l">geode-primary.us-east.example.com</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">port</span><span class="p">:</span><span class="w"> </span><span class="m">3141</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">restore_command</span><span class="p">:</span><span class="w"> </span><span class="s1">&#39;geode wal-restore %f %p&#39;</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">recovery</span><span class="p">:</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">target_timeline</span><span class="p">:</span><span class="w"> </span><span class="l">latest</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">recovery_target_action</span><span class="p">:</span><span class="w"> </span><span class="l">pause </span><span class="w"> </span><span class="c"># Pause on recovery</span><span class="w"> </span></span></span></code></pre></div> <h4 id="backup-configuration-for-dr" class="position-relative d-flex align-items-center group"> <span>Backup Configuration for DR</span> <button type="button" class="h-share btn btn-link p-0 text-decoration-none link-secondary opacity-50 hover-opacity-100 transition-all ms-1" data-share-target="backup-configuration-for-dr" aria-haspopup="dialog" aria-label="Share link: Backup Configuration for DR"> <i class="fa-sharp-duotone fa-solid fa-share-nodes" aria-hidden="true" style="font-size: 0.8em;"></i> <span class="visually-hidden">Share link</span> 
</button> </h4><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># geode.yaml</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">backup</span><span class="p">:</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="c"># Local backup (primary site)</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">local</span><span class="p">:</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">enabled</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">path</span><span class="p">:</span><span class="w"> </span><span class="l">/backups/local</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">retention_days</span><span class="p">:</span><span class="w"> </span><span class="m">7</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="c"># S3 backup (same region)</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">s3_primary</span><span class="p">:</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">enabled</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">bucket</span><span 
class="p">:</span><span class="w"> </span><span class="l">geode-backups-us-east</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">region</span><span class="p">:</span><span class="w"> </span><span class="l">us-east-1</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">retention_days</span><span class="p">:</span><span class="w"> </span><span class="m">30</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="c"># S3 backup (DR region)</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">s3_dr</span><span class="p">:</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">enabled</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">bucket</span><span class="p">:</span><span class="w"> </span><span class="l">geode-backups-us-west</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">region</span><span class="p">:</span><span class="w"> </span><span class="l">us-west-2</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">retention_days</span><span class="p">:</span><span class="w"> </span><span class="m">90</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">storage_class</span><span class="p">:</span><span class="w"> </span><span class="l">STANDARD_IA</span><span class="w"> </span></span></span><span class="line"><span 
class="cl"><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="c"># WAL archiving</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">wal_archive</span><span class="p">:</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">enabled</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">destination</span><span class="p">:</span><span class="w"> </span><span class="l">s3://geode-wal-archive-us-east</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">interval</span><span class="p">:</span><span class="w"> </span><span class="l">1m</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">dr_copy</span><span class="p">:</span><span class="w"> </span><span class="l">s3://geode-wal-archive-us-west</span><span class="w"> </span></span></span></code></pre></div> <h3 id="failover-procedures" class="position-relative d-flex align-items-center group"> <span>Failover Procedures</span> </h3>
<h4 id="automatic-failover-single-region" class="position-relative d-flex align-items-center group"> <span>Automatic Failover (Single Region)</span> </h4><p>For replica failover within a region:</p> <div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># geode.yaml</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="nt">high_availability</span><span class="p">:</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">enabled</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">auto_failover</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">failover_timeout</span><span class="p">:</span><span class="w"> </span><span class="l">30s</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">min_replicas</span><span class="p">:</span><span class="w"> </span><span class="m">2</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">health_check</span><span class="p">:</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">interval</span><span class="p">:</span><span class="w"> </span><span 
class="l">5s</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">timeout</span><span class="p">:</span><span class="w"> </span><span class="l">10s</span><span class="w"> </span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="nt">unhealthy_threshold</span><span class="p">:</span><span class="w"> </span><span class="m">3</span><span class="w"> </span></span></span></code></pre></div><p>With this configuration, the system automatically:</p> <ol> <li>Detects primary failure from missed health checks</li> <li>Elects a new primary from the replicas</li> <li>Updates the routing configuration</li> <li>Notifies connected clients</li> </ol> <h4 id="manual-failover-cross-region" class="position-relative d-flex align-items-center group"> <span>Manual Failover (Cross-Region)</span> </h4><p>For a planned DR failover:</p> <div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="cp">#!/bin/bash </span></span></span><span class="line"><span class="cl"><span class="cp"></span><span class="c1"># failover-to-dr.sh</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="nb">set</span> -euo pipefail </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="nv">PRIMARY_REGION</span><span class="o">=</span><span class="s2">&#34;us-east&#34;</span> </span></span><span class="line"><span 
class="cl"><span class="nv">DR_REGION</span><span class="o">=</span><span class="s2">&#34;us-west&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;=== Starting DR Failover ===&#34;</span> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;From: </span><span class="nv">$PRIMARY_REGION</span><span class="s2">&#34;</span> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;To: </span><span class="nv">$DR_REGION</span><span class="s2">&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 1. Verify DR site is ready</span> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;Checking DR site status...&#34;</span> </span></span><span class="line"><span class="cl"><span class="nv">DR_STATUS</span><span class="o">=</span><span class="k">$(</span>geode admin status --host geode-dr.us-west.example.com<span class="k">)</span> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;</span><span class="nv">$DR_STATUS</span><span class="s2">&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 2. 
Check replication lag</span> </span></span><span class="line"><span class="cl"><span class="nv">LAG</span><span class="o">=</span><span class="k">$(</span>geode admin replication-lag --host geode-dr.us-west.example.com<span class="k">)</span> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;Replication lag: </span><span class="nv">$LAG</span><span class="s2">&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="o">[</span> <span class="s2">&#34;</span><span class="nv">$LAG</span><span class="s2">&#34;</span> -gt <span class="m">300</span> <span class="o">]</span><span class="p">;</span> <span class="k">then</span> <span class="c1"># &gt; 5 minutes</span> </span></span><span class="line"><span class="cl"> <span class="nb">echo</span> <span class="s2">&#34;WARNING: High replication lag. Potential data loss.&#34;</span> </span></span><span class="line"><span class="cl"> <span class="nb">read</span> -p <span class="s2">&#34;Continue? (yes/no): &#34;</span> CONFIRM </span></span><span class="line"><span class="cl"> <span class="o">[</span> <span class="s2">&#34;</span><span class="nv">$CONFIRM</span><span class="s2">&#34;</span> !<span class="o">=</span> <span class="s2">&#34;yes&#34;</span> <span class="o">]</span> <span class="o">&amp;&amp;</span> <span class="nb">exit</span> <span class="m">1</span> </span></span><span class="line"><span class="cl"><span class="k">fi</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 3. 
Stop writes to primary (if accessible)</span> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;Stopping writes to primary...&#34;</span> </span></span><span class="line"><span class="cl">geode admin read-only --host geode-primary.us-east.example.com 2&gt;/dev/null <span class="o">||</span> <span class="nb">true</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 4. Wait for replication to catch up</span> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;Waiting for replication to synchronize...&#34;</span> </span></span><span class="line"><span class="cl">sleep <span class="m">30</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 5. Promote DR to primary</span> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;Promoting DR site to primary...&#34;</span> </span></span><span class="line"><span class="cl">geode admin promote --host geode-dr.us-west.example.com </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 6. Verify promotion</span> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;Verifying promotion...&#34;</span> </span></span><span class="line"><span class="cl">geode admin status --host geode-dr.us-west.example.com </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 7. 
Update DNS/load balancer</span> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;Updating DNS...&#34;</span> </span></span><span class="line"><span class="cl"><span class="c1"># aws route53 change-resource-record-sets ...</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 8. Notify monitoring</span> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;Sending notification...&#34;</span> </span></span><span class="line"><span class="cl">curl -X POST https://hooks.slack.com/... <span class="se">\ </span></span></span><span class="line"><span class="cl"><span class="se"></span> -d <span class="s1">&#39;{&#34;text&#34;: &#34;DR Failover completed. Active region: &#39;</span><span class="s2">&#34;</span><span class="nv">$DR_REGION</span><span class="s2">&#34;</span><span class="s1">&#39;&#34;}&#39;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;=== Failover Complete ===&#34;</span> </span></span></code></pre></div> <h4 id="emergency-failover" class="position-relative d-flex align-items-center group"> <span>Emergency Failover</span> </h4><p>For an unplanned primary failure:</p> <div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="cp">#!/bin/bash </span></span></span><span class="line"><span 
class="cl"><span class="cp"></span><span class="c1"># emergency-failover.sh</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="nb">set</span> -euo pipefail </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="nv">DR_HOST</span><span class="o">=</span><span class="s2">&#34;geode-dr.us-west.example.com&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;=== EMERGENCY FAILOVER ===&#34;</span> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;WARNING: Primary is unavailable. Data loss may occur.&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 1. Check DR status</span> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;Checking DR site...&#34;</span> </span></span><span class="line"><span class="cl">geode admin status --host <span class="nv">$DR_HOST</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 2. Get last known state</span> </span></span><span class="line"><span class="cl"><span class="nv">LAST_WAL</span><span class="o">=</span><span class="k">$(</span>geode admin last-wal --host <span class="nv">$DR_HOST</span><span class="k">)</span> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;Last WAL received: </span><span class="nv">$LAST_WAL</span><span class="s2">&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 3. 
Force promote (no wait for sync)</span> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;Force promoting DR site...&#34;</span> </span></span><span class="line"><span class="cl">geode admin promote --force --host <span class="nv">$DR_HOST</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 4. Verify</span> </span></span><span class="line"><span class="cl">geode admin status --host <span class="nv">$DR_HOST</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 5. Update routing</span> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;Updating DNS/routing...&#34;</span> </span></span><span class="line"><span class="cl"><span class="c1"># Implement DNS update</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 6. 
Alert</span> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;Sending critical alert...&#34;</span> </span></span><span class="line"><span class="cl"><span class="c1"># Implement alerting</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;=== Emergency Failover Complete ===&#34;</span> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;IMPORTANT: Document data loss window and investigate primary failure&#34;</span> </span></span></code></pre></div> <h3 id="recovery-procedures" class="position-relative d-flex align-items-center group"> <span>Recovery Procedures</span> </h3> <h4 id="point-in-time-recovery" class="position-relative d-flex align-items-center group"> <span>Point-in-Time Recovery</span> </h4><p>Recover to a specific timestamp:</p> <div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="cp">#!/bin/bash </span></span></span><span class="line"><span 
class="cl"><span class="cp"></span><span class="c1"># pitr-recovery.sh</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="nv">BACKUP_SOURCE</span><span class="o">=</span><span class="s2">&#34;s3://geode-backups/production&#34;</span> </span></span><span class="line"><span class="cl"><span class="nv">RECOVERY_TARGET</span><span class="o">=</span><span class="s2">&#34;2026-01-28 10:30:00&#34;</span> </span></span><span class="line"><span class="cl"><span class="nv">DATA_DIR</span><span class="o">=</span><span class="s2">&#34;/var/lib/geode/data&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;=== Point-in-Time Recovery ===&#34;</span> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;Target: </span><span class="nv">$RECOVERY_TARGET</span><span class="s2">&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 1. Stop server</span> </span></span><span class="line"><span class="cl">sudo systemctl stop geode </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 2. Backup current state</span> </span></span><span class="line"><span class="cl">sudo mv <span class="nv">$DATA_DIR</span> <span class="si">${</span><span class="nv">DATA_DIR</span><span class="si">}</span>.before-recovery-<span class="k">$(</span>date +%Y%m%d-%H%M%S<span class="k">)</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 3. 
Find appropriate base backup</span> </span></span><span class="line"><span class="cl"><span class="nv">BASE_BACKUP</span><span class="o">=</span><span class="k">$(</span>geode backup --list --dest <span class="nv">$BACKUP_SOURCE</span> <span class="se">\ </span></span></span><span class="line"><span class="cl"><span class="se"></span> --before <span class="s2">&#34;</span><span class="nv">$RECOVERY_TARGET</span><span class="s2">&#34;</span> <span class="se">\ </span></span></span><span class="line"><span class="cl"><span class="se"></span> --type full <span class="se">\ </span></span></span><span class="line"><span class="cl"><span class="se"></span> --format json <span class="p">|</span> jq -r <span class="s1">&#39;.backups[0].id&#39;</span><span class="k">)</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;Base backup: </span><span class="nv">$BASE_BACKUP</span><span class="s2">&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 4. Restore base backup</span> </span></span><span class="line"><span class="cl">geode restore <span class="se">\ </span></span></span><span class="line"><span class="cl"><span class="se"></span> --source <span class="nv">$BACKUP_SOURCE</span> <span class="se">\ </span></span></span><span class="line"><span class="cl"><span class="se"></span> --backup-id <span class="nv">$BASE_BACKUP</span> <span class="se">\ </span></span></span><span class="line"><span class="cl"><span class="se"></span> --target <span class="nv">$DATA_DIR</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 5. 
Apply WAL to target time</span> </span></span><span class="line"><span class="cl">geode restore <span class="se">\ </span></span></span><span class="line"><span class="cl"><span class="se"></span> --source <span class="nv">$BACKUP_SOURCE</span> <span class="se">\ </span></span></span><span class="line"><span class="cl"><span class="se"></span> --backup-id <span class="nv">$BASE_BACKUP</span> <span class="se">\ </span></span></span><span class="line"><span class="cl"><span class="se"></span> --target <span class="nv">$DATA_DIR</span> <span class="se">\ </span></span></span><span class="line"><span class="cl"><span class="se"></span> --pitr-timestamp <span class="s2">&#34;</span><span class="nv">$RECOVERY_TARGET</span><span class="s2">&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 6. Verify integrity</span> </span></span><span class="line"><span class="cl">geode verify --data-dir <span class="nv">$DATA_DIR</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 7. Start server in recovery mode</span> </span></span><span class="line"><span class="cl">sudo systemctl start geode </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 8. 
Verify recovery</span> </span></span><span class="line"><span class="cl">geode query <span class="s2">&#34;MATCH (n) RETURN count(n) as count&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;=== Recovery Complete ===&#34;</span> </span></span></code></pre></div> <h4 id="full-backup-restore" class="position-relative d-flex align-items-center group"> <span>Full Backup Restore</span> </h4><p>Restore from a backup after complete data loss:</p> <div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="cp">#!/bin/bash </span></span></span><span class="line"><span class="cl"><span class="cp"></span><span class="c1"># full-restore.sh</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="nv">BACKUP_SOURCE</span><span class="o">=</span><span class="s2">&#34;s3://geode-backups/production&#34;</span> </span></span><span class="line"><span class="cl"><span class="nv">BACKUP_ID</span><span class="o">=</span><span class="s2">&#34;</span><span class="nv">$1</span><span class="s2">&#34;</span> </span></span><span class="line"><span class="cl"><span class="nv">DATA_DIR</span><span class="o">=</span><span class="s2">&#34;/var/lib/geode/data&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="o">[</span> -z <span 
class="s2">&#34;</span><span class="nv">$BACKUP_ID</span><span class="s2">&#34;</span> <span class="o">]</span><span class="p">;</span> <span class="k">then</span> </span></span><span class="line"><span class="cl"> <span class="nb">echo</span> <span class="s2">&#34;Usage: </span><span class="nv">$0</span><span class="s2"> &lt;backup-id&gt;&#34;</span> </span></span><span class="line"><span class="cl"> <span class="nb">echo</span> <span class="s2">&#34;Available backups:&#34;</span> </span></span><span class="line"><span class="cl"> geode backup --list --dest <span class="nv">$BACKUP_SOURCE</span> </span></span><span class="line"><span class="cl"> <span class="nb">exit</span> <span class="m">1</span> </span></span><span class="line"><span class="cl"><span class="k">fi</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;=== Full Restore from Backup ===&#34;</span> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;Backup ID: </span><span class="nv">$BACKUP_ID</span><span class="s2">&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 1. Verify backup exists and is valid</span> </span></span><span class="line"><span class="cl">geode backup --verify --dest <span class="nv">$BACKUP_SOURCE</span> --backup-id <span class="nv">$BACKUP_ID</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 2. Stop server</span> </span></span><span class="line"><span class="cl">sudo systemctl stop geode </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 3. 
Clear existing data (:? aborts if DATA_DIR is unset or empty)</span> </span></span><span class="line"><span class="cl">sudo rm -rf <span class="s2">&#34;</span><span class="si">${</span><span class="nv">DATA_DIR</span>:?<span class="si">}</span><span class="s2">&#34;</span>/* </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 4. Restore</span> </span></span><span class="line"><span class="cl">geode restore <span class="se">\ </span></span></span><span class="line"><span class="cl"><span class="se"></span> --source <span class="s2">&#34;</span><span class="nv">$BACKUP_SOURCE</span><span class="s2">&#34;</span> <span class="se">\ </span></span></span><span class="line"><span class="cl"><span class="se"></span> --backup-id <span class="s2">&#34;</span><span class="nv">$BACKUP_ID</span><span class="s2">&#34;</span> <span class="se">\ </span></span></span><span class="line"><span class="cl"><span class="se"></span> --target <span class="s2">&#34;</span><span class="nv">$DATA_DIR</span><span class="s2">&#34;</span> <span class="se">\ </span></span></span><span class="line"><span class="cl"><span class="se"></span> --include-incrementals <span class="c1"># Apply all incrementals</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 5. Verify</span> </span></span><span class="line"><span class="cl">geode verify --data-dir <span class="s2">&#34;</span><span class="nv">$DATA_DIR</span><span class="s2">&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 6. Start server</span> </span></span><span class="line"><span class="cl">sudo systemctl start geode </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># 7. 
Health check</span> </span></span><span class="line"><span class="cl">sleep <span class="m">10</span> </span></span><span class="line"><span class="cl">geode admin status </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;=== Restore Complete ===&#34;</span> </span></span></code></pre></div> <h3 id="dr-testing" class="position-relative d-flex align-items-center group"> <span>DR Testing</span> </h3> <h4 id="monthly-dr-test" class="position-relative d-flex align-items-center group"> <span>Monthly DR Test</span> </h4><p>Restore the latest full backup into a scratch directory, time the restore against the RTO target, and run a validation query:</p> <div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="cp">#!/bin/bash </span></span></span><span class="line"><span class="cl"><span class="cp"></span><span class="c1"># dr-test-monthly.sh</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="nv">TEST_DIR</span><span class="o">=</span><span class="s2">&#34;/tmp/geode-dr-test-</span><span class="k">$(</span>date +%Y%m%d<span class="k">)</span><span 
class="s2">&#34;</span> </span></span><span class="line"><span class="cl"><span class="nv">REPORT_FILE</span><span class="o">=</span><span class="s2">&#34;/var/log/geode/dr-test-</span><span class="k">$(</span>date +%Y%m%d<span class="k">)</span><span class="s2">.log&#34;</span> </span></span><span class="line"><span class="cl"><span class="nv">BACKUP_SOURCE</span><span class="o">=</span><span class="s2">&#34;s3://geode-backups/production&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl">log<span class="o">()</span> <span class="o">{</span> </span></span><span class="line"><span class="cl"> <span class="nb">echo</span> <span class="s2">&#34;[</span><span class="k">$(</span>date +<span class="s1">&#39;%Y-%m-%d %H:%M:%S&#39;</span><span class="k">)</span><span class="s2">] </span><span class="nv">$*</span><span class="s2">&#34;</span> <span class="p">|</span> tee -a <span class="s2">&#34;</span><span class="nv">$REPORT_FILE</span><span class="s2">&#34;</span> </span></span><span class="line"><span class="cl"><span class="o">}</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;=== Monthly DR Test Started ===&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># Get latest backup</span> </span></span><span class="line"><span class="cl"><span class="nv">LATEST_BACKUP</span><span class="o">=</span><span class="k">$(</span>geode backup --list --dest <span class="nv">$BACKUP_SOURCE</span> <span class="se">\ </span></span></span><span class="line"><span class="cl"><span class="se"></span> --type full --format json <span class="p">|</span> jq -r <span class="s1">&#39;.backups[0].id&#39;</span><span class="k">)</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;Testing 
backup: </span><span class="nv">$LATEST_BACKUP</span><span class="s2">&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># Create test directory</span> </span></span><span class="line"><span class="cl">mkdir -p <span class="s2">&#34;</span><span class="nv">$TEST_DIR</span><span class="s2">&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># Measure restore time (RTO test)</span> </span></span><span class="line"><span class="cl"><span class="nv">START_TIME</span><span class="o">=</span><span class="k">$(</span>date +%s<span class="k">)</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;Starting restore...&#34;</span> </span></span><span class="line"><span class="cl">geode restore <span class="se">\ </span></span></span><span class="line"><span class="cl"><span class="se"></span> --source <span class="nv">$BACKUP_SOURCE</span> <span class="se">\ </span></span></span><span class="line"><span class="cl"><span class="se"></span> --backup-id <span class="nv">$LATEST_BACKUP</span> <span class="se">\ </span></span></span><span class="line"><span class="cl"><span class="se"></span> --target <span class="s2">&#34;</span><span class="nv">$TEST_DIR</span><span class="s2">&#34;</span> &gt;&gt; <span class="s2">&#34;</span><span class="nv">$REPORT_FILE</span><span class="s2">&#34;</span> 2&gt;<span class="p">&amp;</span><span class="m">1</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="nv">END_TIME</span><span class="o">=</span><span class="k">$(</span>date +%s<span class="k">)</span> </span></span><span class="line"><span class="cl"><span class="nv">RTO_SECONDS</span><span class="o">=</span><span class="k">$((</span>END_TIME <span class="o">-</span> START_TIME<span 
class="k">))</span> </span></span><span class="line"><span class="cl"><span class="nv">RTO_MINUTES</span><span class="o">=</span><span class="k">$((</span>RTO_SECONDS <span class="o">/</span> <span class="m">60</span><span class="k">))</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;Restore completed in </span><span class="si">${</span><span class="nv">RTO_SECONDS</span><span class="si">}</span><span class="s2">s (</span><span class="si">${</span><span class="nv">RTO_MINUTES</span><span class="si">}</span><span class="s2">m)&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># Verify data integrity</span> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;Verifying data integrity...&#34;</span> </span></span><span class="line"><span class="cl">geode verify --data-dir <span class="s2">&#34;</span><span class="nv">$TEST_DIR</span><span class="s2">&#34;</span> &gt;&gt; <span class="s2">&#34;</span><span class="nv">$REPORT_FILE</span><span class="s2">&#34;</span> 2&gt;<span class="p">&amp;</span><span class="m">1</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># Start test server</span> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;Starting test server...&#34;</span> </span></span><span class="line"><span class="cl">geode serve <span class="se">\ </span></span></span><span class="line"><span class="cl"><span class="se"></span> --data-dir <span class="s2">&#34;</span><span class="nv">$TEST_DIR</span><span class="s2">&#34;</span> <span class="se">\ </span></span></span><span class="line"><span class="cl"><span class="se"></span> --listen 127.0.0.1:3142 <span class="se">\ </span></span></span><span class="line"><span class="cl"><span class="se"></span> --config-only <span 
class="p">&amp;</span> </span></span><span class="line"><span class="cl"><span class="nv">SERVER_PID</span><span class="o">=</span><span class="nv">$!</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl">sleep <span class="m">10</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># Run validation queries</span> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;Running validation queries...&#34;</span> </span></span><span class="line"><span class="cl"><span class="nv">NODE_COUNT</span><span class="o">=</span><span class="k">$(</span>geode query <span class="s2">&#34;MATCH (n) RETURN count(n) as count&#34;</span> <span class="se">\ </span></span></span><span class="line"><span class="cl"><span class="se"></span> --server 127.0.0.1:3142 --format json <span class="p">|</span> jq -r <span class="s1">&#39;.rows[0].count&#39;</span><span class="k">)</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;Node count: </span><span class="nv">$NODE_COUNT</span><span class="s2">&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># Stop test server and wait for it to exit before deleting its data directory</span> </span></span><span class="line"><span class="cl"><span class="nb">kill</span> <span class="s2">&#34;</span><span class="nv">$SERVER_PID</span><span class="s2">&#34;</span> 2&gt;/dev/null </span></span><span class="line"><span class="cl"><span class="nb">wait</span> <span class="s2">&#34;</span><span class="nv">$SERVER_PID</span><span class="s2">&#34;</span> 2&gt;/dev/null <span class="o">||</span> <span class="nb">true</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># Cleanup</span> </span></span><span class="line"><span class="cl">rm -rf <span class="s2">&#34;</span><span class="nv">$TEST_DIR</span><span class="s2">&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># Generate report</span> </span></span><span class="line"><span class="cl">log <span 
class="s2">&#34;=== DR Test Summary ===&#34;</span> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;Backup ID: </span><span class="nv">$LATEST_BACKUP</span><span class="s2">&#34;</span> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;RTO: </span><span class="si">${</span><span class="nv">RTO_SECONDS</span><span class="si">}</span><span class="s2">s (target: 300s / 5 minutes)&#34;</span> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;RTO Status: </span><span class="k">$(</span><span class="o">[</span> <span class="nv">$RTO_SECONDS</span> -le <span class="m">300</span> <span class="o">]</span> <span class="o">&amp;&amp;</span> <span class="nb">echo</span> <span class="s1">&#39;PASS&#39;</span> <span class="o">||</span> <span class="nb">echo</span> <span class="s1">&#39;FAIL&#39;</span><span class="k">)</span><span class="s2">&#34;</span> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;Data Integrity: VERIFIED&#34;</span> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;Node Count: </span><span class="nv">$NODE_COUNT</span><span class="s2">&#34;</span> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;Test Status: SUCCESS&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># Send report</span> </span></span><span class="line"><span class="cl">cat <span class="s2">&#34;</span><span class="nv">$REPORT_FILE</span><span class="s2">&#34;</span> <span class="p">|</span> mail -s <span class="s2">&#34;Geode DR Test Report - </span><span class="k">$(</span>date +%Y-%m-%d<span class="k">)</span><span class="s2">&#34;</span> [email protected] </span></span></code></pre></div> <h4 id="quarterly-full-dr-drill" class="position-relative d-flex align-items-center group"> <span>Quarterly Full DR Drill</span> </h4><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="cp">#!/bin/bash </span></span></span><span class="line"><span class="cl"><span class="cp"></span><span class="c1"># dr-drill-quarterly.sh</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># This script performs a full DR drill including:</span> </span></span><span class="line"><span class="cl"><span class="c1"># 1. Simulated primary failure</span> </span></span><span class="line"><span class="cl"><span class="c1"># 2. DR site promotion</span> </span></span><span class="line"><span class="cl"><span class="c1"># 3. Application failover</span> </span></span><span class="line"><span class="cl"><span class="c1"># 4. Data validation</span> </span></span><span class="line"><span class="cl"><span class="c1"># 5. 
Failback</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="nv">DRILL_ID</span><span class="o">=</span><span class="s2">&#34;drill-</span><span class="k">$(</span>date +%Y%m%d-%H%M%S<span class="k">)</span><span class="s2">&#34;</span> </span></span><span class="line"><span class="cl"><span class="nv">LOG_FILE</span><span class="o">=</span><span class="s2">&#34;/var/log/geode/dr-drill-</span><span class="nv">$DRILL_ID</span><span class="s2">.log&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl">log<span class="o">()</span> <span class="o">{</span> </span></span><span class="line"><span class="cl"> <span class="nb">echo</span> <span class="s2">&#34;[</span><span class="k">$(</span>date +<span class="s1">&#39;%Y-%m-%d %H:%M:%S&#39;</span><span class="k">)</span><span class="s2">] </span><span class="nv">$*</span><span class="s2">&#34;</span> <span class="p">|</span> tee -a <span class="s2">&#34;</span><span class="nv">$LOG_FILE</span><span class="s2">&#34;</span> </span></span><span class="line"><span class="cl"><span class="o">}</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;=== Quarterly DR Drill: </span><span class="nv">$DRILL_ID</span><span class="s2"> ===&#34;</span> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;This drill will:&#34;</span> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;1. Put primary in read-only mode&#34;</span> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;2. Promote DR site&#34;</span> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;3. Run validation tests&#34;</span> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;4. 
Fail back to primary&#34;</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="nb">read</span> -p <span class="s2">&#34;Proceed with DR drill? (yes/no): &#34;</span> CONFIRM </span></span><span class="line"><span class="cl"><span class="o">[</span> <span class="s2">&#34;</span><span class="nv">$CONFIRM</span><span class="s2">&#34;</span> !<span class="o">=</span> <span class="s2">&#34;yes&#34;</span> <span class="o">]</span> <span class="o">&amp;&amp;</span> <span class="nb">exit</span> <span class="m">1</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># Phase 1: Simulate failure</span> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;Phase 1: Simulating primary failure...&#34;</span> </span></span><span class="line"><span class="cl"><span class="c1"># ... implementation</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># Phase 2: Promote DR</span> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;Phase 2: Promoting DR site...&#34;</span> </span></span><span class="line"><span class="cl"><span class="c1"># ... implementation</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># Phase 3: Validate</span> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;Phase 3: Running validation...&#34;</span> </span></span><span class="line"><span class="cl"><span class="c1"># ... 
implementation</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="c1"># Phase 4: Failback</span> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;Phase 4: Failing back to primary...&#34;</span> </span></span><span class="line"><span class="cl"><span class="c1"># ... implementation</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl">log <span class="s2">&#34;=== DR Drill Complete ===&#34;</span> </span></span></code></pre></div>
<h3 id="runbooks" class="position-relative d-flex align-items-center group"> <span>Runbooks</span> </h3>
<h4 id="runbook-primary-server-failure" class="position-relative d-flex align-items-center group"> <span>Runbook: Primary Server Failure</span> </h4>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-markdown" data-lang="markdown"><span class="line"><span class="cl"><span class="gh"># Runbook: Primary Server Failure</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="gu">## Symptoms</span>
</span></span><span class="line"><span class="cl"><span class="k">-</span> Primary server unreachable
</span></span><span class="line"><span class="cl"><span class="k">-</span> Health checks failing
</span></span><span class="line"><span class="cl"><span class="k">-</span> Client connection errors
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="gu">## Immediate Actions</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">1.</span> <span class="gs">**Verify failure**</span>
</span></span><span class="line"><span class="cl">   ```bash
</span></span><span class="line"><span class="cl">   ping geode-primary.example.com
</span></span><span class="line"><span class="cl">   geode admin status --host geode-primary.example.com
</span></span><span class="line"><span class="cl">   ```
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">2.</span> <span class="gs">**Check replica status**</span>
</span></span><span class="line"><span class="cl">   ```bash
</span></span><span class="line"><span class="cl">   geode admin status --host geode-replica-1.example.com
</span></span><span class="line"><span class="cl">   geode admin status --host geode-replica-2.example.com
</span></span><span class="line"><span class="cl">   ```
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">3.</span> <span class="gs">**Automatic failover should occur**</span>
</span></span><span class="line"><span class="cl">   - If auto-failover enabled, new primary elected within 30s
</span></span><span class="line"><span class="cl">   - Verify: `geode admin cluster-status`
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">4.</span> <span class="gs">**If auto-failover fails**</span>
</span></span><span class="line"><span class="cl">   ```bash
</span></span><span class="line"><span class="cl">   geode admin promote --host geode-replica-1.example.com
</span></span><span class="line"><span class="cl">   ```
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">5.</span> <span class="gs">**Update monitoring**</span>
</span></span><span class="line"><span class="cl">   - Acknowledge alert
</span></span><span class="line"><span class="cl">   - Create incident ticket
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="gu">## Recovery</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">1.</span> Investigate root cause
</span></span><span class="line"><span class="cl"><span class="k">2.</span> Repair/replace failed server
</span></span><span class="line"><span class="cl"><span class="k">3.</span> Rejoin as replica
</span></span><span class="line"><span class="cl"><span class="k">4.</span> Conduct post-incident review
</span></span></code></pre></div>
<h4 id="runbook-data-corruption" class="position-relative d-flex align-items-center group"> <span>Runbook: Data Corruption</span> </h4>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-markdown" data-lang="markdown"><span class="line"><span class="cl"><span class="gh"># Runbook: Data Corruption Detected</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="gu">## Symptoms</span>
</span></span><span class="line"><span class="cl"><span class="k">-</span> Query errors: &#34;checksum mismatch&#34;
</span></span><span class="line"><span class="cl"><span class="k">-</span> Unexpected query results
</span></span><span class="line"><span class="cl"><span class="k">-</span> Verification failures
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="gu">## Immediate Actions</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">1.</span> <span class="gs">**Stop writes**</span>
</span></span><span class="line"><span class="cl">   ```bash
</span></span><span class="line"><span class="cl">   geode admin read-only
</span></span><span class="line"><span class="cl">   ```
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">2.</span> <span class="gs">**Identify corruption scope**</span>
</span></span><span class="line"><span class="cl">   ```bash
</span></span><span class="line"><span class="cl">   geode verify --data-dir /var/lib/geode/data --verbose
</span></span><span class="line"><span class="cl">   ```
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">3.</span> <span class="gs">**Check backup status**</span>
</span></span><span class="line"><span class="cl">   ```bash
</span></span><span class="line"><span class="cl">   geode backup --list --dest s3://geode-backups
</span></span><span class="line"><span class="cl">   ```
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">4.</span> <span class="gs">**Determine recovery point**</span>
</span></span><span class="line"><span class="cl">   - Last known good backup
</span></span><span class="line"><span class="cl">   - Or PITR to before corruption
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">5.</span> <span class="gs">**Perform recovery**</span>
</span></span><span class="line"><span class="cl">   ```bash
</span></span><span class="line"><span class="cl">   ./pitr-recovery.sh &#34;2026-01-28 09:00:00&#34;
</span></span><span class="line"><span class="cl">   ```
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="gu">## Post-Recovery</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">1.</span> Validate data integrity
</span></span><span class="line"><span class="cl"><span class="k">2.</span> Resume writes
</span></span><span class="line"><span class="cl"><span class="k">3.</span> Investigate root cause
</span></span><span class="line"><span class="cl"><span class="k">4.</span> Review and enhance monitoring
</span></span></code></pre></div>
<h3 id="best-practices" class="position-relative d-flex align-items-center group"> <span>Best Practices</span> </h3>
<h4 class="position-relative d-flex align-items-center group"> <span>DR Planning</span> </h4>
<ol>
<li><strong>Define RTO/RPO</strong>: Match business requirements</li>
<li><strong>Document procedures</strong>: Detailed runbooks</li>
<li><strong>Automate where possible</strong>: Reduce human error</li>
<li><strong>Regular testing</strong>: Monthly tests, quarterly drills</li>
<li><strong>Update procedures</strong>: After every change</li>
</ol>
<h4 class="position-relative d-flex align-items-center group"> <span>Replication</span> </h4>
<ol>
<li><strong>Use sync replication for zero RPO</strong>: Within region</li>
<li><strong>Use async for cross-region</strong>: Accept lag tradeoff</li>
<li><strong>Monitor replication lag</strong>: Alert on threshold</li>
<li><strong>Test failover regularly</strong>: Validate automation</li>
<li><strong>Consider network latency</strong>: For cross-region</li>
</ol>
<h4 class="position-relative d-flex align-items-center group"> <span>Backup Strategy</span> </h4>
<ol>
<li><strong>3-2-1 rule</strong>: 3 copies, 2 media, 1 offsite</li>
<li><strong>Automate backups</strong>: No manual intervention</li>
<li><strong>Verify backups</strong>: Regular integrity checks</li>
<li><strong>Test restores</strong>: Monthly at minimum</li>
<li><strong>Encrypt backups</strong>: At rest and in transit</li>
</ol>
<h4 class="position-relative d-flex align-items-center group"> <span>Documentation</span> </h4>
<ol>
<li><strong>Maintain runbooks</strong>: Step-by-step procedures</li>
<li><strong>Include contact info</strong>: Escalation paths</li>
<li><strong>Version control</strong>: Track changes</li>
<li><strong>Regular review</strong>: Update quarterly</li>
<li><strong>Accessible offline</strong>: DR docs available during outage</li>
</ol>
<h3 id="related-documentation" class="position-relative d-flex align-items-center group"> <span>Related Documentation</span> </h3>
<ul>
<li><strong><a href="/docs/operations/backup/">Backup Procedures</a></strong> - Backup configuration and procedures</li>
<li><strong><a href="/docs/operations/monitoring/">Monitoring</a></strong> - DR-related monitoring</li>
<li><strong><a href="/docs/guides/multi-datacenter/">Multi-Datacenter Guide</a></strong> - Multi-DC deployment</li>
<li><strong><a href="/docs/architecture/distributed-architecture/">High Availability</a></strong> - HA architecture</li>
</ul>