# Disaster Recovery Guide
## Overview
Complete guide for recovering your Copilot CLI+ environment after losing the primary host.
## Recovery Scenarios
### Scenario 1: Complete Host Loss
**Situation:** The primary host is down and you need to restore on a new system.
**Recovery Steps:**
```bash
# 1. Get the backup (from secure storage, git, S3, etc)
git clone <your-backup-repo>
cd copilot-backup
# 2. Get API key from secure storage
# (1Password, Vault, encrypted file, etc)
export DEEPSEEK_API_KEY="sk-your-key"
# 3. Run recovery on new system
bash RECOVERY.sh "$DEEPSEEK_API_KEY"
# 4. Verify
curl http://localhost:8888/health
```
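The service may take a few seconds to come up after `RECOVERY.sh` finishes, so a single `curl` can fail even when the recovery worked. The following is a minimal sketch of a generic retry helper; the name `retry` and its argument order are illustrative and not part of the recovery scripts.

```bash
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for (( i = 1; i <= attempts; i++ )); do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if (( i < attempts )); then
      sleep "$delay"
    fi
  done
  echo "retry: '$*' still failing after ${attempts} attempts" >&2
  return 1
}
```

For example, `retry 30 2 curl -fsS http://localhost:8888/health` would poll the health endpoint for up to about a minute. The `-f` flag makes `curl` return a non-zero status on HTTP errors, which is what lets the loop detect a failing endpoint.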
### Scenario 2: Quick Migration
**Situation:** Moving to a different server/cloud provider.
**Recovery Steps:**
```bash
# 1. On old system, create backup
bash /opt/local-agent/backup.sh
# Follow prompts to upload
# 2. On new system
ssh root@newhost "bash backup/RECOVERY.sh 'sk-your-key'"
# 3. Point DNS/references to new host
```
### Scenario 3: Multi-Machine Deployment
**Situation:** Need same setup on multiple machines.
**Deployment Steps:**
```bash
# 1. Create backup on first machine
bash /opt/local-agent/backup.sh
# 2. Push to your backup repo
git add copilot-backup && git commit -m "Update Copilot backup" && git push
# 3. Deploy to other machines
for host in host1 host2 host3; do
ssh "$host" << 'CMD'
git clone <your-backup-repo>
bash copilot-backup/RECOVERY.sh "sk-key"
CMD
done
```
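The loop above keeps going when one host fails and gives no summary at the end. One possible refinement, sketched here under the assumption that the per-host work is wrapped in a function of your own, is to collect the failed hosts and return a non-zero status. The names `deploy_to_hosts` and the deploy command passed to it are hypothetical.

```bash
# deploy_to_hosts DEPLOY_CMD HOST [HOST...]
# Calls DEPLOY_CMD once per host (with the host as its only argument)
# and reports which hosts failed. Returns 1 if any host failed.
deploy_to_hosts() {
  local deploy_cmd="$1"
  shift
  local host
  local -a failed=()
  for host in "$@"; do
    if "$deploy_cmd" "$host"; then
      echo "ok: $host"
    else
      failed+=("$host")
    fi
  done
  if (( ${#failed[@]} > 0 )); then
    echo "failed: ${failed[*]}" >&2
    return 1
  fi
}
```

A caller would define a small function that runs the `ssh` step for one host and then invoke `deploy_to_hosts that_function host1 host2 host3`. Keeping the per-host step separate also makes it possible to dry-run the loop with a stub before touching real machines.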
## What to Back Up
**Essential (Must Have):**
- `bootstrap.sh` - Complete setup automation
- Custom configurations (if any)
- API key (stored separately)
**Important (Should Have):**
- Custom Ansible playbooks
- Documentation
- Systemd service configurations
**Optional:**
- Logs (can be regenerated)
- Downloaded models (can be re-pulled)
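The essentials list can be checked mechanically before a backup is uploaded. This is a minimal sketch, assuming `bootstrap.sh` and `RECOVERY.sh` sit at the top level of the backup directory; that layout and the function name `check_backup_contents` are assumptions, so adjust the file list to match what your backup actually contains.

```bash
# check_backup_contents BACKUP_DIR
# Returns non-zero if an essential file is missing from the backup.
check_backup_contents() {
  local dir="$1"
  # Assumed essentials: bootstrap.sh is named in the list above and
  # RECOVERY.sh is the script the recovery scenarios call.
  local -a required=("bootstrap.sh" "RECOVERY.sh")
  local name missing=0
  for name in "${required[@]}"; do
    if [[ ! -f "$dir/$name" ]]; then
      echo "missing: $dir/$name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_backup_contents copilot-backup` as the last step of a backup job would flag an incomplete backup at creation time instead of during a recovery.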
## Secure API Key Storage
### Option 1: Encrypted File
```bash
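# Encrypt (prompts for a passphrase; -pbkdf2 needs OpenSSL 1.1.1 or newer)
openssl enc -aes-256-cbc -pbkdf2 -salt -in deepseek-key.txt -out deepseek-key.enc
# Store encrypted version in repo
# Keep passphrase elsewhere
# Decrypt during recovery (must use the same options as encryption)
openssl enc -d -aes-256-cbc -pbkdf2 -in deepseek-key.enc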
```
### Option 2: 1Password / LastPass / Vault
```bash
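# Retrieve before recovery (secret references use op://<vault>/<item>/<field>;
# the "credential" field name is an assumption, use your item's actual field)
op read "op://vault/deepseek-api-key/credential"
# Use in recovery
bash RECOVERY.sh "$(op read "op://vault/deepseek-api-key/credential")"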
```
### Option 3: Environment Secret (GitHub/GitLab)
```bash
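# Add to Actions secrets (repository settings, not a file in the repo)
# DEEPSEEK_API_KEY=sk-xxx
# Retrieve in CI/CD: this line belongs in a GitHub Actions `run:` step, where
# ${{ ... }} is expanded before the shell runs; it is not valid plain bash
bash RECOVERY.sh "${{ secrets.DEEPSEEK_API_KEY }}"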
```
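Whichever storage option supplies the key, it may help to confirm that the key actually arrived before starting a recovery. The sketch below assumes the key is exported as `DEEPSEEK_API_KEY` and begins with `sk-`, as in the examples in this guide; the prefix check is an assumption drawn from those examples, not a documented rule of the provider, and `require_api_key` is a hypothetical helper name.

```bash
# require_api_key
# Succeeds only if DEEPSEEK_API_KEY is set and starts with "sk-".
# The key itself is never printed.
require_api_key() {
  local key="${DEEPSEEK_API_KEY:-}"
  if [[ -z "$key" ]]; then
    echo "DEEPSEEK_API_KEY is not set" >&2
    return 1
  fi
  if [[ "$key" != sk-* ]]; then
    echo "DEEPSEEK_API_KEY does not start with 'sk-'" >&2
    return 1
  fi
}
```

A recovery could then be started with `require_api_key && bash RECOVERY.sh "$DEEPSEEK_API_KEY"`, so that an empty secret (for example, an unset CI variable) stops the run early.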
## Automated Backup Strategy
### Setup Automated Backups
```bash
# 1. Create backup script
cat > /usr/local/bin/backup-copilot.sh << 'EOF'
#!/bin/bash
bash /opt/local-agent/backup.sh
# Then upload to git/S3
EOF
chmod +x /usr/local/bin/backup-copilot.sh
# 2. Schedule with cron (daily)
(crontab -l 2>/dev/null; echo "0 2 * * * /usr/local/bin/backup-copilot.sh") | crontab -
# 3. Or use systemd timer
cat > /etc/systemd/system/copilot-backup.service << EOF
[Unit]
Description=Copilot Backup
After=network.target
[Service]
Type=oneshot
ExecStart=/usr/local/bin/backup-copilot.sh
User=root
EOF
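# The timer activates copilot-backup.service by name; no Requires= is needed
# (Requires= would also start the backup every time the timer itself starts)
cat > /etc/systemd/system/copilot-backup.timer << EOF
[Unit]
Description=Daily Copilot Backup
[Timer]
OnCalendar=daily
OnBootSec=10m
[Install]
WantedBy=timers.target
EOF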
systemctl daemon-reload
systemctl enable --now copilot-backup.timer
```
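A scheduled backup that fails silently is easy to miss. The following sketch shows one way the wrapper script could record each run and trigger an alert on failure. The function `run_backup`, the optional hook `notify_failure`, and any log path you pass in are all hypothetical names introduced for illustration.

```bash
# run_backup LOG_FILE BACKUP_CMD [ARGS...]
# Runs the backup command, appends its output and a timestamped result to
# LOG_FILE, and calls notify_failure (if you have defined it) on failure.
run_backup() {
  local log_file="$1"
  shift
  local status=0
  "$@" >>"$log_file" 2>&1 || status=$?
  if (( status == 0 )); then
    echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) backup ok" >>"$log_file"
  else
    echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) backup FAILED (exit $status)" >>"$log_file"
    # notify_failure is a hypothetical hook: mail, webhook, pager, etc.
    if declare -F notify_failure >/dev/null; then
      notify_failure "$status"
    fi
  fi
  return "$status"
}
```

Inside `/usr/local/bin/backup-copilot.sh` this might be called as `run_backup /var/log/copilot-backup.log bash /opt/local-agent/backup.sh`. Because the function returns the backup's own exit status, cron mail or `systemctl status copilot-backup.service` will still show the failure.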
## Recovery Checklist
- [ ] Bootstrap script available
- [ ] Configuration files backed up
- [ ] API key stored securely (separately)
- [ ] Recovery script tested
- [ ] Backup stored in multiple locations
- [ ] Team knows where to find backup
- [ ] Tested recovery on staging system
## Test Recovery Periodically
```bash
# Once a month, test:
# 1. On staging system
bash RECOVERY.sh "sk-test-key"
# 2. Verify all endpoints
curl http://localhost:8888/health
curl http://localhost:8888/services
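curl -X POST http://localhost:8888/deepseek \
  -H 'Content-Type: application/json' \
  -d '{"query":"Test"}'
# 3. Test Copilot CLI integration
copilot
# Inside the interactive Copilot session, run /mcp and try a few commands
# 4. Clean up (stop the services before deleting their files)
systemctl stop local-agent-api ollama local-agent
rm -rf /opt/local-agent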
```
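For a monthly test it can be convenient to get a single pass/fail result instead of reading each `curl` response by hand. This is a rough sketch covering the GET endpoints only; `check_endpoints` and `probe_url` are hypothetical helper names, and the POST to `/deepseek` would still need its own check.

```bash
# probe_url URL
# Default probe: curl with -f so HTTP error statuses count as failures.
probe_url() {
  curl -fsS -o /dev/null "$1"
}

# check_endpoints BASE_URL PATH [PATH...]
# Probes each path, prints PASS or FAIL per path, and returns non-zero
# if any probe failed.
check_endpoints() {
  local base_url="$1"
  shift
  local path failures=0
  for path in "$@"; do
    if probe_url "${base_url}${path}"; then
      echo "PASS ${path}"
    else
      echo "FAIL ${path}"
      failures=$((failures + 1))
    fi
  done
  (( failures == 0 ))
}
```

On the staging system this could be run as `check_endpoints http://localhost:8888 /health /services`. Keeping the probe in its own function makes it easy to swap in a different check, such as one that also inspects the response body.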
## Quick Reference
### Backup
```bash
# Create backup
bash /opt/local-agent/backup.sh
# Compress and upload
cd copilot-backup-*
tar -czf ../backup.tar.gz .
scp ../backup.tar.gz backup-server:/secure/storage/
```
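An archive copied to remote storage is only useful if it arrives intact. As a sketch, assuming GNU `sha256sum` and `tar` are available, a checksum can be written next to the archive before upload and checked again before a restore. The function names are illustrative.

```bash
# write_checksum ARCHIVE
# Writes ARCHIVE.sha256 next to the archive, using a relative file name so
# the checksum file stays valid after both files are copied elsewhere.
write_checksum() {
  local archive="$1"
  local dir base
  dir=$(dirname "$archive")
  base=$(basename "$archive")
  ( cd "$dir" && sha256sum "$base" > "$base.sha256" )
}

# verify_archive ARCHIVE
# Checks the recorded SHA-256, then confirms tar can list the contents.
verify_archive() {
  local archive="$1"
  local dir base
  dir=$(dirname "$archive")
  base=$(basename "$archive")
  ( cd "$dir" && sha256sum -c --quiet "$base.sha256" ) || return 1
  tar -tzf "$archive" > /dev/null
}
```

In the backup flow above, `write_checksum ../backup.tar.gz` would run before the `scp`, with the `.sha256` file uploaded alongside the archive; `verify_archive backup.tar.gz` would then run on the recovery host before extracting anything.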
### Recovery
```bash
# Simple recovery
bash RECOVERY.sh "sk-your-key"
# With custom configs
bash RECOVERY.sh "sk-your-key"
tar -xzf custom-configs.tar.gz -C /opt/local-agent/config/
systemctl restart local-agent-api
```
### Verify
```bash
# After recovery
curl http://localhost:8888/health
systemctl status local-agent-api ollama ssh
tail /opt/local-agent/logs/api.log
```
## Best Practices
1. **Separate API Key from Backup**
- Never commit API key to version control
- Store in encrypted format or secrets manager
- Regenerate key if backup is compromised
2. **Test Recovery Regularly**
- Monthly test on staging system
- Document any issues
- Update procedures accordingly
3. **Multiple Backup Locations**
- Local storage
- Cloud storage (S3, Azure, etc)
- Git repository (without secrets)
- USB drive (for air-gapped recovery)
4. **Document Everything**
- Keep runbook updated
- Document any customizations
- Record API key location (securely)
5. **Automate Backup Process**
- Use cron or systemd timer
- Verify backups regularly
- Alert on backup failures
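The first practice, keeping the API key out of version control, can be partly enforced with a scan before each commit or upload. The sketch below assumes keys look like `sk-` followed by at least 16 letters or digits; that pattern is a guess based on the placeholder keys in this guide, so tune it to the real key format. `assert_no_keys` is a hypothetical helper name.

```bash
# assert_no_keys DIR
# Returns non-zero and lists the offending files if anything that looks
# like an API key is found under DIR (the .git directory is skipped).
assert_no_keys() {
  local dir="$1"
  local matches
  # Assumed key shape: "sk-" followed by 16 or more letters or digits.
  matches=$(grep -rlE --exclude-dir=.git 'sk-[A-Za-z0-9]{16,}' "$dir" || true)
  if [[ -n "$matches" ]]; then
    echo "possible API key found in:" >&2
    echo "$matches" >&2
    return 1
  fi
}
```

Running `assert_no_keys copilot-backup` before `git add` would stop the most obvious leaks. A pattern scan like this can miss keys in other formats, so it complements the practices above and does not replace them.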
## Emergency Recovery
If everything is lost and you need to recover NOW:
```bash
# 1. Access any machine with internet
wget https://raw.githubusercontent.com/<your-user>/<your-repo>/master/bootstrap.sh
chmod +x bootstrap.sh
# 2. Run with API key from your password manager
./bootstrap.sh "sk-from-password-manager"
# 3. Restore custom configs if available
# Download from cloud storage
tar -xzf custom-configs.tar.gz -C /opt/local-agent/
# 4. Restart and verify
systemctl restart local-agent-api
curl http://localhost:8888/health
```
## Recovery Time Estimates
| Scenario | Bootstrap | Tests | Custom Config | Total |
|----------|-----------|-------|---------------|-------|
| Clean Install | 2-3 min | 1 min | 1 min | **4-5 min** |
| From Backup | 2-3 min | 1 min | <1 min | **3-4 min** |
| Manual Steps | 15-20 min | 5 min | 10 min | **30-35 min** |
**Bootstrap is roughly 6-10x faster than manual setup (30-35 min versus 3-5 min).**
## Pro Tips
1. Keep bootstrap script in multiple places:
- Git repository
- S3 bucket
- Gist
- Portable drive
2. Test recovery with different API keys:
- Main key
- Backup key
- Revoked key (should fail gracefully)
3. Document any customizations:
- Custom Ansible playbooks
- Modified configs
- Non-standard ports
4. Have a rollback plan:
- Previous version of bootstrap
- Known-good configurations
- Database snapshots (if applicable)
---
**Prepared:** 2026-03-25
**Status:** Production Ready
**Next Review:** Quarterly