# Backup, RPO, and RTO Runbook

Status: operational runbook.

## Target

- Pilot RPO: 5 minutes when Litestream replication is configured and healthy.
- Pilot RTO: 4 hours, measured from when an operator has Railway and bucket access.

## Runtime Configuration

- `LITESTREAM_REPLICA_URL`: required for active SQLite replication. Production uses `gs://...` or `s3://...`.
- `GCS_KEY_JSON` or `GOOGLE_APPLICATION_CREDENTIALS`: required for GCS replicas.
- `DATABASE_URL`: path to the production SQLite database file, currently under `/data`.
- `BACKUP_MAX_LAG_SECONDS`: maximum accepted lag between the latest database write and the latest Litestream replica object. The default is 300 seconds (5 minutes), equal to the pilot RPO.
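
As an illustration of how these variables fit together, the sketch below validates the configuration and evaluates replica lag against `BACKUP_MAX_LAG_SECONDS`. It is a minimal example under stated assumptions, not the logic in `scripts/ops/backup-control.mjs`: the function names and the two timestamp inputs are hypothetical, and how the real script obtains those timestamps is not shown here.

```javascript
// Sketch only: helper names are hypothetical; the real checks may differ.
const DEFAULT_MAX_LAG_SECONDS = 300; // documented default for BACKUP_MAX_LAG_SECONDS

// Returns a list of configuration problems; an empty list means the
// variables described above look usable for a production replica.
function validateBackupConfig(env) {
  const problems = [];
  const replicaUrl = env.LITESTREAM_REPLICA_URL || "";
  if (!replicaUrl) {
    problems.push("LITESTREAM_REPLICA_URL is not set");
  } else if (!/^(gs|s3):\/\//.test(replicaUrl)) {
    problems.push("LITESTREAM_REPLICA_URL should start with gs:// or s3:// in production");
  }
  const hasGcsCredentials = Boolean(env.GCS_KEY_JSON || env.GOOGLE_APPLICATION_CREDENTIALS);
  if (replicaUrl.startsWith("gs://") && !hasGcsCredentials) {
    problems.push("GCS replica needs GCS_KEY_JSON or GOOGLE_APPLICATION_CREDENTIALS");
  }
  if (!env.DATABASE_URL) {
    problems.push("DATABASE_URL is not set");
  }
  return problems;
}

// Parses BACKUP_MAX_LAG_SECONDS, falling back to the default when unset or invalid.
function maxLagSeconds(env) {
  const parsed = Number.parseInt(env.BACKUP_MAX_LAG_SECONDS ?? "", 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_MAX_LAG_SECONDS;
}

// Lag is how far the newest replica object trails the newest database write.
// Both timestamps are Date objects supplied by the caller (assumed inputs).
function checkReplicaLag(lastWriteAt, lastReplicaObjectAt, env = {}) {
  const deltaMs = lastWriteAt.getTime() - lastReplicaObjectAt.getTime();
  const lagSeconds = Math.max(0, deltaMs / 1000);
  const limit = maxLagSeconds(env);
  return { lagSeconds, limit, ok: lagSeconds <= limit };
}
```

A caller could run `validateBackupConfig(process.env)` first and only evaluate lag when the returned list is empty, so a missing variable is reported as a configuration problem rather than as a lag failure.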

## Operating Control

- Hourly monitor: GitHub Actions workflow `.github/workflows/backup-control.yml` checks `GET /health/detailed`, verifies backup replica status, and scans recent Railway logs for Litestream errors.
- Weekly restore drill: the same workflow restores the production replica to a disposable SQLite file every Monday at 09:23 UTC.
- Evidence: successful restore drills append a dated entry to `tests/product-validation/backup-restore-drills.md` and commit it to `main`.
- Alerting: a failed check fails the GitHub Actions run; when the `OPS_ALERT_WEBHOOK_URL` secret is configured, the workflow can also notify that webhook.
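
The hourly checks described above could be combined roughly as follows. This is a hedged sketch, not the workflow's actual code: the function names are hypothetical, the only response field taken from this runbook is `litestream_replication_configured`, and the error keywords are an assumption rather than Litestream's documented log format.

```javascript
// Sketch only: helper names and the keyword list are assumptions.

// Returns true when the parsed /health/detailed body reports replication as configured.
function replicationConfigured(healthBody) {
  return Boolean(healthBody) && healthBody.litestream_replication_configured === true;
}

// Returns the log lines that mention Litestream together with an error-like keyword.
function findLitestreamErrors(logLines) {
  const errorPattern = /\b(error|failed|fatal|panic)\b/i;
  return logLines.filter((line) => /litestream/i.test(line) && errorPattern.test(line));
}

// Combines both checks into one result that a workflow step can act on.
function evaluateBackupControl(healthBody, logLines) {
  const failures = [];
  if (!replicationConfigured(healthBody)) {
    failures.push("litestream replication is not reported as configured");
  }
  const errorLines = findLitestreamErrors(logLines);
  if (errorLines.length > 0) {
    failures.push(`${errorLines.length} Litestream error line(s) in recent logs`);
  }
  return { ok: failures.length === 0, failures };
}
```

A step that exits non-zero when `ok` is false would fail the run, which is what triggers the alerting path. For reference, a Monday 09:23 UTC schedule in GitHub Actions cron syntax would be written as `23 9 * * 1`.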

## Restore Drill

1. Confirm `GET /health/detailed` reports `litestream_replication_configured: true`.
2. Prefer the scheduled workflow. To run the drill manually, use `node scripts/ops/backup-control.mjs restore` with the production Railway variables set.
3. Restore to a disposable file, never over the live database:
   `litestream restore -o /tmp/annotate-restore-drill.db gs://BUCKET/litestream/annotate.db`
4. Validate the schema and the row counts of key tables (the query fails if a table is missing):
   `sqlite3 /tmp/annotate-restore-drill.db "SELECT COUNT(*) FROM users; SELECT COUNT(*) FROM projects; SELECT COUNT(*) FROM reports;"`
5. Record the drill date, source replica URL, restore duration, and row-count evidence in `tests/product-validation/backup-restore-drills.md`.
6. Delete the disposable restore file.
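
The manual steps above can be sketched as two small helpers: one builds the `litestream restore -o <output> <replica-url>` argument list and refuses to target the live database, and one formats the evidence entry from step 5. The helper names and the entry layout are hypothetical; the format actually used in `tests/product-validation/backup-restore-drills.md` may differ.

```javascript
// Sketch only: helper names and the entry layout are assumptions.

// Builds the argument list for `litestream restore -o <output> <replica-url>`.
// Throws when the output path equals the live database path (simple string
// comparison; it does not resolve symlinks or relative paths).
function restoreArgs(replicaUrl, outputPath, liveDatabasePath) {
  if (outputPath === liveDatabasePath) {
    throw new Error("refusing to restore over the live database");
  }
  return ["restore", "-o", outputPath, replicaUrl];
}

// Formats one evidence entry with the fields listed in step 5:
// drill date, source replica URL, restore duration, and row counts.
function formatDrillEntry({ date, replicaUrl, durationSeconds, rowCounts }) {
  const counts = Object.entries(rowCounts)
    .map(([table, count]) => `${table}=${count}`)
    .join(", ");
  return `- ${date}: restored ${replicaUrl} in ${durationSeconds}s; row counts: ${counts}`;
}
```

The array returned by `restoreArgs` is meant to be passed to a process launcher without a shell, so the replica URL and output path are never interpolated into a command string.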

References: Litestream restore reference, https://litestream.io/reference/restore/
