I’ve been using GDA (S3 Glacier Deep Archive) with restic for almost a year. For my workload, this approach dramatically reduced my storage costs, but I recognize that a restore will be more expensive and that some workloads might not achieve the same savings.
This is my approach:
- I use restic to back up to a locally hosted repository every night.
- After the backup is complete, I use `aws s3 sync` to sync the files to an S3 bucket. I use the `--delete` flag to remove files from the bucket if they disappear from my local repo. This is very efficient from a bandwidth and cost perspective.
- The S3 bucket has lifecycle rules that move files in the `data/` directory to GDA after 10 days. This delay reduces GDA accesses and deletions for files that live in my repo only briefly. (The migration does not impact future sync operations, nor do those operations trigger restores of GDA data.)
- I prune and check my local repo weekly. These actions are automatically synced to the cloud.
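For anyone who wants to try something similar, here is a rough sketch of how the steps above could be scripted. It is not my exact setup: the repository path, bucket name, source directory, rule ID, and the `DRY_RUN` switch are all placeholders I made up for illustration, and it assumes the restic password is supplied through the environment (for example via `RESTIC_PASSWORD_FILE`). By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the nightly/weekly workflow. Every path and name below is a
# hypothetical placeholder; adjust them to your own environment.
set -eu

REPO="${REPO:-/srv/restic-repo}"          # local restic repository (assumed path)
BUCKET="${BUCKET:-example-restic-bucket}" # S3 bucket name (assumed)
SOURCE="${SOURCE:-/home}"                 # directory to back up (assumed)
DRY_RUN="${DRY_RUN:-1}"                   # 1 = print commands, 0 = execute them

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Nightly: back up to the local repo, then mirror the repo to S3.
# --delete removes bucket objects that no longer exist locally.
nightly() {
    run restic -r "$REPO" backup "$SOURCE"
    run aws s3 sync "$REPO" "s3://$BUCKET" --delete
}

# Weekly: prune and check the local repo only; the next nightly sync
# carries the result to the bucket.
weekly() {
    run restic -r "$REPO" prune
    run restic -r "$REPO" check
}

# Lifecycle rule: move objects under data/ to Deep Archive after 10 days.
lifecycle_json() {
    cat <<'EOF'
{
  "Rules": [
    {
      "ID": "restic-data-to-deep-archive",
      "Status": "Enabled",
      "Filter": { "Prefix": "data/" },
      "Transitions": [ { "Days": 10, "StorageClass": "DEEP_ARCHIVE" } ]
    }
  ]
}
EOF
}

# Dispatch on the first argument; with no argument nothing is run.
case "${1:-}" in
    nightly)   nightly ;;
    weekly)    weekly ;;
    lifecycle) lifecycle_json ;;
esac
```

Saved under a name of your choosing, the script would be called from cron as `nightly` every night and `weekly` once a week, with `DRY_RUN=0` set once the printed commands look right. The JSON printed by `lifecycle` can be written to a file and applied once with `aws s3api put-bucket-lifecycle-configuration --bucket <bucket> --lifecycle-configuration file://lifecycle.json`. Only `data/` is transitioned, so the rest of the repository stays in the bucket's regular storage class.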
Pros of this approach:
- Very low cost cloud storage
- Very reliable backup process
- Very high data durability (with copies both locally and in the cloud)
- Prunes and restores are fast and cheap because they are performed against a local repo
Cons:
- Requires local storage devices large enough to hold the repository
- If the local repository is lost, restoring from GDA will be expensive
- If the local repository is corrupted (e.g., by ransomware or a restic bug), the corruption will be synced to the cloud