Restic-aware "copy" mechanism

#1

My backup process for some of my systems looks like this:

  • Take hourly backups to an on-site store.
  • Prune daily, keeping 48 hourly backups, and some number of daily/weekly/monthly backups.

So far, so good. But I want to keep an off-site copy, and in particular I don’t want the on-site systems to be able to push to that off-site copy, for security reasons. I implement this by running rclone copy --immutable on the off-site host, which pulls all data from the on-site copy but does not allow any existing data to be changed or deleted.

This works, but not well. The off-site repository is well-formed (after pruning to remove duplicate objects), but after some time the two repositories have almost no packs in common, since they don’t prune at the same times or use the same retention policies. Eventually, every time the off-site copy pulls from the on-site copy, it has to copy everything, only for the subsequent prune to deduplicate nearly everything that was copied. Since my on-site repository is in AWS, this is becoming expensive.
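
To illustrate why a file-level copy degrades like this, here is a toy Python model. It is not restic's code: `pack_id` and `make_packs` are made-up helpers, and different pack sizes simply stand in for two independent prune runs. The only assumption carried over from restic is that a pack's identity depends on its contents.

```python
import hashlib


def pack_id(blobs):
    """Toy pack ID: a hash of the blob IDs stored in the pack (hypothetical)."""
    return hashlib.sha256("".join(blobs).encode()).hexdigest()[:12]


def make_packs(blobs, size):
    """Group blob IDs into packs of at most `size` blobs each."""
    chunks = [blobs[i:i + size] for i in range(0, len(blobs), size)]
    return {pack_id(chunk): chunk for chunk in chunks}


blobs = [f"blob{i:02d}" for i in range(12)]

# Both repositories hold the same blobs, but each one repacked them
# differently (different pack sizes stand in for independent prunes).
on_site = make_packs(blobs, 3)
off_site = make_packs(blobs, 4)

shared_packs = set(on_site) & set(off_site)
on_site_blobs = {b for chunk in on_site.values() for b in chunk}
off_site_blobs = {b for chunk in off_site.values() for b in chunk}
missing_blobs = on_site_blobs - off_site_blobs

print("packs in common:", len(shared_packs))
print("blobs missing off-site:", len(missing_blobs))
```

In this model the two repositories have no pack IDs in common even though no blob is missing off-site, so a tool that compares files has to transfer every pack again.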

It would be useful if restic had some operation “from this remote repository, take some set of snapshots and add the necessary data to this repository.” This operation should, of course, deduplicate with the local repository, effectively only transferring new blobs. The set of snapshots to copy could be explicitly specified as IDs, or restic could offer switches that mean “copy all snapshots,” “copy snapshots newer than X,” or “copy snapshots created in the last N hours.” (Obviously, snapshots that already exist in the local repository don’t have to be copied and can be ignored.)
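
A minimal sketch of the operation I have in mind, in Python. The repository layout here (plain dicts with `snapshots` and `blobs` keys) and the name `copy_snapshots` are hypothetical and only serve the illustration; restic's real data structures are more involved.

```python
def copy_snapshots(src, dst, snapshot_ids):
    """Copy snapshots from src to dst, transferring only blobs dst lacks.

    Repositories are modeled as dicts (hypothetical layout):
      {"snapshots": {snap_id: [blob_id, ...]}, "blobs": {blob_id: bytes}}
    Returns the number of blobs actually transferred.
    """
    transferred = 0
    for snap_id in snapshot_ids:
        if snap_id in dst["snapshots"]:
            continue  # snapshot already exists in dst, ignore it
        blob_ids = src["snapshots"][snap_id]
        for blob_id in blob_ids:
            if blob_id not in dst["blobs"]:
                # Only blobs that dst does not have yet are transferred.
                dst["blobs"][blob_id] = src["blobs"][blob_id]
                transferred += 1
        dst["snapshots"][snap_id] = list(blob_ids)
    return transferred


remote = {
    "snapshots": {"s1": ["a", "b"], "s2": ["b", "c"]},
    "blobs": {"a": b"A", "b": b"B", "c": b"C"},
}
local = {
    "snapshots": {"s1": ["a", "b"]},
    "blobs": {"a": b"A", "b": b"B"},
}

count = copy_snapshots(remote, local, ["s1", "s2"])
print("blobs transferred:", count)
```

Here snapshot `s1` is skipped because it already exists locally, and of the two blobs in `s2` only `c` is transferred, since `b` is already present.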

One potential problem that needs to be solved is how to supply a second repository password. Of course, restic could first try the local repository password on the remote repository under the assumption that they were originally copies of each other, or that the administrator sees no need to use different passwords for the same data.
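
The fallback could look roughly like this Python sketch. Everything in it is hypothetical: `try_open` stands for whatever opens a repository and returns `None` on a wrong password, and `ask_password` stands for an interactive prompt or a second password option.

```python
def open_with_fallback(try_open, local_password, ask_password):
    """Try the local password first; ask for another one only if it fails."""
    repo = try_open(local_password)
    if repo is not None:
        return repo
    # The local password did not work, so fall back to asking for one.
    return try_open(ask_password())


# Stand-ins for demonstration only.
prompts = []


def fake_try_open(password):
    """Pretend repository that only accepts 'remote-secret'."""
    return "remote-repo" if password == "remote-secret" else None


def fake_ask_password():
    """Pretend prompt that records each time it is used."""
    prompts.append("asked")
    return "remote-secret"


same_password = open_with_fallback(fake_try_open, "remote-secret", fake_ask_password)
prompts_after_first = len(prompts)
different_password = open_with_fallback(fake_try_open, "local-secret", fake_ask_password)
```

With this ordering, the prompt only appears when the two repositories really do use different passwords.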

This is the last piece that would make my setup perfect.

#2

What you are asking for has partly been discussed in #323, with the difference that in your scenario restic needs to pull the backups.

IMHO, the fact that restic makes pull backups so difficult has always been one of the biggest drawbacks of restic.

#3

Looks like someone was working on a branch that implements exactly this feature. Maybe I’ll take a look and see if it still applies to master.

Note that this isn’t a “pull backup,” it’s just importing data from one repository to another in a more efficient way than rclone can.

#4

If I understood correctly, your on-site restic doesn’t have access to the off-site data, which means your off-site restic needs to pull the data from your on-site repository. Correct? While it technically might not be a backup, it still needs to pull the data somehow. And as far as I know, restic was built around pushing data.

Hope this explains what I wanted to say 🙃

#5

Right, but restic doesn’t even have an operation for this in the reverse direction (send snapshot X from my local repository to a remote repository). Transfer between repositories is what this request is about, and it isn’t currently supported in either direction.
