Restic repository advanced mirror between local and off-site copies

Hi,

I would like to run periodic backups/snapshots with specific tags and different retention periods for both local and 2 x off-site snapshots.

Tag        Local   Off-site
bi-daily   3       30
nightly    3       90
weekly     4       104
monthly    2       24

(The numbers above are the number of snapshots to keep for each tag, locally and on the remote respectively.)

Local will be on HDD.
off-site will be a remote cloud storage, like B2/OneDrive/Google Drive (to be decided).

What is the best way to implement this?

Thanks,

First of all, please note that you can use the --dry-run option to the restic forget command to test forget policies without actually forgetting any snapshots. This is probably a good idea to do now that you’re trying to establish your policies and thereby the corresponding commands to use.

Next, please check the forget options by running restic help forget. You can also check the “Removing backup snapshots” page of the restic 0.12.1 documentation.

So, if you want to apply different policies to different tags, you probably need to use the --tag option with the tag you want to apply a policy to, so that the policy only applies to, and only forgets, snapshots that have this tag on them. Then you’d run one forget command for each tag.

Once you have that baseline, it should just be a matter of adding the proper --keep-* options for each of the tags you want. E.g. to keep one snapshot per week for the last four weeks, add the --keep-weekly 4 option; to keep one snapshot per month for the last two months, add --keep-monthly 2, and so on. The --dry-run option will let you see which snapshots restic would forget and which ones it would keep. Just try it and see :)

So in summary, for each of your repositories, you would run one forget command (with appropriate options) for each of the tags. So that would be 2 (repositories) * 4 (tags) = 8 forget commands. Note that after that you only need to run one prune command per repository though.
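As a rough sketch of what that could look like, here is a small shell helper. The tag names and counts are taken from the table in the question. The mapping of each tag to a particular --keep-* option (for example --keep-last for bi-daily) is only my assumption, and the function names and the RESTIC variable are made up for illustration. Everything runs with --dry-run, so nothing is actually forgotten.

```shell
#!/bin/sh
# Sketch only: which --keep-* option fits which tag is an assumption,
# so check the --dry-run output before trusting it.
RESTIC="${RESTIC:-restic}"   # hypothetical override hook for previewing

# Apply one retention policy to the snapshots carrying one tag.
# usage: forget_tag REPO TAG KEEP_OPTION COUNT
forget_tag() {
    # --dry-run only reports what would be forgotten; remove it once
    # the reported result matches what you expect.
    $RESTIC -r "$1" forget --tag "$2" "$3" "$4" --dry-run
}

# Run the four per-tag policies for one repository.
# usage: apply_policies REPO BIDAILY NIGHTLY WEEKLY MONTHLY
apply_policies() {
    forget_tag "$1" bi-daily --keep-last    "$2"
    forget_tag "$1" nightly  --keep-daily   "$3"
    forget_tag "$1" weekly   --keep-weekly  "$4"
    forget_tag "$1" monthly  --keep-monthly "$5"
    # After the real (non-dry-run) forgets, one prune per repository is enough:
    # $RESTIC -r "$1" prune
}
```

With the numbers from the table, the local repository would be handled by apply_policies /srv/restic-repo 3 3 4 2 and the off-site one by apply_policies rclone:remote 30 90 104 24, which gives the 8 forget commands mentioned above.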

Does that make sense?


Thanks, that really helps.

After every restic -r /srv/restic-repo backup run, I’d like to mirror the result off-site, and I’m considering my options:

  1. rclone sync
  2. restic copy

Since local and remote will hold different sets of snapshots, I guess rclone sync is not an option. I read this, which confirms my theory.

What’s the most efficient way of using restic copy in this scenario?

Thoughts

Would restic copy be more efficient if the last few snapshots are exactly the same? Note also that the destination (off-site/remote) will hold more snapshots than the local repository.

If I go with restic copy:

  1. If both local and remote repos have the same encryption key, could I avoid decryption and re-encryption?
  2. How could I avoid restic copy crawling through everything?
  3. Could I just do restic -r /srv/restic-repo copy --repo2 rclone:remote after every backup run?

Thanks again in advance,

Kindest regards,

  • There is no option to avoid reencryption. However, most somewhat modern CPUs have hardware acceleration for AES which means that restic would be able to reencrypt a few gigabytes per second. Thus this is not a bottleneck.
  • restic copy will only copy snapshots and data chunks which are not present in the destination repository
  • Running copy after each backup run should work well. If the last few snapshots are identical, then copy will just copy the snapshot but nothing else.

The performance of restic copy is not optimized yet, but will be once pull request #3513 (“Speed-up copy command”, restic/restic on GitHub) is merged.
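A minimal sketch of the "copy after each backup" routine could look like this. The copy invocation is the one from the question; the function name, the RESTIC variable, the tag and the backed-up path are hypothetical, and credentials for both repositories are assumed to be configured already. As far as I know, newer restic releases replace --repo2 with --from-repo and reverse the direction, so check restic help copy for your version.

```shell
#!/bin/sh
# Sketch only: assumes passwords for both repositories are already set up.
RESTIC="${RESTIC:-restic}"   # hypothetical override hook for previewing

# Take a tagged snapshot locally, then copy whatever is missing off-site.
# usage: backup_and_copy SRC_REPO DEST_REPO TAG PATH
backup_and_copy() {
    # Create the tagged snapshot in the local repository; stop if it fails.
    $RESTIC -r "$1" backup --tag "$3" "$4" || return 1
    # Copy snapshots and data chunks not yet present in the destination.
    $RESTIC -r "$1" copy --repo2 "$2"
}
```

For example, a nightly job might call backup_and_copy /srv/restic-repo rclone:remote nightly /home, where /home is just a placeholder path.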


Thanks @MichaelEischer, that’s helpful to know.

Let’s say the destination is OneDrive via rclone: how will copy find out which snapshots and chunks are (and are not) present in the destination?

Thanks

Will it go through each file to find which exists in the destination?

copy adds a mark to snapshots when copying them, which allows it to recognize the already copied snapshots in later runs. It will have to read each file in the snapshots folder of the repository once to do that, but that should be reasonably fast, especially when cached. For the data chunks in the repository, restic uses an index which lists which chunks exist in a repository. Then copy just has to check whether a chunk exists in the target repository and copy it if it does not.
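To illustrate the idea of the index check (this is a toy model, not restic's actual implementation or file format): if each index were a plain text file with one chunk ID per line, the chunks to copy would simply be the IDs listed in the source but not in the destination. The function name and the file layout are made up for this example.

```shell
#!/bin/sh
# Toy model only: real restic indexes are encrypted and structured,
# not plain text lists.

# Print the chunk IDs that appear in the source list but not in the
# destination list, i.e. the ones a copy would still have to transfer.
# usage: missing_chunks SRC_INDEX_FILE DEST_INDEX_FILE
missing_chunks() {
    # -F: fixed strings, -x: match whole lines, -v: invert the match,
    # -f: read the patterns (destination IDs) from a file
    grep -vxFf "$2" "$1"
}
```

Only the two lists are compared here; no chunk data has to be read to decide what is missing, which is the point made above.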


wow amazing! Thanks!

Out of interest, is the mark added in the source repo or the destination repo?

So, for the subsequent runs, it only has to download the index from OneDrive and it’ll send the ones that it doesn’t have?

Thanks,

The mark is part of the snapshot created in the destination repository.

restic by default caches the index and some other metadata for a repository. The check for which data has to be copied is done solely based on that metadata. That is, with an enabled cache restic should, most of the time, be able to avoid downloading data from the destination repository. (It will have to download a few small files, but not much.)
