Best practice for multiple hosts to single repository

mdw · May 5, 2019, 6:52pm

I am taking advantage of restic deduplication by sharing a single repo with several hosts that have the same development source trees, etc. Works well!

A systemd service script runs restic forget, restic prune, and restic check on a schedule on each host. My question is: is it redundant and wasteful to run these on every host? E.g. it seems that restic forget checks the snapshots of every host in the repo, so I’m guessing that I’m wasting resources.

Does anyone have a best practice or recommended strategy for this use case?

rawtaz · May 5, 2019, 7:07pm

Correct - forget, prune and check act on the entire repo, so you don’t need to run them on/for every host that backs up to the repo; instead, run them “per repo”.

What I do when I back up multiple hosts to a single repo is set a host-specific tag for each host, e.g. mac10 and mac24. I can then use the --group-by argument of the forget command, like --group-by tags,paths, to make sure that the --keep-* policies I define are applied individually for every host in that repo.
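A minimal sketch of that setup, assuming the repository and password come from the usual RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE environment variables. The function names, the backup path and the --keep-* values are placeholders for illustration, not something taken from this thread.

```shell
# RESTIC can be overridden (e.g. RESTIC=echo) to preview the command lines.
RESTIC="${RESTIC:-restic}"

# Run on each host: tag the snapshot with a host-specific tag (mac10, mac24, ...).
backup_with_host_tag() {
    host_tag="$1"; shift
    "$RESTIC" backup --tag "$host_tag" "$@"
}

# Run once per repo: the keep policy is applied to each (tags, paths) group
# separately. The --keep-* values here are assumed, pick your own.
forget_per_host_tag() {
    "$RESTIC" forget --group-by tags,paths --keep-daily 7 --keep-weekly 4
}
```

For example, `backup_with_host_tag mac10 /home/me/src` on one machine and `backup_with_host_tag mac24 /home/me/src` on another, then `forget_per_host_tag` from wherever the repo maintenance runs.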

mdw · May 5, 2019, 7:37pm

Thanks for confirming that! Makes sense. And thanks for the suggestions about group-by and tag.

If I understand the docs correctly, the group-by default is host,paths. So a forget will be applied to all snapshots in the repository, grouped by host and then paths by default. However, the --tag mytag option seems to restrict it to snapshots with the tag mytag.

I’ve set the tag for restic forget to be systemd.backup so I seem to be operating on this tag only. I could append the host name (i.e. systemd.host). Then forget --tag systemd.host would restrict it to that host only, I assume.

Not sure what is best. My inclination is to simply leave it the way I have it but only run forget and prune from one host.

rawtaz · May 5, 2019, 7:51pm

Right, but if you do forget --tag systemd.host (where I presume that the “host” part is individual for every host you back up to the repo) then only snapshots for that host will be considered in the forget operation. The other hosts’ snapshots won’t be forgotten/processed.

If on the other hand you do it the way I suggested (forget --group-by tags,paths), the forget operation will be applied to every host in one go. It will be as if you did a forget once for each host separately (grouping by paths).

mdw · May 5, 2019, 7:58pm

Got it, I think: let one host do forget, prune, check on the entire repository. All other hosts simply do backup.
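As a rough sketch of that split, the designated host could run something like the function below while every other host only runs restic backup. The function name and the --keep-* values are assumptions, and the repository and password are assumed to be set via RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE.

```shell
# RESTIC can be overridden (e.g. RESTIC=echo) to preview the command lines.
RESTIC="${RESTIC:-restic}"

# Run from ONE designated host only. With no --group-by given, forget uses
# its default grouping, so the policy applies per host and path set.
# Each step runs only if the previous one succeeded.
repo_maintenance() {
    "$RESTIC" forget --keep-daily 7 --keep-weekly 4 &&
    "$RESTIC" prune &&
    "$RESTIC" check
}
```

The systemd timer on the maintenance host would then call `repo_maintenance`, and the timers on the other hosts would drop their forget, prune and check steps.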

But doesn’t it make more sense to do --tag mytag --group-by host,paths? That way, there can be snapshots of other stuff that are not affected by forget, right?
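A sketch of that variant, for illustration: the function name and the --keep-daily value are made up, and extra arguments are passed straight through so that forget's --dry-run option can be used to preview what would be removed.

```shell
# RESTIC can be overridden (e.g. RESTIC=echo) to preview the command lines.
RESTIC="${RESTIC:-restic}"

# Only snapshots carrying the given tag are considered; within that set the
# policy is applied per (host, paths) group. Snapshots without the tag are
# left alone.
forget_tagged() {
    tag="$1"; shift
    "$RESTIC" forget --tag "$tag" --group-by host,paths --keep-daily 7 "$@"
}
```

E.g. `forget_tagged systemd.backup --dry-run` first, then the same call without --dry-run once the listed snapshots look right.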

rawtaz · May 5, 2019, 8:10pm

Sure, you can set multiple tags.