Looking for information about restic strategies

gctwnl · January 4, 2024, 4:51pm

I am finally getting around to creating my restic setup. I will be using a local and a B2 location for the data. I will probably add a restic server later for some family members to use me as their backup storage.

Having played around with B2 storage (everything works), I am now at the point where I have to design the backups, i.e. do I use a single repository or multiple? It seems to me that I can do everything with a single repository except have separate passwords. But since I am the only one backing up (I am the admin), I could organise everything by host and tag.

I will have several backups (I think the correct restic term is snapshot, right?). Some will have to back up every 30 minutes or so to safeguard against loss of volatile data, others maybe once a week. Some will have a different ‘forget’ and pruning strategy.
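A minimal sketch of how such a single-repository layout might be expressed, assuming each backup job is identified by a host and a tag and carries its own retention policy. The repository URL, host names, tags, paths, and retention counts are all hypothetical. Scheduling (every 30 minutes versus weekly) is assumed to be handled by cron or systemd timers rather than by restic itself.

```python
from dataclasses import dataclass, field

# Hypothetical repository location, for illustration only.
REPO = "b2:example-bucket:restic"


@dataclass
class Job:
    """One backup job inside a single shared repository (all values hypothetical)."""
    host: str
    tag: str
    paths: list[str]
    # Retention policy, e.g. {"hourly": 48} becomes "--keep-hourly 48".
    keep: dict[str, int] = field(default_factory=dict)


def backup_cmd(repo: str, job: Job) -> list[str]:
    """Build the `restic backup` command line for one job."""
    return ["restic", "-r", repo, "backup",
            "--host", job.host, "--tag", job.tag, *job.paths]


def forget_cmd(repo: str, job: Job) -> list[str]:
    """Build a `restic forget` command restricted to this job's host and tag."""
    cmd = ["restic", "-r", repo, "forget", "--host", job.host, "--tag", job.tag]
    for period, count in job.keep.items():
        cmd += [f"--keep-{period}", str(count)]
    return cmd


JOBS = [
    Job("central", "volatile", ["/srv/data"], {"hourly": 48, "daily": 14}),
    Job("central", "archive", ["/srv/archive"], {"weekly": 8, "monthly": 12}),
]

if __name__ == "__main__":
    for job in JOBS:
        print(" ".join(backup_cmd(REPO, job)))
        print(" ".join(forget_cmd(REPO, job)))
```

Filtering `forget` by host and tag lets each job keep its own retention policy. Pruning, by contrast, works on the repository as a whole, so it would be run once for the shared repo rather than per job.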

What are the things I have to pay attention to?
Are there pros and cons for using multiple repositories? Is more deduplication maybe a pro of a single repo? Other pros or cons?

atdotcom · January 11, 2024, 10:28am

I do not share repositories across machines.
I have tried testing this a bit to measure storage usage differences, and while deduplication should make it possible to save a bit more, I was not able to measure much decrease in effective storage. I believe the reason is that the only things that are truly identical across hosts are system files and the like, and I often exclude those from backups in the first place. When I do not, those files do not take up a large fraction of the overall storage anyway. So most of what takes up space is user data, which is mostly unique per host.
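The reasoning above can be illustrated with some rough arithmetic. This is an idealized sketch, not a measurement: the sizes are made-up numbers, and it ignores compression, chunking effects, and repository metadata. It assumes every host holds some unique data plus one copy of the same shared data.

```python
def repo_sizes(unique_gb: list[float], shared_gb: float) -> tuple[float, float]:
    """Return (separate_total, shared_total) in GB.

    separate_total: one repo per host, so each repo stores its own copy
                    of the shared data.
    shared_total:   one repo for all hosts, so the shared data is stored once.
    """
    separate = sum(u + shared_gb for u in unique_gb)
    shared = sum(unique_gb) + shared_gb
    return separate, shared


def savings_fraction(unique_gb: list[float], shared_gb: float) -> float:
    """Fraction of storage saved by using one shared repo instead of separate ones."""
    separate, shared = repo_sizes(unique_gb, shared_gb)
    return (separate - shared) / separate


if __name__ == "__main__":
    # Hypothetical: three hosts with mostly unique user data and 5 GB in common.
    unique = [200.0, 150.0, 300.0]
    print(repo_sizes(unique, 5.0))                 # (665.0, 655.0)
    print(f"{savings_fraction(unique, 5.0):.1%}")  # 1.5%
```

With numbers like these the shared repo saves only about 10 GB out of 665 GB, which matches the intuition that the saving stays small as long as the identical data is a small fraction of the total.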

Also, as you have noticed, sharing repos introduces a shared secret and a single point of failure in exposing your data. I think in most situations it is best to keep things isolated such that if one key is exposed, then only data from one host is potentially exposed.
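One way to keep things isolated is to give every host its own repository and its own password file. The sketch below assumes a hypothetical bucket name, repository path scheme, and secrets directory. It relies only on restic's `RESTIC_REPOSITORY` and `RESTIC_PASSWORD_FILE` environment variables.

```python
import os


def restic_env(host: str,
               bucket: str = "example-bucket",      # hypothetical bucket name
               secrets_dir: str = "/etc/restic",    # hypothetical secrets location
               ) -> dict[str, str]:
    """Return an environment in which restic targets this host's own repository."""
    env = dict(os.environ)
    # One repository per host, addressed as b2:<bucket>:<path>.
    env["RESTIC_REPOSITORY"] = f"b2:{bucket}:{host}"
    # One password file per host, so a leaked key exposes only that host's data.
    env["RESTIC_PASSWORD_FILE"] = f"{secrets_dir}/{host}.pass"
    return env


if __name__ == "__main__":
    for host in ("laptop1", "laptop2"):
        env = restic_env(host)
        print(host, env["RESTIC_REPOSITORY"], env["RESTIC_PASSWORD_FILE"])
```

The returned dictionary could then be passed as `env=` to `subprocess.run(["restic", "snapshots"], ...)` or a similar call, so that no repository or password is ever shared between hosts.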

gctwnl · January 11, 2024, 10:47am

Thank you. Understandable. I have a couple of machines that are ‘mobile clients’ which synchronise via a central point and which are all under my administrative control. Putting all of these and the central point in one repo would benefit from the fact that they hold many technically independent copies of the same user data. But given that I already back up the central machine, the use case for backing them up as well is limited (to when syncing starts to fail, between a sync and the next backup). Food for thought.

atdotcom · January 11, 2024, 11:40am

I think I have a use case that is somewhat similar in that I also manage a fleet of laptops that all have some shared contents (e.g. git repos).
However, the actual real-world storage savings I got from my tests with shared repos were insignificant. And the added complexity and bigger repo (it takes longer to do a recovery test, or a full check, etc.) were not a tradeoff worth making in my situation. YMMV.