At my work at the Dutch natural history institute (Naturalis) we're evaluating candidates to replace our current backup solution. We're looking for a solution that is open source and works well with both private and public cloud storage backends. Restic checks most of our boxes and is our primary candidate.
At the moment we're determining the best way to organize our backups with restic. We have a large number of servers (300+) that host data we need to back up. Some of those servers are file servers whose shares contain very large numbers of files.
We’re looking for an organization of our backups that:
- Is optimized for overall deduplication.
- Reduces the risk of data corruption (a huge number of backup processes writing to one repository might increase the risk of locking issues).
- Makes it easy to move a file share from one file server to another while keeping its relation to the existing backup set (see the sketch below).
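On that last point: one idea we're toying with (just a sketch; the repository URL, path, and share name below are made up) is to pin each snapshot's host to the share name rather than the physical server, since restic lets you override the hostname recorded in a snapshot. That way the snapshot lineage should follow the share when it moves:

```
# Back up a share under a stable "host" name instead of the physical
# server's hostname, so the snapshots stay related to the share even
# after it moves to another file server. (All names are placeholders.)
restic -r s3:s3.example.com/naturalis-backups backup \
  /srv/shares/collections \
  --host collections-share \
  --tag fileshare
```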
Roughly, we see two scenarios:
- We use a small number of repositories, each holding a large number of backups (distinguished by appropriate tags).
- We use a large number of repositories, each holding a relatively small number of backups.
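To make the two scenarios concrete, this is roughly what we imagine the invocations would look like (repository URLs, paths, and tags are placeholders):

```
# Scenario 1: a few shared repositories; backups told apart by tags.
restic -r s3:s3.example.com/naturalis-shared backup /srv/shares/collections \
  --tag fileserver01 --tag collections
# Later, operate on just one share's snapshots:
restic -r s3:s3.example.com/naturalis-shared snapshots --tag collections

# Scenario 2: many repositories, e.g. one per server or per share.
restic -r s3:s3.example.com/backup-fileserver01 backup /srv/shares/collections
```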
I can imagine there are no strict rules here, but I'm interested in your experiences and tips.