I’m building a prototype of a backup system using restic. I need to back up something like this:
Due to the high volume of folders/files to back up, I’ve set up N nodes running restic, and each run randomly picks a different path (I track the status of each subfolder in a centralized database to know which ones still need to be backed up). All mount points are present on all the restic nodes.
I wanted to make the system more scalable, so for now I’ve disabled the cache to make the restic nodes stateless (the results are pretty good even with the cache disabled!), but I have two questions related to this environment:
- If I enable the cache, what happens in the following situation?
node1 --> backs up /mount1/folderA for the first time and creates a (local) cache
node2 --> backs up /mount1/folderA; it’s the first time on this node, so it creates its own (local) cache
node1 --> backs up the folder again and finds a local cache, but that cache does not reflect the latest snapshot (the one taken by node2). Does this offer any benefit or cause any problem?
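To make the two modes being compared concrete, here is a minimal sketch of how a node could build the restic command line, either stateless or with a node-local cache. The helper name `build_backup_cmd`, the repository path and the cache directory are placeholders for illustration, not my actual setup; `--no-cache` and `--cache-dir` are restic's global options.

```python
from typing import List, Optional


def build_backup_cmd(repo: str, path: str, cache_dir: Optional[str] = None) -> List[str]:
    """Return the argv for one `restic backup` run (hypothetical helper).

    cache_dir=None  -> stateless node, cache disabled with --no-cache
    cache_dir="..." -> node-local cache stored in that directory
    """
    cmd = ["restic", "--repo", repo]
    if cache_dir is None:
        cmd.append("--no-cache")
    else:
        cmd += ["--cache-dir", cache_dir]
    cmd += ["backup", path]
    return cmd


# Placeholder repository location; one repository per subfolder.
print(build_backup_cmd("/srv/restic/folderA", "/mount1/folderA"))
print(build_backup_cmd("/srv/restic/folderA", "/mount1/folderA", "/var/cache/restic"))
```

The resulting list can be passed to `subprocess.run` on whichever node picked the folder.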
The second question is about the hostname, which differs depending on the node that launches the backup, so I get this kind of output from restic snapshots:
e2ceec93  2019-01-16 16:56:51  cbox-restic    /mount1/folderA
d4f85657  2019-01-17 11:27:12  cbox-restic-2  /mount1/folderA
de811864  2019-01-18 09:20:35  cbox-restic    /mount1/folderA
What implications does this have when restoring or applying retention policies? Should I create a tag or manually specify the hostname? I am using one repository per subfolder, so the path will always be the same in each repo.
Thanks a lot for your advice, and congrats on this beautiful work!!!
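The options I have in mind could be sketched roughly as follows: every node reports the same logical host name (and optionally a tag) when backing up, and the retention policy groups snapshots by path only, so the node that took them does not matter. The helper names, the host value `cbox-restic-cluster` and the tag are made up for illustration. I am assuming the `--host` option of `restic backup` (older releases call it `--hostname`) and the `--group-by` option of `restic forget`.

```python
from typing import List, Optional


def build_backup_cmd_fixed_host(repo: str, path: str, host: str,
                                tag: Optional[str] = None) -> List[str]:
    """argv for a backup that records a fixed host name (hypothetical helper)."""
    cmd = ["restic", "--repo", repo, "backup", "--host", host]
    if tag is not None:
        cmd += ["--tag", tag]
    cmd.append(path)
    return cmd


def build_forget_cmd(repo: str, keep_daily: int) -> List[str]:
    """argv for a retention run that groups snapshots by path only."""
    return ["restic", "--repo", repo, "forget",
            "--group-by", "paths", "--keep-daily", str(keep_daily)]


# Placeholder values for illustration only.
print(build_backup_cmd_fixed_host("/srv/restic/folderA", "/mount1/folderA",
                                  "cbox-restic-cluster", tag="folderA"))
print(build_forget_cmd("/srv/restic/folderA", 7))
```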