Hello everybody
While using restic day to day, I’ve found that if I add a backup location to the backup command, restic rereads the whole file system.
Consider the following scenario:
(A) Normally I back up with the following command:
restic backup /home /media/disk1 /media/disk2 /media/disk3
(B) Now, once in a while, I change the command to:
restic backup /home /media/disk1 /media/disk2 /media/disk3 /media/disk4
This means that when I run (B), a quite old parent is used, or, if disk4 isn’t static (maybe disk4 is some USB sticks or so), the whole filesystem is reread.
I therefore suggest changing how the parent snapshot is chosen, by using a norm.
This whole text is based on the assumption that the parent snapshot is chosen in the following way: same machine and same backup locations (as I once read in this forum).
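To make that assumption concrete, here is a minimal Python sketch of the exact-match rule as I understand it. It is only a model of the assumption above, not restic's actual code; the function name find_parent_exact, the snapshot dictionaries, the host name "pc" and the dates are all made up for illustration.

```python
from datetime import datetime, timedelta

def find_parent_exact(host, paths, snapshots):
    """Newest snapshot with the same host and exactly the same paths, else None."""
    wanted = set(paths)
    matches = [s for s in snapshots
               if s["host"] == host and set(s["paths"]) == wanted]
    return max(matches, key=lambda s: s["time"], default=None)

# Hypothetical repository content: (A) ran yesterday, (B) ran ten weeks ago.
now = datetime(2024, 1, 15)
base = ["/home", "/media/disk1", "/media/disk2", "/media/disk3"]
snapshots = [
    {"id": "old-B", "host": "pc", "time": now - timedelta(weeks=10),
     "paths": base + ["/media/disk4"]},
    {"id": "new-A", "host": "pc", "time": now - timedelta(days=1),
     "paths": base},
]

# Running (B) again only matches the ten-week-old snapshot.
parent = find_parent_exact("pc", base + ["/media/disk4"], snapshots)
print(parent["id"])  # prints old-B
```

Under this rule the fresh snapshot from (A) is ignored, although it covers four of the five locations.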
How I imagine this working:
Option 1:
First of all we create the vector space V, in which every unique backup location gets exactly one dimension, plus one extra dimension for the time.
Take the vector of what has to be backed up now: each location to be backed up gets a one, every other location a zero, and then we add one more element, the time. Call this vector K. We then build the same kind of vector for every snapshot in the repository (X_i, with i running over all snapshots), again with one dimension per backup location. Then we calculate the vectors Y_i = K - X_i and take the length of each. (Time has to be rescaled first so that this makes sense, for example by multiplying it with a factor such as 1 snapshot/week.) Finally we choose as the parent the snapshot whose Y_i has the lowest norm.
It should also be noted that any Y_i whose location components are all plus or minus one (so the snapshot shares no location with the new backup) should be thrown away.
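Option 1 can be sketched in a few lines of Python. This is only an illustration under my own assumptions: time is measured in weeks since the snapshot, the norm is Euclidean, and "thrown away" is read as "shares no location with the new backup". The names distance and choose_parent and the example data are hypothetical.

```python
import math
from datetime import datetime, timedelta

WEEK_SECONDS = 7 * 24 * 3600

def distance(target_paths, now, snapshot):
    """Length of Y = K - X for one snapshot, with time measured in weeks."""
    # A location present in exactly one of the two vectors contributes (+-1)^2 = 1.
    differing = len(set(target_paths) ^ set(snapshot["paths"]))
    age_weeks = (now - snapshot["time"]).total_seconds() / WEEK_SECONDS
    return math.sqrt(differing + age_weeks ** 2)

def choose_parent(target_paths, now, snapshots):
    """Snapshot with the lowest norm, ignoring snapshots with no common location."""
    target = set(target_paths)
    candidates = [s for s in snapshots if target & set(s["paths"])]
    return min(candidates, key=lambda s: distance(target_paths, now, s),
               default=None)

# Hypothetical repository content: (A) ran yesterday, (B) ran ten weeks ago.
now = datetime(2024, 1, 15)
base = ["/home", "/media/disk1", "/media/disk2", "/media/disk3"]
snapshots = [
    {"id": "old-B", "time": now - timedelta(weeks=10), "paths": base + ["/media/disk4"]},
    {"id": "new-A", "time": now - timedelta(days=1), "paths": base},
]

target = base + ["/media/disk4"]
print(choose_parent(target, now, snapshots)["id"])  # prints new-A
```

Here old-B has distance 10 (same locations, ten weeks old), while new-A has a distance of about 1.01 (one differing location, one day old), so the recent snapshot wins. How strongly time counts against a differing location depends entirely on the chosen time scale.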
Option 2:
Similar to Option 1, but the components of the vectors are the number of files (or the size) in each backup location instead of just ones and zeros.
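A rough Python sketch of Option 2, again under assumptions of my own: the file counts per location are invented, they are assumed to be known before the backup starts (for example estimated from earlier snapshots), and weeks_to_files is a hypothetical factor that converts the snapshot age into the same unit as the file counts.

```python
import math

def weighted_distance(target_counts, snapshot, weeks_to_files=1.0):
    """Euclidean length of K - X when the components are file counts."""
    snap_counts = snapshot["counts"]
    locations = set(target_counts) | set(snap_counts)
    # A location missing from one side counts as 0 files on that side.
    squared = sum((target_counts.get(p, 0) - snap_counts.get(p, 0)) ** 2
                  for p in locations)
    return math.sqrt(squared + (weeks_to_files * snapshot["age_weeks"]) ** 2)

# Made-up file counts: disk1 is large, disk4 is a small USB stick.
target = {"/home": 50_000, "/media/disk1": 200_000, "/media/disk4": 100}
snapshots = [
    {"id": "without-disk1", "age_weeks": 1.0,
     "counts": {"/home": 50_000, "/media/disk4": 100}},
    {"id": "without-disk4", "age_weeks": 1.0,
     "counts": {"/home": 50_000, "/media/disk1": 200_000}},
]

best = min(snapshots, key=lambda s: weighted_distance(target, s))
print(best["id"])  # prints without-disk4
```

With plain ones and zeros both snapshots would be equally far away, since each lacks exactly one location. With file counts, the snapshot that only lacks the small disk4 is preferred, because far fewer files would have to be reread.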
Conclusion
This means that in the scenario above, run (B) may get the recent snapshot from (A) as its parent instead of the old snapshot from (B). Option 2 in particular could reduce the backup time a lot in that use case.
So far this is only a theoretical concept.
Kind regards,