Restic incremental backup report confusing

Great question! First, deduplication is working just fine: the second backup only added ~215MiB to the repository.

To decide whether it needs to re-read a file it has already seen, restic first needs to find a snapshot that contains the file (that’s the parent snapshot mentioned above). We’re still working on that function, but for now you need to pass exactly the same list of files/dirs again. Otherwise restic will fall back to the safe default of re-reading everything.

When you ran restic the second time, it did not find a snapshot of exactly the list of directories you passed to it. On the first run, the list was:

~/AUFs ~/Downloads ~/VirtualBoxVMs ~/snap

When you ran restic the second time, it was:

~/AUFs ~/Downloads ~/snap ~/Applications

So restic opted for the safe default and re-read all data. That took a long time (~31 minutes), but afterwards restic decided that it only needed to add a little bit of data that was not in the repository yet.
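To illustrate why the second run found no parent, here is a small Python model of the matching rule described above. It is only a sketch of the idea, not restic's actual code: the `Snapshot` class, the `find_parent` function, the hostname `laptop` and the ID `snap1` are all made up for illustration, and the model assumes the order of the paths does not matter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    id: str            # hypothetical snapshot ID
    host: str          # hypothetical hostname
    paths: frozenset   # the backup targets recorded in the snapshot
    time: int          # simplified timestamp, larger means newer


def find_parent(snapshots, host, paths):
    """Return the newest snapshot with the same host and exactly the same paths."""
    wanted = frozenset(paths)
    candidates = [s for s in snapshots if s.host == host and s.paths == wanted]
    # No exact match means no parent, so everything has to be re-read.
    return max(candidates, key=lambda s: s.time, default=None)


first_run = ["~/AUFs", "~/Downloads", "~/VirtualBoxVMs", "~/snap"]
second_run = ["~/AUFs", "~/Downloads", "~/snap", "~/Applications"]

# The repository after the first backup.
repo = [Snapshot("snap1", "laptop", frozenset(first_run), 1)]

print(find_parent(repo, "laptop", second_run))    # None
print(find_parent(repo, "laptop", first_run).id)  # snap1
```

Three of the four directories are the same in both runs, but in this model a partial overlap is not enough: only the identical list yields a parent.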

You can check with restic snapshots, which also prints the list of target directories for each snapshot.

When you now run restic again with the same set of directories, it’ll be much faster, and it will also know how many files have changed.

Please be aware that the retention policies applied by restic forget will group snapshots according to the list of directories they contain, so right now you will end up with two groups (for the two lists of dirs). If you don’t want that, you need to tell restic to only group by hostname using --group-by host (the default is --group-by host,paths).
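The effect of the two grouping modes on your two snapshots can be sketched in Python as follows. Again this is just a simplified model under my own assumptions, not how restic implements it: `group_snapshots`, the hostname `laptop` and the IDs `snap1`/`snap2` are hypothetical.

```python
from collections import defaultdict


def group_snapshots(snapshots, group_by):
    """Group snapshot IDs by the given fields, e.g. ["host", "paths"]."""
    groups = defaultdict(list)
    for snap in snapshots:
        # Build the group key from the selected fields; path lists are
        # sorted so that the same set of paths always gives the same key.
        key = tuple(
            tuple(sorted(snap[field])) if field == "paths" else snap[field]
            for field in group_by
        )
        groups[key].append(snap["id"])
    return dict(groups)


snapshots = [
    {"id": "snap1", "host": "laptop",
     "paths": ["~/AUFs", "~/Downloads", "~/VirtualBoxVMs", "~/snap"]},
    {"id": "snap2", "host": "laptop",
     "paths": ["~/AUFs", "~/Downloads", "~/snap", "~/Applications"]},
]

default_groups = group_snapshots(snapshots, ["host", "paths"])
host_groups = group_snapshots(snapshots, ["host"])

print(len(default_groups))  # 2
print(host_groups)          # {('laptop',): ['snap1', 'snap2']}
```

With the default grouping, a retention policy is applied to each of the two groups separately, so each group keeps its own set of snapshots. Grouping by host only puts both snapshots into a single group.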
