How does Restic deduplicate?

Hello

I’m a very happy restic user (it is quite fast for my 700 GB of data :grinning:), but while using restic I came up with two questions.
First of all:
Assume I have the following structure: /media/HDD1/Pictures and /home/MelcomX/Pictures, where the latter is a symlink to the former, and that I run my backups with the following command: restic -r repo --verbose backup /home/MelcomX /media/HDD1. Everything works fine, but now I’m wondering what would happen if I changed the symlink to a bind mount. Would restic notice, while descending into the bind mount, that it is a duplicate of /media/HDD1/Pictures, or would it back up the same data twice?

And as a follow-up question (which may no longer be necessary, depending on the answer to the first question): according to the docs, restic does deduplication at the blob level, but is that performed only per file (and its changes) or system-wide?

Thanks for your answers in advance.

Hi, and welcome to the forum!

On the next run, restic will descend into the bind mount and read and hash all files in there. In the process, it’ll detect that all the data is already in the repo, so nothing needs to be uploaded again. In the resulting snapshot, you’ll have the pictures in two locations (/media/HDD1/Pictures and /home/MelcomX/Pictures).

Restic does deduplication on blobs, per repository, so it applies across all files and snapshots in that repo, not just within a single file. A blob is usually only saved once in a repository.
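
If it helps to picture it, here is a toy sketch of repository-wide deduplication in Python. It is not restic's actual code: the Repo class, the file names a.jpg and other.bin, and the tiny fixed 4-byte chunks are all made up for illustration (restic splits files with content-defined chunking into much larger blobs). The only part taken from restic is the idea that a blob is identified by the SHA-256 hash of its contents.

```python
import hashlib

# Assumption for illustration only: tiny fixed-size chunks.
# restic itself uses content-defined chunking with far larger blobs.
CHUNK_SIZE = 4


class Repo:
    """Toy content-addressed store: one blob per unique SHA-256 hash."""

    def __init__(self):
        self.blobs = {}    # blob ID (hex SHA-256) -> chunk bytes
        self.uploads = 0   # number of blobs actually stored

    def save_blob(self, chunk: bytes) -> str:
        blob_id = hashlib.sha256(chunk).hexdigest()
        if blob_id not in self.blobs:   # already known: nothing to upload
            self.blobs[blob_id] = chunk
            self.uploads += 1
        return blob_id

    def backup(self, files: dict) -> dict:
        """Return a snapshot mapping each path to its list of blob IDs."""
        snapshot = {}
        for path, data in files.items():
            chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
            snapshot[path] = [self.save_blob(c) for c in chunks]
        return snapshot


repo = Repo()
picture = b"AAAABBBBCCCC"
snapshot = repo.backup({
    "/media/HDD1/Pictures/a.jpg": picture,
    "/home/MelcomX/Pictures/a.jpg": picture,   # same content seen via the bind mount
    "/home/MelcomX/other.bin": b"BBBBDDDD",    # unrelated file sharing one chunk
})

# 8 chunks were read in total, but only 4 distinct blobs were stored.
print(repo.uploads)
```

Both picture paths show up in the snapshot and point to the same blob IDs, and the unrelated file reuses the chunk it has in common with the picture, which is what "per repository" rather than "per file" means here.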

I hope this answers your questions!

Thank you very much! Restic is even cooler than I thought!

Yes, this answers my questions!