This question is similar to this post. I want to make sure I have understood the answer correctly before acting.
I want to move the contents of one folder to a different folder. I am afraid restic would conclude that the files in the SOURCE directory were deleted and that new files appeared in the TARGET directory.
If that happens, my backup location would use double the space, because restic would not know that the files from the SOURCE directory are the same as the ones in the new TARGET directory.
My understanding of the answer in the post mentioned above is that restic would somehow detect that the files from SOURCE are the same as those in TARGET, so the duplication would not happen and only new and modified files would increase my backup storage size.
Is my understanding of the answer in that post correct? Can I move files and rely on restic to verify by itself that the files are the same and avoid duplication as usual?
restic does not understand anything. It just checks whether the file content you are backing up is already in the repo. If it is, restic saves the new metadata and that's it. The metadata for SOURCE will be updated, and the metadata for DESTINATION will be updated.
This is what content deduplication is about.
So, long story short: you can move or copy your files around without worrying that their content will be backed up twice (only the metadata is stored again).
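To make the idea concrete, here is a toy sketch of content deduplication in Python. It is not restic's code: `ToyRepo`, `backup` and the tiny fixed-size chunks are made up for illustration (restic uses content-defined chunking, and the file names are hypothetical). It only shows why moving files from SOURCE to TARGET adds new metadata but no new content.

```python
import hashlib

# Tiny fixed-size chunks, for illustration only (an assumption of this sketch;
# restic itself splits files with content-defined chunking).
CHUNK_SIZE = 4


class ToyRepo:
    """Hypothetical content-addressed store, not restic's real data model."""

    def __init__(self):
        self.blobs = {}      # chunk hash -> chunk bytes (stored once)
        self.snapshots = []  # one dict per backup: path -> list of chunk hashes

    def backup(self, files):
        """Back up {path: content}; return how many new content bytes were stored."""
        new_bytes = 0
        snapshot = {}
        for path, content in files.items():
            chunk_ids = []
            for i in range(0, len(content), CHUNK_SIZE):
                chunk = content[i:i + CHUNK_SIZE]
                chunk_id = hashlib.sha256(chunk).hexdigest()
                if chunk_id not in self.blobs:
                    # Content not seen before: store it.
                    self.blobs[chunk_id] = chunk
                    new_bytes += len(chunk)
                chunk_ids.append(chunk_id)
            # The path -> chunks mapping (metadata) is always recorded.
            snapshot[path] = chunk_ids
        self.snapshots.append(snapshot)
        return new_bytes


repo = ToyRepo()
first = repo.backup({"SOURCE/a.txt": b"hello world!", "SOURCE/b.txt": b"restic data."})
# Same content, moved to a different directory.
second = repo.backup({"TARGET/a.txt": b"hello world!", "TARGET/b.txt": b"restic data."})
print(first, second)  # 24 0
```

The second backup records the new paths in its snapshot, but every chunk hash is already known, so no content is stored again.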
In fact “understand” was not a good word in this context. But this is a very interesting feature I was not expecting restic to have.
Thank you for your answer.
My understanding is slightly different from yours. Restic will check whether a file with the same directory, name and size (and perhaps last change time) has already been backed up. If not, it will compress and encrypt the file and then try to store each chunk. If a chunk already exists in the repository, restic does not need to store it again and will use a reference to the existing data instead.
This means that if you move files from directory1 to directory2, on the next backup they look as if they have never been backed up. Restic will compress and encrypt them, and only then will it find them in the repository and avoid duplicating the space. This may matter if you have a lot of files whose directories changed.
This differs from the other explanation because restic still does the compression and encryption for the moved files, but does not store them. The backup after that will find the directory2 files in the repository and will not do any more compression, encryption or storage.
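The "has this file already been backed up?" check described above can be sketched roughly like this in Python. This is a simplified model, not restic's implementation: `FileMeta`, `needs_reading` and the example paths are hypothetical, and only size and modification time are compared here, whereas the real check may look at more attributes.

```python
from typing import Dict, NamedTuple


class FileMeta(NamedTuple):
    """Hypothetical subset of the metadata kept for each file in a snapshot."""
    size: int
    mtime: float


def needs_reading(path: str, meta: FileMeta, parent: Dict[str, FileMeta]) -> bool:
    """Return True if the file has to be read and processed again.

    A file is skipped only when the previous snapshot has an entry with the
    same path and identical metadata (an assumption of this sketch).
    """
    previous = parent.get(path)
    return previous is None or previous != meta


# Previous snapshot: the file lived in directory1.
parent = {"directory1/report.pdf": FileMeta(size=1024, mtime=1700000000.0)}
meta = FileMeta(size=1024, mtime=1700000000.0)

unchanged = needs_reading("directory1/report.pdf", meta, parent)
moved = needs_reading("directory2/report.pdf", meta, parent)
print(unchanged, moved)  # False True
```

The moved file has identical size and modification time, but its path is unknown to the previous snapshot, so it is processed again on the first backup after the move. On the backup after that, the directory2 path is known and the file can be skipped.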
This is exactly what I meant :) Sorry for not being clear enough. No data will be duplicated. Only the information about the file name, its location, etc. (metadata) is stored again.
Thank you for the clarification. As I understand it, the conclusion is that the first backup after moving the files will take some time, but will use very little new space.
Yes. It will take roughly the time needed to read the files from your local disk, plus a tiny amount of new data (metadata) written to the repository. That is many times faster than if the files were entirely new.