How does deduplication work?

alexio · June 9, 2021, 1:13pm

Hello everyone, I’m wondering how deduplication works in Restic.

I have the following scenario: I use Restic for backing up my application on the production server with a daily schedule. I then clone the application on a staging environment (another server), make changes, delete some folders and add new ones. I also have a daily Restic schedule on the staging server using the same repository used for production, with different tags to distinguish production and staging backups.

My question is: how much disk space will this scenario use on my S3 storage? Is deduplication based on the data in the last snapshot, or on the entire repository?

Thanks in advance for your help.

nicnab · June 9, 2021, 2:57pm

Check this out!

alexio · June 9, 2021, 3:58pm

Is this also true with a lot of small files, like PHP files, images, etc.? If I understood correctly, deduplication doesn’t store the same file a second time if it has already been stored in the same repo before. Is this correct?

martinleben · June 9, 2021, 6:25pm

Yes, it is.
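
For anyone who wants to see the idea in miniature, here is a rough Python sketch of content-addressed storage, the general principle behind repository-wide deduplication. It is not restic's actual implementation: restic splits files into variable-size chunks before hashing, while this sketch hashes whole files to keep things short. The `Repo` class, the tags, and the file names are all made up for illustration.

```python
import hashlib


class Repo:
    """Toy repository: data is stored once per unique SHA-256 hash."""

    def __init__(self):
        self.blobs = {}      # hash (hex) -> file content
        self.snapshots = []  # list of (tag, {path: hash})

    def backup(self, tag, files):
        """Store a snapshot and return how many new bytes were written."""
        tree = {}
        new_bytes = 0
        for path, data in files.items():
            blob_id = hashlib.sha256(data).hexdigest()
            # Lookup is against the whole repository, not the last snapshot.
            if blob_id not in self.blobs:
                self.blobs[blob_id] = data
                new_bytes += len(data)
            tree[path] = blob_id
        self.snapshots.append((tag, tree))
        return new_bytes


repo = Repo()

# Hypothetical production files.
production = {
    "index.php": b"<?php echo 'hi';",
    "logo.png": b"\x89PNG fake image bytes",
}

# Staging starts as a clone, then a folder is deleted and a new one added.
staging = dict(production)
del staging["logo.png"]
staging["new/feature.php"] = b"<?php echo 'new';"

added_prod = repo.backup("production", production)
added_staging = repo.backup("staging", staging)

print("production snapshot added", added_prod, "bytes")
print("staging snapshot added", added_staging, "bytes")
```

In this sketch the staging backup only adds the bytes of the one new file, because `index.php` is already in the repository from the production backup. The number of files does not matter, only whether their content has been seen before.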