Restic backup to Backblaze B2 container much lower than backed up volume size

Hello,

I’m quite new to Restic and only recently started using it (to back up my simple Raspberry Pi NAS).
Regarding Restic everything works like a charm and the installation process was easy.

I do have one question. I am backing up approx. 2 TB. Restic has been running for a few days now (daily backup to B2). But when I look at the B2 bucket, it is only 746.8 GB in size vs. the ~2 TB I expected.

I have found that Restic isn’t able to back up the Time Machine folders, but that seems like something I should be able to fix (permission error). Most of the other data comes from a Windows 10 backup to the NAS, which should back up fine (no errors).

Excluding the Time Machine folders from the total, this still leaves roughly 1.5 TB of data that I would expect to be sent to B2, vs. the 746.8 GB in the bucket. I’m kind of lost as to why this is the case. Does anyone have tips on how to proceed?

Deduplication might be playing a role. AFAIK it takes effect even for the initial backup, since it’s block-based.

You can use restic mount to browse the repository, or restic ls ${a_snapshot_id} to list the files inside a snapshot and verify that the files you expect to see are in it.
You can also run restic check to see whether there is a problem with the repository itself.

You could also run restic stats <snapshotid> which will calculate the size of the restored files. That should give you a pretty good idea how much data is contained in the backup.

Thanks Gurkan and MichaelEischer, both, for your replies. I have worked out what is happening and all is well. First, there were some minor issues found by running restic check (an index rebuild solved this).

I also checked the last snapshot with the stats function and compared it to the local files. The 890 GB backup will result in a ~2 TB restore, which is correct.

I clearly underestimated the powerful deduplication of Restic.
As I am in the process of transferring my backups to the new RP NAS, there are loads of backup files temporarily on my regular PC which are of course duplicates (and duplicates of duplicates). I did not expect Restic to also dedupe files from different folders, disks, and within the Windows 10 backup folder, but it is obviously doing this (which is great and is saving me quite some Backblaze bandwidth and storage). The Windows 10 backup also has a fairly simple folder structure (vs. Time Machine on Mac), so I guess this helps with the deduplication. Couldn’t be happier with Restic!

Thanks again for your help and support!

I’m curious: Which type of issues did check report? Ideally an index rebuild should only be necessary if the stored repository data was damaged.

The deduplication in restic works by splitting files into small chunks, calculating a hash (fingerprint) for each chunk, and then using that hash to skip chunks which were already backed up. The file path is completely irrelevant for that step.
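
To make that concrete, here is a rough sketch of the idea in Python. It is not restic's actual code: restic uses content-defined (variable-size) chunking, while this toy version uses tiny fixed-size chunks to keep things short. The `backup` function, the `store` dict, and the paths in the comments are all made up for illustration.

```python
import hashlib

CHUNK_SIZE = 4  # tiny fixed-size chunks, for illustration only (assumption)


def backup(data, store):
    """Split data into chunks and store only chunks not seen before.

    Returns the number of bytes that actually had to be added to the store.
    Note that no file path is involved anywhere in this function.
    """
    new_bytes = 0
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        chunk_id = hashlib.sha256(chunk).hexdigest()  # the chunk's fingerprint
        if chunk_id not in store:
            store[chunk_id] = chunk
            new_bytes += len(chunk)
    return new_bytes


store = {}
content = b"holiday-photos-2019"

first = backup(content, store)   # e.g. a copy under /pc/backup/ (hypothetical path)
second = backup(content, store)  # e.g. the same data under /nas/win10/ (hypothetical path)

print(first, second)  # prints: 19 0
```

The second copy adds nothing to the store, which is why the same files sitting in several folders or disks barely increase the repository size.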

And the really cool thing about this is it means deduplication can work intra-file, not just inter-file. If a file contains large duplicate sections (think VM images with long runs of zeroes, for example) restic can even deduplicate part of a file against another part of that same file.
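
A small sketch of that effect, again with fixed-size chunks as a simplifying assumption (restic's real chunker is content-defined, so actual numbers will differ). The `stored_size` helper and the fake `image` are hypothetical and only there to show the principle.

```python
import hashlib

CHUNK_SIZE = 1024  # fixed-size chunks, for illustration only (assumption)


def stored_size(data):
    """Return how many bytes remain after deduplicating the chunks of one file."""
    seen = set()
    total = 0
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).digest()
        if digest not in seen:  # only the first occurrence of a chunk is stored
            seen.add(digest)
            total += len(chunk)
    return total


# A fake "VM image": a long run of zeroes, one chunk of real data, more zeroes.
zeroes = b"\x00" * (64 * CHUNK_SIZE)
image = zeroes + bytes(range(256)) * 4 + zeroes

print(len(image), stored_size(image))  # prints: 132096 2048
```

All 128 zero chunks collapse into a single stored chunk, even though they all belong to the same file.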

I’m curious: Which type of issues did check report? Ideally an index rebuild should only be necessary if the stored repository data was damaged.

Restic didn’t report exactly what the problem was, only that the index needed to be rebuilt. So I have no idea what exactly happened, but I expect that it was user error when setting up Restic and figuring out how to get it to work.

And the really cool thing about this is it means deduplication can work intra-file, not just inter-file. If a file contains large duplicate sections (think VM images with long runs of zeroes, for example) restic can even deduplicate part of a file against another part of that same file.

Yes, that works really well.

restic probably complained about pack files being listed in multiple indexes. Which restic version are you using? I’d expect that the current restic version, 0.10.0, can no longer trigger that warning.