Stats bug, or what am I missing?

EDIT: Answering my own question, but leaving it here for now in case my answer is wrong or its presence is useful (mods, please delete if needed). My conclusion is that if you run restic stats without specifying a snapshot, it reports the total size needed to restore each and every snapshot in the repo. This means the total “restore” size is much larger than either the data stored in the repo or the data you would actually download if you restored just one (or a few) snapshots.

original post below

I saw some earlier, similar posts, but they seem to be solved. This is probably just a basic misunderstanding on my part, but it seems weird and I can’t figure it out, so I’m posting here.

This is a very simple test case: I have a source folder with 5 PDFs in it, and I’ve backed it up to a repo a dozen times. The source dir holds about 10 MB of data. When I check the file system directly, the repo dir also contains about 10 MB of data.

  • When I run restic stats in default mode (i.e. restore-size), it reports about 124 MiB.
  • When I run restic stats in raw-data mode, it reports about 10 MiB.

I’m confused because I assumed the restore-size figure should, if anything, be smaller than the raw-data figure. Instead, the default restore-size mode reports more than 10 times as much data as is actually in the repo.

Here’s the output of stats run in both ways on this repo:

sam@Minuit ResticPlay % restic -r /Users/sam/sandbox/ResticPlay/repo stats --mode restore-size
enter password for repository: 
repository 08064a86 opened (version 2, compression level auto)
[0:00] 100.00%  9 / 9 index files loaded
scanning...
Stats in restore-size mode:
     Snapshots processed:  13
        Total File Count:  133
              Total Size:  124.041 MiB


sam@Minuit ResticPlay % restic -r /Users/sam/sandbox/ResticPlay/repo stats --mode raw-data    
enter password for repository: 
repository 08064a86 opened (version 2, compression level auto)
[0:00] 100.00%  9 / 9 index files loaded
scanning...
Stats in raw-data mode:
     Snapshots processed:  13
        Total Blob Count:  54
 Total Uncompressed Size:  10.370 MiB
              Total Size:  10.328 MiB
    Compression Progress:  100.00%
       Compression Ratio:  1.00x
Compression Space Saving:  0.40%

EDIT: just to add that both of the other modes, blobs-per-file and files-by-contents, also report the exact same 10.340 MiB. Only restore-size is different.

“My conclusion is that if you run restic stats without specifying a snapshot, it reports the total size needed to restore each and every snapshot in the repo.”

Yep, that’s also my understanding of how it works; if you don’t do any filtering you’re seeing the stats of everything contained in the repository. If you have a lot of snapshots in the repository, restoring all of them would require a lot of space!
The examples in the docs all show outputs with the snapshot identifier “latest” being specified, to only include the most recent snapshot:
https://restic.readthedocs.io/en/stable/manual_rest.html#getting-information-about-repository-data
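For anyone scripting this, here is a minimal sketch of building a stats command that is limited to one snapshot. The helper name `stats_command` is made up for illustration; the repo path is the one from the original post, and the only restic arguments used are `-r`, `stats`, `--mode` and a snapshot ID such as `latest`.

```python
import shlex

def stats_command(repo, snapshot="latest", mode="restore-size"):
    """Build the argument list for `restic stats` restricted to one snapshot.

    Hypothetical helper: it only assembles the command, it does not run restic.
    """
    return ["restic", "-r", repo, "stats", "--mode", mode, snapshot]

# Repo path taken from the original post.
cmd = stats_command("/Users/sam/sandbox/ResticPlay/repo")

# Print the command in a form that can be pasted into a shell.
print(shlex.join(cmd))
```

With a snapshot ID given, restore-size should describe just that one snapshot rather than the sum over all of them.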

As an additional data point for you, when running stats in raw-data mode without specifying a snapshot ID against my local repository, I get a result in the multi-terabyte range. This despite the backup dataset only being a few hundred GB :)

If you don’t specify a snapshot, the stats cover all snapshots. Your files don’t seem to be compressible. I guess your snapshots are very similar, at about 10 MB each.

  • restore-size: 13 snapshots × 10 MB = 130 MB
  • raw-data: the same 10 MB of blobs is shared by all 13 snapshots = 10 MB
  • blobs-per-file: the total sum of all blobs of all files is again 10 MB
  • files-by-contents: the deduplicated size of all files is again 10 MB
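The arithmetic above can be reproduced with a toy model. This is only a sketch and not how restic is implemented: it assumes five hypothetical 2 MiB files stored as one blob each (restic actually splits files into many blobs), backed up unchanged 13 times.

```python
MIB = 1024 * 1024

def make_snapshot():
    # Five hypothetical 2 MiB PDFs; each maps to (blob_id, size_in_bytes).
    return {f"doc{i}.pdf": (f"blob{i}", 2 * MIB) for i in range(5)}

# 13 identical snapshots, like backing up an unchanged folder 13 times.
snapshots = [make_snapshot() for _ in range(13)]

def restore_size(snaps):
    """Sum of file sizes over every snapshot, as if each were restored separately."""
    return sum(size for snap in snaps for _, size in snap.values())

def raw_data_size(snaps):
    """Total size of the unique blobs referenced by the snapshots (deduplicated)."""
    unique = {blob_id: size for snap in snaps for blob_id, size in snap.values()}
    return sum(unique.values())

print(restore_size(snapshots) // MIB)       # 130: every snapshot counted in full
print(raw_data_size(snapshots) // MIB)      # 10: shared blobs counted once
print(restore_size(snapshots[-1:]) // MIB)  # 10: a single snapshot only
```

The last line is the single-snapshot case: once only one snapshot is counted, the restore size drops back to the size of the source folder.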

It gets more interesting for single snapshots.
