How to check if files were compressed?

Is there a way to check if files were compressed at a given compression level other than comparing the size of the repo before/after enabling compression? I set compression to max via environment variable before running my backup, and I want to make sure it actually applied.

Thanks!


Yeah. I currently have exactly the same question. I migrated a v1 repo to v2, set the “RESTIC_COMPRESSION” environment variable to “max”, and re-compressed using the prune subcommand with the “--max-repack-size” option. I found no way to prove whether restic really used the max compression setting, or whether compression was used at all.

Would it be possible to extend “restic stats” to include information on compressed blocks? Method and compression level would be nice.


Interesting question. I am not aware of a way to see this, and honestly I don’t think restic stores information about which level of compression was used in the snapshots, but I could be wrong!

If it’s not simple or possible to get that information out of a stats command, what do you peeps think about restic simply outputting which compression level is used, during the backup run?

On a note regarding use case - why are you pondering this in the first place? Has there been an actual problem of some sort?

AFAIK, the compression level is merely a parameter for how much “work” the zstd algorithm invests in producing the compressed data; you cannot determine from the result (or by decompressing it) how much work was invested.

restic does not store any information about the compression level. So this simply cannot be shown or determined (except by taking the uncompressed data for each blob and trying to recompress it using several compression levels).

However, the following information is available per blob:

  • whether the blob is compressed or not
  • the uncompressed size
  • the compressed size
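
The recompression idea mentioned above can be sketched roughly as follows. This is only an illustration, not how restic works internally: Python's standard-library zlib stands in for zstd (which restic actually uses), the helper name `guess_level` and the sample blob are hypothetical, and the stored size is assumed to be the plain compressed size without any encryption overhead.

```python
import zlib

def guess_level(data: bytes, stored_size: int, levels=range(1, 10)):
    """Recompress `data` at each level and report which levels reproduce
    `stored_size`. zlib is a stand-in for zstd here (assumption)."""
    sizes = {level: len(zlib.compress(data, level)) for level in levels}
    matches = [level for level, size in sizes.items() if size == stored_size]
    return sizes, matches

# Hypothetical blob content; a real check would use the decrypted,
# uncompressed blob data from the repository.
blob = b"restic stores deduplicated, compressed blobs. " * 2000

# Pretend this is the compressed size recorded for the blob.
stored = len(zlib.compress(blob, 9))

sizes, matches = guess_level(blob, stored)
for level, size in sizes.items():
    print(f"level {level}: {size} bytes")
print("levels matching the stored size:", matches)
```

Several levels can produce the same size for a given blob, and a different compressor version may produce different output at the same level, so even this brute-force approach gives a hint at best.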

I’d love to be able to do a “restic stats --mode raw-data” and “restic stats --raw-data-uncompressed” or something to that effect, to be able to see the compression’s effect.

I’ll often tell my users when restic dedupes say, 500GB to 250GB, so they know they need to run a duplicate finder. But now the results are a little exaggerated, unless --mode raw-data actually tells the uncompressed size? I’m not quite sure how it works anymore.

Please have a look at the pull request “restic stats: print uncompressed size in mode raw-data” by plumbeo (Pull Request #3915 in restic/restic on GitHub). Does that PR add the information you’re interested in?

raw-data is the compressed size for a V2 repo.


Ahh perfect, I’ll be following that one. :+1:

Cool, I thought so, but I wasn’t entirely sure. Thanks!

I think that for the stats subcommand, showing the uncompressed and compressed sizes is enough. For the actual backup, re-packing etc., where variables like “RESTIC_COMPRESSION” apply, it would still be good to be able to see whether a compression level was picked up by the command and which one, at least when a verbose option is used. The same problem might apply to all environment variables.

The use cases are definitely related to migration. The user wants to see whether the selected compression level was correctly applied. I searched the source code a bit to see what actually gets stored in the repo. It seems that the compression algorithm and level used are not stored. So if an archive was re-packed with “auto”, it cannot be automatically upgraded to “max” later. Likewise, if an additional compressor gets added in the future, it will be difficult to support without repo changes. In most cases a simple re-upload is easier anyway, so no big deal in the end.

Thanks for the work. The feature in general is really awesome and restic is now really the program of choice!

For me the output of a verbose run is okay:

Added to the repository: 7.960 GiB (2.877 GiB stored)
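
If you want to turn that line into a number, a small script along these lines might do. It is a minimal sketch that assumes the summary line has exactly the format quoted above; the regular expression and the helper name `stored_ratio` are my own, not something restic provides.

```python
import re

# Binary unit multipliers as they appear in the quoted summary line.
UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}

# Assumed line format: "Added to the repository: X UNIT (Y UNIT stored)"
LINE_RE = re.compile(
    r"Added to the repository:\s+([\d.]+)\s+(\w+)\s+\(([\d.]+)\s+(\w+)\s+stored\)"
)

def stored_ratio(line: str) -> float:
    """Return stored size divided by raw size for one summary line."""
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError("line does not look like the summary quoted above")
    raw = float(m.group(1)) * UNITS[m.group(2)]
    stored = float(m.group(3)) * UNITS[m.group(4)]
    return stored / raw

ratio = stored_ratio("Added to the repository: 7.960 GiB (2.877 GiB stored)")
print(f"stored/raw = {ratio:.3f}")  # roughly 0.36 for the line above
```

For the run quoted above, that works out to about 36% of the raw size actually being stored. It says nothing about which level was used, only that compression had an effect.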