Trying to understand compression and stats

Hello everyone. I continue learning and using Restic. :grinning:

I'm running restic 0.15.2 (compiled with go1.20.3) on windows/amd64.

I make a daily backup of a folder of about 50 GB. Inside that folder, an accounting program writes its own daily backup, and curiously, that backup is about 1 GB each time.
In other words, a new 1 GB file is added every day, and I assume it differs very little from the previous day's file. I say this because it is an accounting program, so it only stores entries with numeric and alphanumeric information, not images or anything like that.

I don’t understand why an accounting program backs up so much information, but that is out of my hands, so I leave it as is.

The rest of the folder holds the usual things: documents, spreadsheets, images.

I am running restic with a command like:

restic -r rclone:backup:server --verbose backup --use-fs-snapshot --files-from "resticfilestobackup.txt" --password-file "resticpassword.txt"

That is, with the default compression level (auto), I suppose.

Running:

restic -r rclone:backup:server stats

repository c37257f6 opened (version 2, compression level auto)
scanning…
Stats in restore-size mode:
Snapshots processed: 23
Total File Count: 4038383
Total Size: 4.387 TiB

and running

restic -r rclone:backup:server stats --mode raw-data

repository c37257f6 opened (version 2, compression level auto)
scanning…
Stats in raw-data mode:
Snapshots processed: 23
Total Blob Count: 104666
Total Uncompressed Size: 74.170 GiB
Total Size: 31.104 GiB
Compression Progress: 100.00%
Compression Ratio: 2.38x
Compression Space Saving: 58.06%

1) If I interpret the reports correctly, the first one tells me that if I downloaded and decompressed the entire backup, it would take up about 4 TB. Is that right?

And the second report tells me that the backup in the repository takes up only about 31 GiB, with a compression ratio of 2.38x.
(The repo is on Google Drive by the way)

2) Could it be that, since there are many 1 GB files that differ very little from one another, restic achieves those very high compression rates?

  1. Yes. That is the total size you would get if you restored all 23 snapshots. The ratio is 2.38x because 74.170 GiB is compressed to 31.104 GiB.

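  2. You have 23 snapshots, so on average each snapshot has a restore size of about 4.387 TiB / 23 ≈ 195 GiB. But apparently most of that data is identical between snapshots, because restic only has to store 74 GiB of unique data before compression. That reduction comes from deduplication; compression then shrinks the 74 GiB to 31 GiB.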

Simplified example: if your files have a size of 100 GB and you take 4 snapshots, the restored size is 400 GB, the uncompressed size is perhaps 80 GB (down from 100 GB, because some file contents are similar and the snapshots are very similar) and the compressed size is 40 GB with a compression ratio of 2 (=80/40).
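To make the arithmetic concrete, here is a minimal Python sketch that applies the same three-number reasoning to the figures quoted in this thread and to the simplified example. The function name `summarize` and the key `dedup_factor` are made up for illustration and are not something restic reports. The input numbers are copied by hand from the `stats` output above, with TiB converted to GiB.

```python
# Figures copied from the two `restic stats` outputs quoted in this thread.
GIB_PER_TIB = 1024


def summarize(restore_gib, uncompressed_gib, stored_gib, snapshots):
    """Relate the three sizes discussed above (all values in GiB or GB).

    restore_gib      -- "Total Size" in restore-size mode (all snapshots restored)
    uncompressed_gib -- "Total Uncompressed Size" in raw-data mode (unique data)
    stored_gib       -- "Total Size" in raw-data mode (what the repo stores)
    """
    return {
        # Average restore size of one snapshot.
        "avg_snapshot_gib": restore_gib / snapshots,
        # Hypothetical name: how much smaller the unique data is than the
        # sum of all snapshots. This is deduplication, not compression.
        "dedup_factor": restore_gib / uncompressed_gib,
        # The same quantities restic prints as ratio and space saving.
        "compression_ratio": uncompressed_gib / stored_gib,
        "space_saving_pct": (1 - stored_gib / uncompressed_gib) * 100,
    }


real = summarize(4.387 * GIB_PER_TIB, 74.170, 31.104, 23)
toy = summarize(400, 80, 40, 4)  # the simplified example: 4 snapshots of 100 GB

for name, result in (("thread", real), ("simplified", toy)):
    print(name, {key: round(value, 2) for key, value in result.items()})
```

With the thread's numbers this reproduces the reported 2.38x ratio and 58.06% space saving, and it shows that most of the reduction from 4.387 TiB happens before compression is even applied.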

Thanks a lot, @noeck!

The more I learn about using Restic and how it compares with the alternatives, the more I like it.
