Checking local file integrity against repo

My NAS just complained that the filesystem was unclean, and it ran a filesystem check. Now I want to know whether some files might be broken. I have the NAS backed up with restic. Is it possible to compare checksums or something similar against the latest snapshot in my restic repo to identify possibly broken files?

The following procedure may vary a bit from system to system, and there are lots of opportunities to optimize it (see xargs!), but hopefully it conveys the basic steps to you.

  1. Mount your restic repository: restic -r /path/to/repository mount /mnt/restic &
  2. Change directory into the latest snapshot: cd /mnt/restic/snapshots/latest
  3. Create a sha256sum checkfile: find . -type f -exec sha256sum {} \; | tee /tmp/checklist
  4. Change directory out of the restic repository and into the topmost directory where your backup begins: cd /
  5. Unmount the restic repository: umount /mnt/restic
  6. Check all the file hashes from the repo against your live filesystem: sha256sum -c /tmp/checklist
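The steps above can be wrapped into two small shell functions. This is only a rough sketch: the names make_checklist and verify_against are made up for illustration, and it assumes GNU coreutils (the --quiet option of sha256sum may be missing on BusyBox-based NAS firmware).

```shell
#!/bin/sh
# Hypothetical helpers, not part of restic.

# make_checklist SNAPSHOT_DIR CHECKLIST
# Hash every regular file below SNAPSHOT_DIR, storing paths relative to it.
# "-exec ... {} +" passes many files to each sha256sum call instead of one.
make_checklist() {
    ( cd "$1" && find . -type f -exec sha256sum {} + ) > "$2"
}

# verify_against LIVE_DIR CHECKLIST
# Re-hash the same relative paths below LIVE_DIR. With --quiet, only files
# that fail or cannot be read are printed. CHECKLIST must be an absolute
# path, because the check runs after changing into LIVE_DIR.
verify_against() {
    ( cd "$1" && sha256sum --quiet -c "$2" )
}

# Example session (paths as in the steps above):
#   restic -r /path/to/repository mount /mnt/restic &
#   make_checklist /mnt/restic/snapshots/latest /tmp/checklist
#   umount /mnt/restic
#   verify_against / /tmp/checklist
```

Files that changed legitimately since the last backup will also show up as FAILED, so the output is a list of candidates to inspect rather than a list of confirmed corruption.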

While @David's solution works well, note that it needs to download all data from your repository in order to compute the SHA256 hash of each backed-up file locally. If your repository is remote, this may take a long time and may be expensive.

An alternative would be to simply run restic backup --force (which re-reads, re-chunks and re-hashes every file) and then restic diff to compare the newly generated snapshot with any of the existing snapshots. This only needs to read the local files. However, it stores extra data in the repository if your local files have changed (e.g. been corrupted).


Thanks, it’s about 20TB in the cloud, so I think this is the route I’ll take. I can always delete the snapshot afterwards.

Why not just diff -rq the two directories (source and mounted)?

The issue with that is that the diff command will download the full file contents from the repo, which in many cases is expensive. The solution offered by @alexweiss seems to be the only available option that does not involve downloading everything from the repo. Essentially:

restic backup --force
restic diff -v <known_good_snapshot> <last_snapshot>
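If you want to script those two commands, something along these lines might work. Treat it as a sketch under assumptions: force_backup, diff_against and the RESTIC variable are made-up names, the repository and password are assumed to come from the environment (e.g. RESTIC_REPOSITORY), and the snapshot ID is scraped from restic's final "snapshot <id> saved" line, whose exact wording may differ between versions.

```shell
#!/bin/sh
# Hypothetical wrappers around the two commands above.
# RESTIC can be overridden to point at a different binary.
RESTIC=${RESTIC:-restic}

# force_backup PATH...
# Re-read every file under the given paths and print the short ID of the
# new snapshot. Assumes the last status line reads "snapshot 1a2b3c4d saved".
# Note: restic's own exit status is lost in the pipeline.
force_backup() {
    "$RESTIC" backup --force "$@" |
        sed -n 's/^snapshot \([0-9a-f][0-9a-f]*\) saved$/\1/p'
}

# diff_against KNOWN_GOOD NEW
# List the files that differ between the two snapshots.
diff_against() {
    "$RESTIC" diff -v "$1" "$2"
}

# Example session (snapshot ID and path are placeholders):
#   new=$(force_backup /path/to/data)
#   diff_against <known_good_snapshot> "$new"
```

Once you have looked at the diff, the extra snapshot can be removed again with restic forget followed by a prune, as mentioned earlier in the thread.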

Note the -v to get a list of all new/modified/deleted files.

For some more thoughts on this, here is another thread discussing a similar problem, where the idea of getting checksums for files in the repo was also discussed. That turns out not to be possible without full downloads (with the current repo format at least). So the backup-then-diff solution is the best.