Verifying restore

neclepsio · December 28, 2023, 12:19pm

Hi all.

I just restored my backup to a new hard disk, without using the --verify flag. It took many hours.

Now I’d like to verify that the restored data and the backup are the same (I had data corruption before, I think because of McAfee/Trellix, which I cannot disable). As I understand it, running a new restore --verify overwrites the files I already restored, and I’d like to avoid this: it would take much longer than just reading the restored data and comparing it to the checksums in the backup.

The only way I think I have found is to run backup with --verbose twice and --dry-run, then analyse the enormous log to find out whether anything would have been written to the repository (I understand that even if the metadata are the same, checksums of the data are used to compare).

Is this really the best way?

alexweiss · December 28, 2023, 7:24pm

First, AFAIK, restic doesn’t support comparing local content with the content of a snapshot - except directly after a restore by using the --verify option, which you didn’t use.

So, the idea to mis-use the backup command is not bad in theory. The idea is to specify the snapshot as --parent (or let restic auto-select the right one) and check if backup identifies no change - that would mean that the snapshot is identical to the data on-disc.

This, however, has several problems. First, the path of the local data has to match the snapshot path exactly (I don’t know if it does in your setup…). Second, when running backup with a parent, files are assumed to be unchanged if the file metadata (especially timestamps like mtime) match the snapshot. That means a “successful” check doesn’t check whether the file contents really match. (And note, there is --force, but it unsets the parent, so backup reads everything but no longer compares against an existing snapshot.)

(As the author of rustic,) I recommend using rustic, which can do the check. Both rustic diff --metadata <SNAPSHOT> /local/dir and rustic restore --verify-existing --dry-run <SNAPSHOT> /local/dir are able to verify that your local content is fine.

neclepsio · December 28, 2023, 9:20pm

Thank you! The rustic diff command is not applicable, because I restored several snapshots into the same directory. rustic restore, on the other hand, seems to hang in the “collecting file information” phase.

Maybe I will try restic backup --force and filter the log for data added to the repository: if something is added, it means the content is different, I suppose.

neclepsio · December 28, 2023, 9:25pm

rustic diff --metadata <SNAPSHOT> /local/dir | grep '^[^+]' could do the trick.

alexweiss · December 28, 2023, 9:30pm

Note that rustic diff is also able to diff subdirs: rustic diff <SNAPSHOT>:/subdir /local/dir/subdir. About rustic restore: Yes, during the collecting file information phase all existing file contents are read and checked. This will take some time for a lot of existing files…

Right. But note that if nothing is added, this doesn’t mean at all that your local contents are correct. It just means that all blobs of your local contents are somehow present in the repository. For instance, if the file contents of two files are simply exchanged, backup will report “nothing added”. If all files (wrongly) contain large areas of zero bytes, there is a high probability that backup will report “nothing added”. There are other, more involved cases which could lead to “nothing added” but still mean your content is totally messed up…
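To make the swapped-files case concrete, here is a small sketch. It uses whole-file SHA-256 hashes as a stand-in for restic's content-addressed blobs, which is a simplification (restic hashes chunks, not whole files). The directory names and the hash_set and path_hashes helpers are made up for illustration and are not part of restic or rustic.

```shell
#!/bin/bash
# Sketch: "no new blobs" versus a per-path comparison (hypothetical helpers).
set -eu

work=$(mktemp -d)
mkdir -p "$work/snapshot" "$work/restored"

# The "snapshot" holds two files ...
printf 'contents of A\n' > "$work/snapshot/a.txt"
printf 'contents of B\n' > "$work/snapshot/b.txt"
# ... and the "restored" tree has their contents exchanged.
printf 'contents of B\n' > "$work/restored/a.txt"
printf 'contents of A\n' > "$work/restored/b.txt"

# hash_set DIR: sorted, de-duplicated content hashes; paths are ignored.
hash_set() {
    (cd "$1" && find . -type f -exec sha256sum {} + | cut -d' ' -f1 | sort -u)
}

# path_hashes DIR: "hash  path" lines, sorted by path.
path_hashes() {
    (cd "$1" && find . -type f -exec sha256sum {} + | sort -k2)
}

hash_set "$work/snapshot" > "$work/snapshot.set"
hash_set "$work/restored" > "$work/restored.set"

# Hashes present in the restored tree but unknown to the snapshot:
# roughly what "data added" would count in this simplified model.
new_blobs=$(comm -13 "$work/snapshot.set" "$work/restored.set" | wc -l)
echo "new blobs: $new_blobs"

path_hashes "$work/snapshot" > "$work/snapshot.paths"
path_hashes "$work/restored" > "$work/restored.paths"

# Comparing hashes path by path does notice the exchange.
if diff "$work/snapshot.paths" "$work/restored.paths" > /dev/null; then
    per_path=match
else
    per_path=mismatch
fi
echo "per-path comparison: $per_path"
```

In this model no new blob shows up even though both files are wrong, while the per-path comparison reports a mismatch.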

sc2maha · December 29, 2023, 11:43am

I’ve been using a variant of this from my borg days. It relies on (a) the backup tool producing a tar stream and (b) GNU tar, which has a neat --to-command option.

The downside is this forks for each file, which may be a show stopper (it is, for one of my datasets, but not for most of them).

First, put this in a file, call it “tar-md5” or something, and put it in the PATH:

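#!/bin/bash
# Reads one file's contents on stdin (as fed by GNU tar --to-command)
# and prints an md5sum-style line for it.

md5=$(md5sum)
md5=${md5%% *}   # keep only the hash, drop the trailing " -"

printf "%s  %s\n" "$md5" "$TAR_FILENAME"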

Now, run restic dump with your choice of snapshotid and path, then pipe that to

| tar -xf - --to-command=tar-md5 > output.hashes.txt

Now run md5sum -c output.hashes.txt in the directory that you previously restored (or in my case, since I verify after a backup, the source directory).
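For anyone who wants to try the mechanics without a repository, here is a self-contained sketch of the same pipeline. It is an illustration under assumptions: a plain tar archive of a scratch directory stands in for the output of restic dump, the helper is written to a temporary directory instead of the PATH, and all file names are invented. It needs GNU tar and md5sum.

```shell
#!/bin/bash
# Sketch of the hash-list check on throwaway data (no restic involved).
set -eu

work=$(mktemp -d)
mkdir -p "$work/source/data" "$work/restored/data"
printf 'hello\n' > "$work/source/data/a.txt"
printf 'world\n' > "$work/source/data/b.txt"
# Simulate a restore in which one file came back damaged.
printf 'hello\n' > "$work/restored/data/a.txt"
printf 'w0rld\n' > "$work/restored/data/b.txt"

# Same logic as the helper from the post, stored in the scratch directory.
cat > "$work/tar-md5" <<'EOF'
#!/bin/bash
md5=$(md5sum)
printf '%s  %s\n' "${md5%% *}" "$TAR_FILENAME"
EOF
chmod +x "$work/tar-md5"

# Plain tar stands in for the dump; GNU tar runs the helper once per
# regular file and passes the member name in TAR_FILENAME.
tar -C "$work/source" -cf - data \
    | tar -xf - --to-command="$work/tar-md5" > "$work/output.hashes.txt"

# Check the "restored" tree against the hash list.
# md5sum -c exits non-zero if any file fails the check.
if (cd "$work/restored" \
        && md5sum -c "$work/output.hashes.txt" > "$work/report.txt" 2>/dev/null); then
    verdict=identical
else
    verdict=differs
fi
cat "$work/report.txt"
echo "verdict: $verdict"
```

Note that the paths in the hash list are the member names from the tar stream, so md5sum -c has to be run from a directory where those relative paths resolve.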

alexweiss · December 29, 2023, 3:00pm

@sc2maha The downside of your self-made compare is that it has to read both the local data and all data from the snapshot (and does so quite inefficiently). Compared to another restore, the only benefit is that it doesn’t need scratch space; the downside is that another restore would be much faster than your dump. If you have restic mount available, using it and running the Unix diff command would be much easier and still better performance-wise.
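A minimal sketch of that mount-and-diff idea follows. The compare_trees function and the directory names are hypothetical. In real use the first argument would be a snapshot directory below the mount point of restic mount; here two scratch directories stand in for it so the sketch runs without a repository.

```shell
#!/bin/bash
# Sketch: recursive comparison of a (stand-in) mounted snapshot and a local tree.
set -eu

# compare_trees SNAPSHOT_DIR LOCAL_DIR
# Prints one line per differing or missing file and returns diff's exit
# status (0 = no differences found, 1 = differences found).
compare_trees() {
    diff -rq "$1" "$2"
}

work=$(mktemp -d)
mkdir -p "$work/mounted/docs" "$work/local/docs"
printf 'same\n'     > "$work/mounted/docs/ok.txt"
printf 'same\n'     > "$work/local/docs/ok.txt"
printf 'original\n' > "$work/mounted/docs/bad.txt"
printf 'damaged\n'  > "$work/local/docs/bad.txt"

if compare_trees "$work/mounted" "$work/local" > "$work/diff.txt"; then
    result=identical
else
    result=differs
fi
cat "$work/diff.txt"
echo "result: $result"
```

diff compares file contents only, so differences in metadata such as timestamps or permissions are not reported by this check.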

The solutions I offered all use the fact that during the backup all data is already hashed (more precisely: cut into chunks, and all chunks are hashed) and that, using these stored hashes, there is no need to retrieve any additional data from the repository. They are not only much faster while still not needing scratch space, but also don’t incur extra costs if your repository is in a cloud with access fees. (Actually, they even work with cold storage without needing to warm up the cold backup data…)

sc2maha · January 12, 2024, 8:53am

My restore method ensures restorability. With complex software, the fact that the tool has all the data within its files is not always a guarantee that some bug won’t cause a glitch. I need a way to know what I get if I restore using dump.

Side note: I don’t use only restic, I use other backup tools also. That md5sum of the source disk gets used more than once.

I respect your right, as the author of rustic (if I am reading the thread correctly) to proclaim yours as the best solution, but for critical functions like backup I have learned from past experience only to use tools that are in the repositories of at least 2 of the 3 major distros I use across my machines.

I’m sure rustic will get there, and at that point I will probably take another look.

PS: mount is often slower due to fuse on some of my machines, I have not had time to dig into this.

alexweiss · January 12, 2024, 1:26pm

@sc2maha I don’t want to proclaim a best solution; I just want to discuss advantages and disadvantages of different solutions such that everyone can choose what is the best solution for them. And if your solution works for you or others, great!

I fully agree with you that for backups you shouldn’t rely on just some backup software telling you the backup is fine - you have to make sure yourself that it in fact is, by doing restore tests!

The topic of this thread was verifying an already completed restore. And I think the setting was that the restored data is not available anywhere else (otherwise a simple diff would have been enough to verify the restore), so we want to compare the restored data against the repository.
So here I actually think that - besides the discussed advantages of each solution - using a different tool than the one used to restore gives you additional assurance that the data matches the repository.