Let’s say I’m backing up my family photos. I’m not going to look at them often and they’re unlikely to change frequently (occasionally I might update the metadata to make them easier to manage), but I do want to keep them long term.
When I take new photos I put them in the photos directory on my computer, which is automatically and periodically backed up using restic.
So far so good, I have the backup.
My question is, what if there are bad sectors on the HDD on my computer and some of the photos become corrupted?
Presumably restic sees the file has changed and takes a snapshot of it.
This isn’t an issue because restic still has the snapshot of the original version.
However, if I’m keeping, for example, a year’s worth of snapshots, if I don’t notice the corruption within a year, the corrupted version of the photo is now the only version in the backup!
How can I avoid this without keeping infinite snapshots? Can restic detect/warn when this happens?
So my questions are:
Is my hypothesis here correct? Will restic copy the corrupted file into the backup (presumably yes, unless it only looks at mtimes to decide what’s changed)?
Is there any way of avoiding this?
Can restic detect this or do I need some further automation to look for this kind of problem?
My thinking is:
Read the mtime of the file in the snapshot
Read the mtime of the file on the disk
Checksum the file in the snapshot
Checksum the file on the disk
If the mtimes are the same but the checksum is different display a warning that something might have become corrupted
Is there any way to get restic to do this? Does it do it already?
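The check described above could be sketched roughly as follows. This covers only the checksum half: it hashes the file on disk and the copy that `restic dump` streams from the latest snapshot, and warns if they differ. The mtime comparison is left out, so a file you have deliberately edited since the last backup will also be reported. `REPO` and the function names are placeholders, `sha256sum` is assumed to be available (GNU coreutils), and the path must be given as it is stored in the snapshot (normally the absolute path).

```shell
#!/bin/sh
# Sketch only: compare a file on disk with its copy in the latest snapshot.
# REPO is a placeholder repository path; override it in the environment.
REPO="${REPO:-./backup}"

# Read data on stdin and print only its SHA-256 hex digest.
sha256_of_stdin() {
    sha256sum | cut -d ' ' -f 1
}

# check_file PATH
# Returns 0 if the on-disk file matches the snapshot copy, 1 if it differs,
# 2 if the on-disk file cannot be read.
check_file() {
    file=$1
    disk_sum=$(sha256_of_stdin < "$file") || return 2
    # If restic fails here, the digest is that of empty input, which will
    # also be reported as a mismatch.
    snap_sum=$(restic -r "$REPO" dump latest "$file" | sha256_of_stdin)
    if [ "$disk_sum" != "$snap_sum" ]; then
        echo "WARNING: $file differs from the copy in the latest snapshot" >&2
        return 1
    fi
    echo "OK: $file"
}
```

Calling `check_file /home/me/photos/img_0001.jpg` (a made-up path) reads the whole file from both the disk and the repository, so running it over a large photo collection will be slow.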
When it has a parent snapshot, restic uses a combination of multiple pieces of metadata to detect if the file should be re-hashed. If none of that metadata has changed, restic skips the file.
My suspicion is that restic would not rehash the file in this specific case since none of the metadata would have changed. The corrupt data would therefore not be added to the repository.
mkdir restictest
cd restictest
mkdir backup
mkdir source
# create a file with random contents
cat /dev/urandom | head -c 120000 > ./source/original
# Set specific atime/mtime so it can be set to this again after modification
touch -d '2 Dec 2019 15:00:00.00' ./source/original
restic init --repo ./backup
restic -r ./backup backup ./source
# update the file contents
cat /dev/urandom | head -c 120000 > ./source/original
# reset the modification/access time so the file looks the same
touch -d '2 Dec 2019 15:00:00.00' ./source/original
# create the second snapshot
restic -r ./backup backup ./source
output:
repository df6c50bc opened successfully, password is correct
Files: 0 new, 1 changed, 0 unmodified
Dirs: 0 new, 0 changed, 0 unmodified
Added to the repo: 117.539 KiB
processed 1 files, 117.188 KiB in 0:00
snapshot 2d569e4e saved
So it is storing the updated file.
I realize that literally every byte in the file is being changed here but the updated version is being stored in the backup despite both files being the same size and mtime.
It would be useful to know exactly what restic does to calculate whether a file should be updated or not.
My test does not show the same result after dumping random data into the original file and resetting the mtime:
$ restic -r repo/ backup dir
enter password for repository:
repository fe47b2e5 opened successfully, password is correct
Files: 0 new, 0 changed, 1 unmodified
Dirs: 0 new, 0 changed, 0 unmodified
Added to the repo: 0 B
processed 1 files, 117.188 KiB in 0:00
snapshot de7ca469 saved
There is something else happening here. Can you run the test again, running ls -li ./source/original after setting the mtime, and showing the complete transcript of the test? Can you also confirm the operating system / distribution as well as the filesystem holding the source data, and the mount options for that volume?
The behavior is different because of a bug fix in 0.9.6 for the case where Excel resets a file’s mtime, which caused restic to miss that the file had changed. Ctime is checked in 0.9.6 but was not in 0.9.5.
However, keep in mind that corruption of the file contents changes neither the mtime nor the ctime, unless the metadata itself is what was corrupted. In that case the file’s inode is probably damaged, and you’ll get errors from the filesystem driver in the kernel log as well as I/O errors returned to restic.
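The mtime/ctime difference can be seen with a small sketch like the one below. It assumes GNU coreutils on Linux (`stat -c`, `touch -d`), and the function names are made up for illustration. It overwrites a temporary file, resets the mtime the same way the test above does, and prints both timestamps before and after.

```shell
#!/bin/sh
# Sketch only: show that rewriting a file and resetting its mtime with
# touch still moves the ctime. Assumes GNU stat and GNU touch.

# Print "<mtime in epoch seconds> <ctime with nanoseconds>" for a file.
times_of() {
    stat -c '%Y %z' "$1"
}

demo_mtime_vs_ctime() {
    f=$(mktemp)
    head -c 1000 /dev/urandom > "$f"
    touch -d '2019-12-02 15:00:00' "$f"
    before=$(times_of "$f")
    sleep 1    # let the clock advance between the two writes
    head -c 1000 /dev/urandom > "$f"
    touch -d '2019-12-02 15:00:00' "$f"
    after=$(times_of "$f")
    echo "before: $before"
    echo "after:  $after"
    rm -f "$f"
}
```

Running `demo_mtime_vs_ctime` should print the same mtime on both lines and a later ctime on the second, because the write and the `touch` both update ctime and it cannot be set back from user space. Corruption from bad sectors does not go through a write at all, so it leaves both timestamps alone.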
One possible solution would be to have the system mail you a diff of the new snapshot to the prior one after each backup. This is the basic command you’d use (modify the snapshots invocation with --host, --path, and/or --tag as required):
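A minimal sketch of such a diff, not necessarily the exact command meant above: it picks the two most recent snapshot IDs out of the default `restic snapshots` table and passes them to `restic diff`. `REPO` and the function names are placeholders, and it assumes that snapshot rows start with an 8-character hex ID and are listed oldest first; parsing `restic snapshots --json` would be more robust.

```shell
#!/bin/sh
# Sketch only: diff the two most recent snapshots after a backup.
# REPO is a placeholder repository path; override it in the environment.
REPO="${REPO:-./backup}"

# Print the short IDs of the two newest snapshots, oldest first.
# Extra arguments (e.g. --host, --path, --tag) go to `restic snapshots`.
# Assumes the default table output: one row per snapshot, starting with
# an 8-character hex ID, sorted by time.
last_two_snapshots() {
    restic -r "$REPO" snapshots "$@" |
        awk '$1 ~ /^[0-9a-f]+$/ && length($1) == 8 { print $1 }' |
        tail -n 2
}

# Run `restic diff` on the previous and the latest snapshot.
diff_last_two() {
    ids=$(last_two_snapshots "$@")
    prev=$(printf '%s\n' "$ids" | sed -n 1p)
    last=$(printf '%s\n' "$ids" | sed -n 2p)
    if [ -z "$last" ]; then
        echo "need at least two snapshots to diff" >&2
        return 1
    fi
    restic -r "$REPO" diff "$prev" "$last"
}
```

Extra arguments are passed through to `restic snapshots`, for example `diff_last_two --host myhost`; piping the output into your mail command of choice gives the per-backup report.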
Thank you. Yes, this is the behaviour I had hoped for: if only the contents are corrupted, they are not copied to the backup. As you say, metadata corruption should be a lot more obvious.
I just wanted clarification that this was the case and a better understanding of how restic works behind the scenes.
This is a really nice idea just for a bit of extra peace of mind, thanks!