I have a few tens of GB of data in a restic repository. Every time I run
restic check --read-data
I find numerous data integrity errors of various sorts. My understanding is that bit flips should be rare, and I am not sure bit flips could be the cause of these errors.
I ran “memtester” and “smartctl” and found no hardware faults. I run restic 0.12.1 on Ubuntu 20.04. The client’s file system is ext4 on an SSD, the destination file system is btrfs with RAID (so no chance of bit flips on the remote), and the backend is SFTP. None of the machines uses ECC RAM.
Here are several types of errors I frequently encounter:
- Pack ID does not match, want 22e17e03, got 1f564ae7
- Pack 70133809 contains 1 errors: [Blob ID does not match, want 743eae2c, got c23d20d1]
- Error for tree 5eb7199f:pshots
  invalid character ‘\r’ in string literal
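To keep track of which packs, blobs and trees are affected across runs, the IDs can be pulled out of the saved check output. This is a minimal sketch, assuming the error lines are worded exactly like the three samples above; the patterns and the function name `extract_ids` are my own and not part of restic, and other restic versions may phrase these messages differently.

```python
import re

# Patterns modeled only on the sample messages quoted in this post
# (assumption: the wording is stable for this restic version).
PACK_MISMATCH = re.compile(r"Pack ID does not match, want ([0-9a-f]+), got ([0-9a-f]+)")
PACK_ERRORS = re.compile(r"Pack ([0-9a-f]+) contains \d+ errors")
BLOB_MISMATCH = re.compile(r"Blob ID does not match, want ([0-9a-f]+), got ([0-9a-f]+)")
TREE_ERROR = re.compile(r"Error for tree ([0-9a-f]+)")


def extract_ids(check_output):
    """Collect the expected ("want") IDs mentioned in check error lines."""
    packs, blobs, trees = set(), set(), set()
    for line in check_output.splitlines():
        for match in PACK_MISMATCH.finditer(line):
            packs.add(match.group(1))  # the ID the pack was expected to have
        for match in PACK_ERRORS.finditer(line):
            packs.add(match.group(1))
        for match in BLOB_MISMATCH.finditer(line):
            blobs.add(match.group(1))  # the ID the blob was expected to have
        for match in TREE_ERROR.finditer(line):
            trees.add(match.group(1))
    return {"packs": sorted(packs), "blobs": sorted(blobs), "trees": sorted(trees)}


if __name__ == "__main__":
    sample = (
        "Pack ID does not match, want 22e17e03, got 1f564ae7\n"
        "Pack 70133809 contains 1 errors: "
        "[Blob ID does not match, want 743eae2c, got c23d20d1]\n"
        "Error for tree 5eb7199f:\n"
    )
    print(extract_ids(sample))
```

Comparing the resulting sets between two runs would show whether the same IDs fail every time or whether the failures move around, which seems relevant to telling stored corruption apart from read-time corruption.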
Pack and blob hash mismatches (the first two errors) are very common. Usually I get a handful of them in a few tens of GB. Strangely, in a TB-sized repository, I get over 30 of these types of errors (but this could be a special case that I suspect is due to permission problems).
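As far as I understand, a pack file is named after the SHA-256 hash of its contents, so "Pack ID does not match" means the bytes read back hash to something other than the file name. A minimal sketch for checking one pack file by hand follows; the function name `pack_matches_name` is hypothetical, and the path passed to it would be a pack file under the repository's data directory.

```python
import hashlib
from pathlib import Path


def pack_matches_name(path, chunk_size=1 << 20):
    """Hash a file with SHA-256 and compare the digest to its file name.

    Returns (matches, hex_digest). Reads in chunks so large packs do not
    have to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    actual = digest.hexdigest()
    return actual == Path(path).name, actual


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python check_pack.py /path/to/repo/data/xx/<pack-id>
    for arg in sys.argv[1:]:
        ok, actual = pack_matches_name(arg)
        print(("OK   " if ok else "BAD  ") + arg + " sha256=" + actual)
```

Running this directly on the server against the file on btrfs, and again on a copy fetched over SFTP, would show whether the stored file is already wrong or whether the mismatch only appears after transfer to the client.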
I have several questions.
- What are the usual causes of these errors?
My initial guesses are: network interruptions; backup interruptions; background processes (such as the Dropbox daemon or a browser) writing to the file system while restic is backing up; permission problems that prevent restic from reading some files in the source or destination and lead it to print a misleading or unhelpful error message; bugs in restic itself; and bit flips due to faulty hardware.
My guess is that these errors also occur with any data transfer tool, such as rsync, but go unnoticed there. But I am not sure.
- How do I debug these errors?
For each error, how can I find the directories and files affected? So far, I am using the following commands.
restic -r repo find --pack 70133809
restic -r repo cat pack 70133809
restic -r repo cat blob 1e514aa8
restic -r repo find --show-pack-id --tree 80b7199
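Typing these by hand for every failing ID gets tedious, so the same four invocations can be generated from lists of IDs. This is only a sketch: it uses the subcommands and flags exactly as I listed them above, the helper name `debug_commands` is made up, and it prints the commands rather than running them.

```python
import shlex


def debug_commands(repo, packs=(), blobs=(), trees=()):
    """Build the restic invocations listed above for each affected ID."""
    commands = []
    for pack_id in packs:
        # Which blobs and files live in this pack, then the raw pack itself.
        commands.append(["restic", "-r", repo, "find", "--pack", pack_id])
        commands.append(["restic", "-r", repo, "cat", "pack", pack_id])
    for blob_id in blobs:
        commands.append(["restic", "-r", repo, "cat", "blob", blob_id])
    for tree_id in trees:
        commands.append(
            ["restic", "-r", repo, "find", "--show-pack-id", "--tree", tree_id]
        )
    return commands


if __name__ == "__main__":
    for command in debug_commands(
        "repo", packs=["70133809"], blobs=["1e514aa8"], trees=["80b7199"]
    ):
        # shlex.join quotes arguments safely for pasting into a shell.
        print(shlex.join(command))
```

Each printed line can be pasted into a shell, or the argument lists could be handed to `subprocess.run` once the output looks right.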
Playing with these commands, I sometimes find the affected file. Usually they are unusual files, like dot files in Dropbox, files with very long names such as saved HTML files, and sometimes videos or MP3s.
What is the step-by-step way to debug restic errors?
It would be good to have a utility or guide for debugging restic errors. You would run something like “restic find error PACK-ID” and it would provide useful information.