This error means that the data restic loaded from the sftp server has been modified. Each file name in the repo is the SHA256 hash of that file's contents. When restic reads a complete file, e.g. when loading an index, it first checks that the hash matches the file name. In this case the hash does not match, so restic prints an error and aborts. The ID you see there (bb4c685762) is the beginning of the SHA256 hash. In our experience, this usually indicates a hardware problem: either the memory (of the server or of the machine running restic) or, more likely, the storage medium.
Can you run sha256sum on the file index/bb4c685762* on the server? Or download the file to the local machine and run it there?
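For anyone who wants to script the same comparison instead of running sha256sum by hand, here is a minimal sketch of the check described above. The function names `sha256_of_file` and `name_matches_content` are made up for this example and are not part of restic; the only assumption is the one stated above, that a repo file's name is the SHA256 hash of its contents.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def name_matches_content(path):
    """True if the file's base name equals the SHA256 hash of its contents."""
    return os.path.basename(path) == sha256_of_file(path)


if __name__ == "__main__":
    # Demo on a throwaway file; a real check would point at a downloaded
    # copy of the index file whose name starts with bb4c685762.
    with tempfile.TemporaryDirectory() as workdir:
        payload = b"example payload"
        path = os.path.join(workdir, hashlib.sha256(payload).hexdigest())
        with open(path, "wb") as handle:
            handle.write(payload)
        print(name_matches_content(path))  # name and content agree
        with open(path, "ab") as handle:
            handle.write(b"corruption")
        print(name_matches_content(path))  # content changed, name did not
```

If the function returns False for a file copied straight off the server, the file itself is damaged on disk. If it returns True there but restic still complains, the corruption is happening somewhere between the server and restic.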
In general, the index files are just an optimization and restic is able to rebuild an index from scratch just by looking at the data files. In my opinion, the first priority for you should be to find out where the error comes from, before proceeding to repair the repo.
FreeBSD freenas 11.3-RELEASE-p6 FreeBSD 11.3-RELEASE-p6 #0 r325575+d5b100edfcb(HEAD): Fri Feb 21 18:53:26 UTC 2020 root@tnbuild02.tn.ixsystems.com:/freenas-releng/freenas/_BE/objs/freenas-releng/freenas/_BE/os/sys/FreeNAS.amd64 amd64
SSH-2.0-OpenSSH_8.0-hpn14v15
It may be worth noting that I have other restic repositories on this server and they’re working fine…it’s just this one in particular that is having issues.
@cdhowie hey…you appear to be on to something. I tried a check from my Macbook and it worked!
Since this appears to be a Linux bug, shall I assume you need no issue to be created?
Thanks so much for the help!
restic -r sftp:root@172.16.5.5:/mnt/Backups/gulf2 check
using temporary cache in /var/folders/w2/t5x94yvn4f1c6cr9fy081z140000gn/T/restic-check-cache-506008388
repository 8fb8e361 opened successfully, password is correct
created new cache in /var/folders/w2/t5x94yvn4f1c6cr9fy081z140000gn/T/restic-check-cache-506008388
create exclusive lock for repository
load indexes
pack d8ce1de6 contained in several indexes: {0f8b35f2 ee080a5d}
pack 6ffc3d1a contained in several indexes: {0f8b35f2 9b0628d5}
pack 5859804f contained in several indexes: {02f363ac 066cdd94}
This is non-critical, you can run `restic rebuild-index' to correct this
check all packs
check snapshots, trees and blobs
[4:42] 50.00% 1 / 2 snapshots
[5:50] 100.00% 2 / 2 snapshots
no errors were found
Yep, I don’t think we need a bug report. It’s not a restic issue, it’s likely that kernel issue which should be corrected when you upgrade to 5.4.2+. Definitely let us know what happens after you upgrade. Confirmation that it resolved the issue would be appreciated so we can mark the thread resolved.
Just so that you are aware of it: the kernel you’re running can cause data loss. What you saw was just a bogus computation of a SHA256 hash, but it can be much more severe. The kernel causes memory corruption: Go issue, Kernel Bugtracker
It’s much more likely that Go programs are affected due to the way the Go runtime works, but it also happens with other programs. If I were you I’d try to upgrade as soon as possible, or (if that’s not possible) use a different machine for now.
I have a similar issue on CentOS7 running 3.10.0-1160.15.2.el7.x86_64. I upgraded the kernel to 5.4.104-1.el7.elrepo.x86_64 but I’m still getting the same message.
I tried restoring from that repo on my macOS and it works.
restic version
restic 0.12.0 compiled with go1.15.8 on linux/amd64
uname -a
Linux mail.my.info 5.4.104-1.el7.elrepo.x86_64 #1 SMP Mon Mar 8 16:59:45 EST 2021 x86_64 x86_64 x86_64 GNU/Linux
restic restore latest --target /tmp/restore
repository fe197b7a opened successfully, password is correct
Fatal: load <index/4de1b8a80b>: invalid data returned
restic rebuild-index
repository fe197b7a opened successfully, password is correct
loading indexes...
Fatal: load <index/4de1b8a80b>: invalid data returned
That looks like restic has cached a broken copy of an index file. Please run restic list index | grep 4de1b8a80b to get the full name of the index, then run restic cat --no-cache index 4de1b8a80b[...]. Does that produce a lot of output or just throw an error?
Please take a look at ~/.cache/restic/fe197b7a[...]/index/, that folder should contain a file whose name starts with 4de1b8a80b. Then compare the size of that file with the one in the repository. Afterwards move that file to somewhere outside the cache folder, but keep a copy of it for now. Then the restic commands should work again.
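The compare-then-move-aside steps could be scripted roughly like this. It is only a sketch: `quarantine_cached_file` is a hypothetical helper, `repo_copy_path` is assumed to be a copy of the same index file that you downloaded from the repository yourself, and `holding_dir` is any folder outside the cache.

```python
import os
import shutil


def quarantine_cached_file(cached_path, repo_copy_path, holding_dir):
    """Compare sizes, then move the cached file out of the cache.

    Returns (cached_size, repo_size, new_path) so the caller can see
    whether the cached copy was shorter than the repository copy.
    """
    cached_size = os.path.getsize(cached_path)
    repo_size = os.path.getsize(repo_copy_path)
    # Keep the suspicious file around instead of deleting it.
    os.makedirs(holding_dir, exist_ok=True)
    new_path = os.path.join(holding_dir, os.path.basename(cached_path))
    shutil.move(cached_path, new_path)
    return cached_size, repo_size, new_path
```

A cached size smaller than the repository size would point to a truncated cache entry. The moved file stays available in `holding_dir` for later inspection.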
Picking this thread back up. As you recall, I was able to upgrade my kernel and it seems the problem went away. However, the data is so large I haven’t had the opportunity to try restoring the data yet. This morning, I decided to give that a shot. However, on all platforms, when I try to mount or rebuild-index, I get this:
$ restic -r sftp:root@172.16.5.5:/mnt/Backups/gulf2 rebuild-index
repository 8fb8e361 opened successfully, password is correct
loading indexes...
Fatal: load <index/bb4c685762>: invalid data returned
However, check succeeds:
$ restic -r sftp:root@172.16.5.5:/mnt/Backups/gulf2 check
using temporary cache in /tmp/restic-check-cache-612597033
enter password for repository:
repository 8fb8e361 opened successfully, password is correct
created new cache in /tmp/restic-check-cache-612597033
create exclusive lock for repository
load indexes
pack 6ffc3d1a contained in several indexes: {0f8b35f2 9b0628d5}
pack d8ce1de6 contained in several indexes: {0f8b35f2 ee080a5d}
pack 5859804f contained in several indexes: {02f363ac 066cdd94}
This is non-critical, you can run `restic rebuild-index' to correct this
check all packs
check snapshots, trees and blobs
[2:26] 100.00% 1 / 1 snapshots
no errors were found
It’s likely that your local cache (which restic check does not use) contains a broken version of the file. Can you look for a file with a name starting with bb4c685762 in the cache, e.g. like this:
$ ls -al ~/.cache/restic/*/index/bb4c685762*
Likely the file is truncated or empty. I’d either remove the file and try again, or just remove the whole cache directory (~/.cache/restic). Restic will just rebuild the local cache as needed.
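If you would rather inspect the cache before deleting anything, a small script along these lines can list the candidates. This is a sketch under two assumptions: that cached files are byte-for-byte copies of the repository files (so their names should also match their SHA256 hashes), and that walking the whole cache directory is acceptable, since the exact sub-folder layout may differ between restic versions. `find_cached_files` is a made-up name.

```python
import hashlib
import os


def find_cached_files(cache_root, id_prefix):
    """List cached files whose name starts with `id_prefix`.

    Returns a list of (path, size_in_bytes, hash_matches_name) tuples.
    """
    report = []
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for name in sorted(filenames):
            if not name.startswith(id_prefix):
                continue
            path = os.path.join(dirpath, name)
            # Index files are small enough to read in one go.
            with open(path, "rb") as handle:
                data = handle.read()
            matches = hashlib.sha256(data).hexdigest() == name
            report.append((path, len(data), matches))
    return report
```

Calling `find_cached_files(os.path.expanduser("~/.cache/restic"), "bb4c685762")` would then show each match with its size. An entry with size 0 or a failed hash match is the broken copy to remove.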
I get the same issue after force-exiting the prune stage (because I realized I had made a mistake and too much would be pruned). The checksums of 4 index files on the (sftp) server are now invalid. Is it expected that killing prune will corrupt index files? I would have thought it writes temp files and renames them afterwards. Anyway, is there some way to fix this, or should I start over with a clean backup? Fortunately it's not a huge problem if that's not possible, but I was wondering what happens if the prune stage gets killed under abnormal circumstances (e.g. OOM or power outage). Should that be safe?
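For reference, the temp-file-then-rename pattern mentioned in this post looks roughly like this on a local filesystem. It only illustrates the general technique. It says nothing about how restic or an SFTP backend actually writes files, and `atomic_write` is a hypothetical helper.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write `data` to `path` so readers see the old or the new file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory (same filesystem)
    # for the final rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace overwrites the target in a single rename step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial temp file if anything went wrong.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this pattern, a process killed halfway through leaves at most a stray temp file behind, and the file under its final name stays intact. Whether that guarantee is available depends on the storage backend supporting an atomic rename.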
Yes, I’m using 0.12.0. I’ve managed to fix this by restoring the snapshots and index dirs from a ZFS snapshot, which fortunately is accessible through SFTP and is much faster to download and upload than the whole restic data dir (stupid SFTP doesn’t support renaming/moving, and this is the only way I can access this backup target). Prune and check worked afterwards and I only lost a single day, so it was an interesting experiment as well. It wouldn’t have been a disaster if it hadn’t worked.
It’s still somewhat concerning that it got corrupted simply by interrupting the prune stage.