Redownload Pack on Failed Verification

--read-data should bail on corrupted / incomplete packs much earlier: Each pack file is downloaded using repository.DownloadAndHash which verifies the hash of the pack file before even checking the individual blobs. As the data verifies to be correct when check is run on the server, this looks like the packs get somehow messed up on the client between DownloadAndHash and the ciphertext verification.

@allan: Does the error just show up randomly? The error log looks a bit like it might be the last blob in the pack that is damaged. Could you check with restic find --pack <packID> | grep "Found blob" | sort -u | wc -l how many blobs are in those packs (or rather are still referenced, I didn’t find a better command).

Is it possible that there was a concurrent backup process for which the lock file was deleted, or check was run with --no-lock ? This is what I’d expect to see if a pack file currently being uploaded is checked – it would appear truncated to the downloader.

It is not possible as no one else has access to this server. I also do not have another client backing up to it. The repo is dedicated to this client.

I hope to run another check --read-data tomorrow before and after the next backup.

Any chance the system running check has bad memory? Can you try running check on another system to eliminate that possibility?

Like MichaelEischer mentioned, blob checksums are only verified after pack file overall checksum was confirmed to be correct, which makes intermittent blob checksum validate errors all but impossible.

1 Like

Here is what I came up with:

$ while read pack;do echo -n "$pack = ";RESTIC_PASSWORD="foo" ./restic -r restic-repo --json find --pack $pack|jq 'unique_by(.id)|length';done < packlist

a342dce7 = 2
415064a5 = 3
de1b2836 = 3
7d56326d = 2
e57d8639 = 2
0248d350 = 3
f921b630 = 52
107d5652 = 3
8f6b23b3 = 2
41a04276 = 2

It looks like it was the last blob in packs a342dce7, de1b2836, 7d56326d, and 0248d350.

Also, the repo has not changed since the check --read-data as I haven’t made a backup. These blobs are the same as what was checked by the client.

Hmm, for at least 415064a5 and f921b630 it seems to be some blob somewhere in the middle of the pack. So I’m rather curious what the next check runs will return.

I currently have a check run started from another laptop on the network for @ifedorenko. It would be tomorrow until my laptop is back on the network to try from the backup client, unless I try it through a VPN connection. :grinning:

It is possible. Here is the output from another laptop on the network.

$ RESTIC_REPOSITORY="sftp:allan@server:restic-repo" RESTIC_PASSWORD="foo" ./restic check --read-data -vvv

using temporary cache in /tmp/restic-check-cache-099731024
repository aba320ff opened successfully, password is correct
created new cache in /tmp/restic-check-cache-099731024
create exclusive lock for repository
load indexes
check all packs
check snapshots, trees and blobs
read all data
[3:46:19] 100.00% 95615 / 95615 items
duration: 3:46:19
no errors were found

I ran a check --read-data first thing this morning and got no errors. I then ran a backup and got errors when a check was done immediately after. This time, I had debug turned on so we have a log.

@MichaelEischer may be onto something…the pack hashes match. I also checked the hash of the failed blobs and they match as well (at least the first 3 packs).

Could this be a bug?

Output From check --read-data

debug log file /home/allan/restic-debug.log
debug enabled
using temporary cache in /var/lib/restic/restic-check-cache-213960997
repository aba320ff opened successfully, password is correct
created new cache in /var/lib/restic/restic-check-cache-213960997
create exclusive lock for repository
load indexes
check all packs
check snapshots, trees and blobs
read all data
pack 86e38412 contains 1 errors: [blob 5: ciphertext verification failed]
pack 4b980e74 contains 1 errors: [blob 1: ciphertext verification failed]
pack 632ea1f2 contains 1 errors: [blob 2: ciphertext verification failed]
pack 2fee0db9 contains 1 errors: [blob 0: ciphertext verification failed]
pack a10b5806 contains 1 errors: [blob 1: ciphertext verification failed]
pack c944fa79 contains 1 errors: [blob 2: ciphertext verification failed]
pack 36be7659 contains 1 errors: [blob 2: ciphertext verification failed]
pack cf4866ab contains 1 errors: [blob 0: ciphertext verification failed]
pack 92354222 contains 1 errors: [blob 1: ciphertext verification failed]
pack 972130ab contains 1 errors: [blob 0: ciphertext verification failed]
[1:43:09] 100.00%  95916 / 95916 items
duration: 1:43:09
Fatal: repository contains errors

LOGS FOR FIRST 3 PACKS

Pack 86e38412

2020/02/03 14:25:44 checker/checker.go:700      checker.checkPack       77      checking pack 86e384127bc10ba3df8477e446a310c843955b12f129428316f50ceac0a9e92d
2020/02/03 14:25:44 cache/backend.go:196        cache.(*Backend).Load   77      Load(<data/86e384127b>, 0, 0): delegating to backend
2020/02/03 14:25:44 sftp/sftp.go:344    sftp.(*SFTP).openReader 77      Load <data/86e384127b>, length 0, offset 0
2020/02/03 14:25:44 checker/checker.go:713      checker.checkPack       77      hash for pack 86e384127bc10ba3df8477e446a310c843955b12f129428316f50ceac0a9e92d is 86e384127bc10ba3df8477e446a310c843955b12f129428316f50ceac0a9e92d
2020/02/03 14:25:44 pack/pack.go:239    pack.readHeader 77      size: 6511196
2020/02/03 14:25:44 pack/pack.go:208    pack.readRecords        77      header length: 254
2020/02/03 14:25:44 checker/checker.go:728      checker.checkPack       77        check blob 0: <Blob (data) 6f29cd51, offset 0, length 618882>
2020/02/03 14:25:44 checker/checker.go:728      checker.checkPack       77        check blob 1: <Blob (data) 1ca80402, offset 618882, length 525090>
2020/02/03 14:25:44 checker/checker.go:728      checker.checkPack       77        check blob 2: <Blob (data) decba22c, offset 1143972, length 524320>
2020/02/03 14:25:44 checker/checker.go:728      checker.checkPack       77        check blob 3: <Blob (data) 2ccc26fe, offset 1668292, length 1259095>
2020/02/03 14:25:44 checker/checker.go:728      checker.checkPack       77        check blob 4: <Blob (data) 7af1ccca, offset 2927387, length 641774>
2020/02/03 14:25:44 checker/checker.go:728      checker.checkPack       77        check blob 5: <Blob (data) 95432132, offset 3569161, length 2941777>
2020/02/03 14:25:44 checker/checker.go:751      checker.checkPack       77        error decrypting blob 954321329a61e715a4e1e5f362755df9e102e98d3a1d14d7920819fc01e22372: ciphertext verification failed

$ RESTIC_REPOSITORY="sftp:allan@server:restic-repo" RESTIC_PASSWORD="foo" restic cat blob 954321329a61e715a4e1e5f362755df9e102e98d3a1d14d7920819fc01e22372 | sha256sum
954321329a61e715a4e1e5f362755df9e102e98d3a1d14d7920819fc01e22372  -

Pack 4b980e74

2020/02/03 14:25:44 checker/checker.go:700      checker.checkPack       76      checking pack 4b980e74d2c6a7f55425e053ec2a2eb78bf96dd2d03e8497f834ce147c4ac9da
2020/02/03 14:25:44 cache/backend.go:196        cache.(*Backend).Load   76      Load(<data/4b980e74d2>, 0, 0): delegating to backend
2020/02/03 14:25:44 sftp/sftp.go:344    sftp.(*SFTP).openReader 76      Load <data/4b980e74d2>, length 0, offset 0
2020/02/03 14:25:45 checker/checker.go:713      checker.checkPack       76      hash for pack 4b980e74d2c6a7f55425e053ec2a2eb78bf96dd2d03e8497f834ce147c4ac9da is 4b980e74d2c6a7f55425e053ec2a2eb78bf96dd2d03e8497f834ce147c4ac9da
2020/02/03 14:25:45 pack/pack.go:239    pack.readHeader 76      size: 8471756
2020/02/03 14:25:45 pack/pack.go:208    pack.readRecords        76      header length: 106
2020/02/03 14:25:45 checker/checker.go:728      checker.checkPack       76        check blob 0: <Blob (data) 13d111dc, offset 0, length 1951135>
2020/02/03 14:25:45 checker/checker.go:728      checker.checkPack       76        check blob 1: <Blob (data) be6d1d30, offset 1951135, length 6520511>
2020/02/03 14:25:45 checker/checker.go:751      checker.checkPack       76        error decrypting blob be6d1d30fe2cbc0931a082ee492e9162d35b1cf798ade0ce0340779b7fa04edc: ciphertext verification failed


$ RESTIC_REPOSITORY="sftp:allan@server:restic-repo" RESTIC_PASSWORD="foo" restic cat blob be6d1d30fe2cbc0931a082ee492e9162d35b1cf798ade0ce0340779b7fa04edc | sha256sum
be6d1d30fe2cbc0931a082ee492e9162d35b1cf798ade0ce0340779b7fa04edc  -

Pack 632ea1f2

2020/02/03 14:26:10 checker/checker.go:700      checker.checkPack       78      checking pack 632ea1f26c12ecc7e026719dc6a1102061d9486c498d11bbee6ce562d52f21cb
2020/02/03 14:26:10 cache/backend.go:196        cache.(*Backend).Load   78      Load(<data/632ea1f26c>, 0, 0): delegating to backend
2020/02/03 14:26:10 sftp/sftp.go:344    sftp.(*SFTP).openReader 78      Load <data/632ea1f26c>, length 0, offset 0
2020/02/03 14:26:11 checker/checker.go:713      checker.checkPack       78      hash for pack 632ea1f26c12ecc7e026719dc6a1102061d9486c498d11bbee6ce562d52f21cb is 632ea1f26c12ecc7e026719dc6a1102061d9486c498d11bbee6ce562d52f21cb
2020/02/03 14:26:11 pack/pack.go:239    pack.readHeader 78      size: 7795453
2020/02/03 14:26:11 pack/pack.go:208    pack.readRecords        78      header length: 143
2020/02/03 14:26:11 checker/checker.go:728      checker.checkPack       78        check blob 0: <Blob (data) f58ac7ca, offset 0, length 990379>
2020/02/03 14:26:11 checker/checker.go:728      checker.checkPack       78        check blob 1: <Blob (data) fc292485, offset 990379, length 687993>
2020/02/03 14:26:11 checker/checker.go:728      checker.checkPack       78        check blob 2: <Blob (data) 5032b619, offset 1678372, length 6116934>
2020/02/03 14:26:11 checker/checker.go:751      checker.checkPack       78        error decrypting blob 5032b619424a7459a36aaeca386fe41bb49abaf4cc9b689e24fd5dd35fa8dc60: ciphertext verification failed

$ RESTIC_REPOSITORY="sftp:allan@server:restic-repo" RESTIC_PASSWORD="foo" restic cat blob 5032b619424a7459a36aaeca386fe41bb49abaf4cc9b689e24fd5dd35fa8dc60 | sha256sum
5032b619424a7459a36aaeca386fe41bb49abaf4cc9b689e24fd5dd35fa8dc60  -

This sounds worrying, could you run the check command a second time to make sure it is not some sort of temporary failure?

Here are the log entries for the same 3 packs from today’s run. All checked out fine.

Output From check --read-data

debug log file /home/allan/restic-debug.log
debug enabled
using temporary cache in /var/lib/restic/restic-check-cache-290365165
repository aba320ff opened successfully, password is correct
created new cache in /var/lib/restic/restic-check-cache-290365165
create exclusive lock for repository
load indexes
check all packs
check snapshots, trees and blobs
read all data
[1:29:45] 100.00%  95916 / 95916 items
duration: 1:29:45
no errors were found

Pack 86e38412

2020/02/05 10:47:14 checker/checker.go:700      checker.checkPack       95      checking pack 86e384127bc10ba3df8477e446a310c843955b12f129428316f50ceac0a9e92d
2020/02/05 10:47:14 cache/backend.go:196        cache.(*Backend).Load   95      Load(<data/86e384127b>, 0, 0): delegating to backend
2020/02/05 10:47:14 sftp/sftp.go:344    sftp.(*SFTP).openReader 95      Load <data/86e384127b>, length 0, offset 0
2020/02/05 10:47:14 checker/checker.go:713      checker.checkPack       95      hash for pack 86e384127bc10ba3df8477e446a310c843955b12f129428316f50ceac0a9e92d is 86e384127bc10ba3df8477e446a310c843955b12f129428316f50ceac0a9e92d
2020/02/05 10:47:14 pack/pack.go:239    pack.readHeader 95      size: 6511196
2020/02/05 10:47:14 pack/pack.go:208    pack.readRecords        95      header length: 254
2020/02/05 10:47:14 checker/checker.go:728      checker.checkPack       95        check blob 0: <Blob (data) 6f29cd51, offset 0, length 618882>
2020/02/05 10:47:14 checker/checker.go:728      checker.checkPack       95        check blob 1: <Blob (data) 1ca80402, offset 618882, length 525090>
2020/02/05 10:47:14 checker/checker.go:728      checker.checkPack       95        check blob 2: <Blob (data) decba22c, offset 1143972, length 524320>
2020/02/05 10:47:14 checker/checker.go:728      checker.checkPack       95        check blob 3: <Blob (data) 2ccc26fe, offset 1668292, length 1259095>
2020/02/05 10:47:15 checker/checker.go:728      checker.checkPack       95        check blob 4: <Blob (data) 7af1ccca, offset 2927387, length 641774>
2020/02/05 10:47:15 checker/checker.go:728      checker.checkPack       95        check blob 5: <Blob (data) 95432132, offset 3569161, length 2941777>

Pack 4b980e74

2020/02/05 10:11:58 checker/checker.go:700      checker.checkPack       93      checking pack 4b980e74d2c6a7f55425e053ec2a2eb78bf96dd2d03e8497f834ce147c4ac9da
2020/02/05 10:11:58 cache/backend.go:196        cache.(*Backend).Load   93      Load(<data/4b980e74d2>, 0, 0): delegating to backend
2020/02/05 10:11:58 sftp/sftp.go:344    sftp.(*SFTP).openReader 93      Load <data/4b980e74d2>, length 0, offset 0
2020/02/05 10:11:59 checker/checker.go:713      checker.checkPack       93      hash for pack 4b980e74d2c6a7f55425e053ec2a2eb78bf96dd2d03e8497f834ce147c4ac9da is 4b980e74d2c6a7f55425e053ec2a2eb78bf96dd2d03e8497f834ce147c4ac9da
2020/02/05 10:11:59 pack/pack.go:239    pack.readHeader 93      size: 8471756
2020/02/05 10:11:59 pack/pack.go:208    pack.readRecords        93      header length: 106
2020/02/05 10:11:59 checker/checker.go:728      checker.checkPack       93        check blob 0: <Blob (data) 13d111dc, offset 0, length 1951135>
2020/02/05 10:11:59 checker/checker.go:728      checker.checkPack       93        check blob 1: <Blob (data) be6d1d30, offset 1951135, length 6520511>

Pack 632ea1f2

2020/02/05 09:29:55 checker/checker.go:700      checker.checkPack       95      checking pack 632ea1f26c12ecc7e026719dc6a1102061d9486c498d11bbee6ce562d52f21cb
2020/02/05 09:29:55 cache/backend.go:196        cache.(*Backend).Load   95      Load(<data/632ea1f26c>, 0, 0): delegating to backend
2020/02/05 09:29:55 sftp/sftp.go:344    sftp.(*SFTP).openReader 95      Load <data/632ea1f26c>, length 0, offset 0
2020/02/05 09:29:55 checker/checker.go:713      checker.checkPack       95      hash for pack 632ea1f26c12ecc7e026719dc6a1102061d9486c498d11bbee6ce562d52f21cb is 632ea1f26c12ecc7e026719dc6a1102061d9486c498d11bbee6ce562d52f21cb
2020/02/05 09:29:55 pack/pack.go:239    pack.readHeader 95      size: 7795453
2020/02/05 09:29:55 pack/pack.go:208    pack.readRecords        95      header length: 143
2020/02/05 09:29:55 checker/checker.go:728      checker.checkPack       95        check blob 0: <Blob (data) f58ac7ca, offset 0, length 990379>
2020/02/05 09:29:55 checker/checker.go:728      checker.checkPack       95        check blob 1: <Blob (data) fc292485, offset 990379, length 687993>
2020/02/05 09:29:55 checker/checker.go:728      checker.checkPack       95        check blob 2: <Blob (data) 5032b619, offset 1678372, length 6116934>

I also noticed the problem packs were all checked within 1 minute.

2020/02/03 14:25:44 checker/checker.go:700	checker.checkPack	77	checking pack 86e384127bc10ba3df8477e446a310c843955b12f129428316f50ceac0a9e92d
2020/02/03 14:25:44 checker/checker.go:700	checker.checkPack	76	checking pack 4b980e74d2c6a7f55425e053ec2a2eb78bf96dd2d03e8497f834ce147c4ac9da
2020/02/03 14:26:10 checker/checker.go:700	checker.checkPack	78	checking pack 632ea1f26c12ecc7e026719dc6a1102061d9486c498d11bbee6ce562d52f21cb
2020/02/03 14:26:13 checker/checker.go:700	checker.checkPack	75	checking pack 2fee0db9d4c22d88b9a106773e7954afa1f8a7b7183f1aae6a4c5ee6ed967ac3
2020/02/03 14:26:17 checker/checker.go:700	checker.checkPack	78	checking pack a10b5806634f558280510e409fd0008df763de2f19f36d992f407ae64064e8a0
2020/02/03 14:26:18 checker/checker.go:700	checker.checkPack	78	checking pack c944fa79bc16c1eeab700d2ee934b0c44a36b2740cb75409cdf3762c2936a5cb
2020/02/03 14:26:21 checker/checker.go:700	checker.checkPack	78	checking pack 36be7659858ee77f2b51cfdd3ea1d12a0a77e73912459ae77eb1b112d1288073
2020/02/03 14:26:22 checker/checker.go:700	checker.checkPack	75	checking pack cf4866ab3c371c8473e112515da6b74b38482a14434d5429055d7e1e8818dcf0
2020/02/03 14:26:29 checker/checker.go:700	checker.checkPack	74	checking pack 92354222f6590d4a04b9d1fc4cba5db4dfef61586b0ff801ce0f421f1ab06023
2020/02/03 14:26:36 checker/checker.go:700	checker.checkPack	74	checking pack 972130ab8e41b26069543eb53a7125042a4cf15be915aa69eaac6094a3afdbd3

Hmm, the second log claims that the pack files are fine. In my opinion this leaves two possibilities where the check error might come from: Either your system has some bit flips when under high load or reading the pack file from the temporary file after download messes things up. The latter seems to be rather unlikely to me as the reads should be served from the page cache. Could you use something like prime95 to stress test your system?

The checkPack implementation in restic looks like straight forward single-threaded code to me and when trying the go race detector I didn’t get any warnings. That seems to rule out race conditions as the error source.

I’d also recommend running memtest86+ as memory errors could also cause these kinds of check errors.

1 Like

This is the output from a 3-hour run of mprime. There were no errors during that time and the CPU throttled down as expected.

[Wed Feb  5 19:49:56 2020]
Self-test 140K passed!
Self-test 140K passed!
Self-test 140K passed!
Self-test 140K passed!
Self-test 140K passed!
Self-test 140K passed!
Self-test 140K passed!
Self-test 140K passed!
[Wed Feb  5 19:57:45 2020]
Self-test 4K passed!
Self-test 4K passed!
Self-test 4K passed!
Self-test 4K passed!
Self-test 4K passed!
Self-test 4K passed!
Self-test 4K passed!
Self-test 4K passed!
[Wed Feb  5 20:05:22 2020]
Self-test 144K passed!
Self-test 144K passed!
Self-test 144K passed!
Self-test 144K passed!
Self-test 144K passed!
Self-test 144K passed!
Self-test 144K passed!
Self-test 144K passed!
[Wed Feb  5 20:14:22 2020]
Self-test 160K passed!
Self-test 160K passed!
Self-test 160K passed!
Self-test 160K passed!
Self-test 160K passed!
Self-test 160K passed!
Self-test 160K passed!
Self-test 160K passed!
[Wed Feb  5 20:23:02 2020]
Self-test 5K passed!
Self-test 5K passed!
Self-test 5K passed!
Self-test 5K passed!
Self-test 5K passed!
Self-test 5K passed!
Self-test 5K passed!
Self-test 5K passed!
[Wed Feb  5 20:30:32 2020]
Self-test 168K passed!
Self-test 168K passed!
Self-test 168K passed!
Self-test 168K passed!
Self-test 168K passed!
Self-test 168K passed!
Self-test 168K passed!
Self-test 168K passed!
[Wed Feb  5 20:39:26 2020]
Self-test 192K passed!
Self-test 192K passed!
Self-test 192K passed!
Self-test 192K passed!
Self-test 192K passed!
Self-test 192K passed!
Self-test 192K passed!
Self-test 192K passed!
[Wed Feb  5 20:48:18 2020]
Self-test 6K passed!
Self-test 6K passed!
Self-test 6K passed!
Self-test 6K passed!
Self-test 6K passed!
Self-test 6K passed!
Self-test 6K passed!
Self-test 6K passed!
[Wed Feb  5 20:55:50 2020]
Self-test 200K passed!
Self-test 200K passed!
Self-test 200K passed!
Self-test 200K passed!
Self-test 200K passed!
Self-test 200K passed!
Self-test 200K passed!
Self-test 200K passed!
[Wed Feb  5 21:04:44 2020]
Self-test 224K passed!
Self-test 224K passed!
Self-test 224K passed!
Self-test 224K passed!
Self-test 224K passed!
Self-test 224K passed!
Self-test 224K passed!
Self-test 224K passed!
[Wed Feb  5 21:12:15 2020]
Self-test 8K passed!
Self-test 8K passed!
Self-test 8K passed!
Self-test 8K passed!
Self-test 8K passed!
Self-test 8K passed!
Self-test 8K passed!
Self-test 8K passed!
[Wed Feb  5 21:20:03 2020]
Self-test 240K passed!
Self-test 240K passed!
Self-test 240K passed!
Self-test 240K passed!
Self-test 240K passed!
Self-test 240K passed!
Self-test 240K passed!
Self-test 240K passed!
[Wed Feb  5 21:29:15 2020]
Self-test 256K passed!
Self-test 256K passed!
Self-test 256K passed!
Self-test 256K passed!
Self-test 256K passed!
Self-test 256K passed!
Self-test 256K passed!
Self-test 256K passed!
[Wed Feb  5 21:37:27 2020]
Self-test 288K passed!
Self-test 288K passed!
Self-test 288K passed!
Self-test 288K passed!
Self-test 288K passed!
Self-test 288K passed!
Self-test 288K passed!
Self-test 288K passed!
[Wed Feb  5 21:44:29 2020]
Self-test 10K passed!
Self-test 10K passed!
Self-test 10K passed!
Self-test 10K passed!
Self-test 10K passed!
Self-test 10K passed!
Self-test 10K passed!
Self-test 10K passed!
[Wed Feb  5 21:52:37 2020]
Self-test 320K passed!
Self-test 320K passed!
Self-test 320K passed!
Self-test 320K passed!
Self-test 320K passed!
Self-test 320K passed!
Self-test 320K passed!
Self-test 320K passed!
[Wed Feb  5 22:00:51 2020]
Self-test 336K passed!
Self-test 336K passed!
Self-test 336K passed!
Self-test 336K passed!
Self-test 336K passed!
Self-test 336K passed!
Self-test 336K passed!
Self-test 336K passed!
[Wed Feb  5 22:07:25 2020]
Self-test 12K passed!
Self-test 12K passed!
Self-test 12K passed!
Self-test 12K passed!
Self-test 12K passed!
Self-test 12K passed!
Self-test 12K passed!
Self-test 12K passed!
[Wed Feb  5 22:15:38 2020]
Self-test 384K passed!
Self-test 384K passed!
Self-test 384K passed!
Self-test 384K passed!
Self-test 384K passed!
Self-test 384K passed!
Self-test 384K passed!
Self-test 384K passed!
[Wed Feb  5 22:23:43 2020]
Self-test 400K passed!
Self-test 400K passed!
Self-test 400K passed!
Self-test 400K passed!
Self-test 400K passed!
Self-test 400K passed!
Self-test 400K passed!
Self-test 400K passed!
[Wed Feb  5 22:30:24 2020]
Self-test 15K passed!
Self-test 15K passed!
Self-test 15K passed!
Self-test 15K passed!
Self-test 15K passed!
Self-test 15K passed!
Self-test 15K passed!
Self-test 15K passed!
[Wed Feb  5 22:38:09 2020]
Self-test 448K passed!
Self-test 448K passed!
Self-test 448K passed!
Self-test 448K passed!
Self-test 448K passed!
Self-test 448K passed!
Self-test 448K passed!
Self-test 448K passed!
[Wed Feb  5 22:46:16 2020]
Self-test 480K passed!
Self-test 480K passed!
Self-test 480K passed!
Self-test 480K passed!
Self-test 480K passed!
Self-test 480K passed!
Self-test 480K passed!
Self-test 480K passed!
[Wed Feb  5 22:53:07 2020]
Self-test 16K passed!
Self-test 16K passed!
Self-test 16K passed!
Self-test 16K passed!
Self-test 16K passed!
Self-test 16K passed!
Self-test 16K passed!
Self-test 16K passed!
[Wed Feb  5 23:00:20 2020]
Self-test 512K passed!
Self-test 512K passed!
Self-test 512K passed!
Self-test 512K passed!
Self-test 512K passed!
Self-test 512K passed!

Thank you @cdhowie and @MichaelEischer! It was bad memory! I’ve been chasing after this for several months now and thought it was the connection. It’s a relief to finally have something definitive.

4 Likes

Thanks for following up! I’m moving this to the “getting help” section since redownloading packs wouldn’t’ve helped here anyway – and I’m not sure they ever would if data errors are detected. I believe files will be redownloaded if there is a transport level error, but once restic has the data, there is little reason to redownload as basically any transport in use with restic will ensure integrity of the data stream.

@cdhowie it looks like the solution is more elusive than we thought.

I decided to rerun memtest86+ but use the program defaults this time – SMP disabled. After completing 3 passes over 6 hours, it reported no memory errors. There is a Red Hat Bugzilla that reported false positives if SMP is enabled.

To double-check, I ran PassMark MemTest86. Their defaults has multi-CPU support. It also reported no memory errors after 3.5 hours.

Can you or @MichaelEischer suggest something else I can try?

@allan I like your avatar picture. That’s always something.

1 Like

Sorry for taking a long time to reply. I’ve hacked together a branch that checks packs that failed ciphertext verification a second time: https://github.com/MichaelEischer/restic/tree/strange-check-errors
You might want to give it a try.

No worries. It takes me some time to respond as well. :grinning: I’ll check out your branch, but I noticed that running the check with --no-cache option seems to be a fix. At least it has not happened in the past 4 checks.

@MichaelEischer, I have been running the check command from your branch regularly and I am waiting for the issue to occur again. So far no dice, and this is with the original memory too. I’ll report back when it happens again and I have additional logs.

1 Like