StreamPack: ciphertext verification failed

Today I ran my daily backup on my laptop, and after a prune operation I see this error:

root@internal:~# restic -r s3:<my-bucket> prune --password-file "/my-secret.txt"
repository 02c1c5d6 opened (version 2, compression level auto)
loading indexes...
[0:00] 100.00%  2 / 2 index files loaded
loading all snapshots...
finding data that is still in use for 10 snapshots
[0:01] 100.00%  10 / 10 snapshots
searching used packs...
collecting packs for deletion and repacking
[0:02] 100.00%  2998 / 2998 packs processed

to repack:         20155 blobs / 41.386 MiB
this removes:        280 blobs / 30.108 MiB
to delete:           179 blobs / 430.218 MiB
total prune:         459 blobs / 460.325 MiB
not yet compressed:              15.412 GiB
remaining:         97254 blobs / 15.413 GiB
unused size after prune: 788.247 MiB (4.99% of remaining size)

repacking packs
[0:02] 50.00%  3 / 6 packs repacked
Fatal: StreamPack: ciphertext verification failed
root@internal:~#

Apparently there were no errors in the forget or backup commands.
Right after this error, I ran a check of the data, and this is the result:

root@internal:~# restic -r s3:<my-bucket> check --read-data --password-file "/my-secret.txt"
using temporary cache in /tmp/restic-check-cache-359632852
repository 02c1c5d6 opened (version 2, compression level auto)
created new cache in /tmp/restic-check-cache-359632852
create exclusive lock for repository
load indexes
[0:06] 100.00%  2 / 2 index files loaded
check all packs
check snapshots, trees and blobs
[0:18] 100.00%  10 / 10 snapshots
read all data
[40:51] 100.00%  2998 / 2998 packs
no errors were found
root@internal:~# 

My restic version (installed manually from the GitHub releases) is:

root@internal:~# restic version
restic 0.16.4 compiled with go1.21.6 on linux/amd64

Any idea how I can proceed?
Thank you

As check --read-data states that your repository is intact, the prune error was likely a temporary issue (possibly a hardware bit flip). I assume that if you run prune again, the error does not show up again?

You could run a CPU / memory stress test on your laptop to make sure that the hardware is working reliably.
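For example, a minimal sketch of such a test; it assumes the stress-ng and memtester packages are installed (neither tool is mentioned elsewhere in this thread):

# Hammer all CPU cores for 10 minutes and verify the computation results.
stress-ng --cpu 0 --cpu-method all --verify --timeout 10m

# Walk over 2 GiB of RAM with 3 full test passes (run with enough free memory).
memtester 2048M 3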

Nope, today after a new backup, forget, and prune, I see the same error.

repository 02c1c5d6 opened (version 2, compression level auto)
loading indexes...
[0:00] 100.00%  3 / 3 index files loaded
loading all snapshots...
finding data that is still in use for 10 snapshots
[0:01] 100.00%  10 / 10 snapshots
searching used packs...
collecting packs for deletion and repacking
[0:02] 100.00%  3004 / 3004 packs processed
to repack:         20417 blobs / 167.601 MiB
this removes:        715 blobs / 153.993 MiB
to delete:           322 blobs / 901.810 MiB
total prune:        1037 blobs / 1.031 GiB
not yet compressed:              15.289 GiB
remaining:         97295 blobs / 14.897 GiB
unused size after prune: 758.133 MiB (4.97% of remaining size)
repacking packs
[0:01] 8.00%  2 / 25 packs repacked
Fatal: StreamPack: ciphertext verification failed

How can I run a test on the CPU?

Hmm, that error could originate from the same pack file as before. Unfortunately, the error message does not include the affected pack file (this will be fixed by https://github.com/restic/restic/pull/4701). Please run restic with the environment variable DEBUG_LOG=restic.log. The relevant part of the log starts with the first line containing repository.streamPackPart.
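For instance (a sketch; the log path is illustrative):

DEBUG_LOG=/root/restic.log restic -r s3:<my-bucket> prune --password-file "/my-secret.txt"
# then find the relevant section:
grep -n "repository.streamPackPart" /root/restic.log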

What still doesn’t make sense to me is that check --read-data did not detect any data corruption. So let’s try the following: look up the location of the cache folder used by restic (see restic cache) and rename that folder. Does prune work afterwards? (It should still yield the same error as before, but let’s give it a try.)
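Something along these lines should do it (a sketch; on Linux the default cache directory is typically /root/.cache/restic when running as root):

restic cache                                    # list the cache directories restic knows about
mv /root/.cache/restic /root/.cache/restic.bak  # rename rather than delete, so it can be restored
restic -r s3:<my-bucket> prune --password-file "/my-secret.txt"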

I usually use prime95, which has a stress-test mode.

2024/02/19 10:59:11 repository/repack.go:29     repository.Repack       1       repacking 63 packs while keeping 21394 blobs
2024/02/19 10:59:11 repository/repository.go:990        repository.streamPackPart       288     streaming pack 006078b13af29db699f0923e42f306bd69b2b3e716b92069>
2024/02/19 10:59:11 cache/file.go:38    cache.(*Cache).load     288     Load(<data/006078b13a>, 2168579, 865660) from cache
2024/02/19 10:59:11 cache/backend.go:172        cache.(*Backend).Load   288     error loading <data/006078b13a> from cache: open /root/.cache/restic/02c1c5d679>
2024/02/19 10:59:11 cache/backend.go:176        cache.(*Backend).Load   288     Load(<data/006078b13a>, 2168579, 865660): delegating to backend
2024/02/19 10:59:11 logger/log.go:45    logger.(*Backend).Load  288     Load(<data/006078b13a>, length 2168579, offset 865660)
2024/02/19 10:59:11 sema/semaphore.go:27        sema.semaphore.GetToken 288     acquired token
2024/02/19 10:59:11 debug/round_tripper.go:93   debug.loggingRoundTripper.RoundTrip     288     ------------  HTTP REQUEST -----------
GET /internal/data/00/006078b13af29db699f0923e42f306bd69b2b3e716b9206978c69e14d6574095 HTTP/1.1
Host: <host>
User-Agent: MinIO (linux; amd64) minio-go/v7.0.66
Authorization: **redacted**
Range: bytes=865660-3034238
X-Amz-Content-Sha256: e3b0c44298fc1c14<cutted>991b7852b855
X-Amz-Date: 20240219T095911Z

I have also removed the cache folder and redone the check, but after a new backup, forget, and prune (with the same error), no errors were found:

root@internal:~# restic -r s3:<host> check --read-data --password-file "/my-secret.txt"
using temporary cache in /tmp/restic-check-cache-3203237224
repository 02c1c5d6 opened (version 2, compression level auto)
created new cache in /tmp/restic-check-cache-3203237224
create exclusive lock for repository
load indexes
[0:01] 100.00%  5 / 5 index files loaded
check all packs
check snapshots, trees and blobs
[0:03] 100.00%  10 / 10 snapshots
read all data
[16:16] 100.00%  3013 / 3013 packs
no errors were found

Oh, another thing that might be useful for understanding the cause:
the machine is the same one discussed in this post on GitHub:

This all looks like a problem with your S3 storage. Which provider are you using?

I am using cubbit.io

Today I ran another prune with the debug flag enabled, and this is the output:

root@internal:~# restic -r s3:<my-bucket> prune --password-file "/my-secret.txt"
debug log file /root/.cache/restic.log
debug enabled
repository 02c1c5d6 opened (version 2, compression level auto)
loading indexes...
[0:00] 100.00%  7 / 7 index files loaded
loading all snapshots...
finding data that is still in use for 10 snapshots
[0:06] 100.00%  10 / 10 snapshots
searching used packs...
collecting packs for deletion and repacking
[0:01] 100.00%  3025 / 3025 packs processed

to repack:         23743 blobs / 445.035 MiB
this removes:       2262 blobs / 407.576 MiB
to delete:           951 blobs / 2.006 GiB
total prune:        3213 blobs / 2.404 GiB
not yet compressed:              15.022 GiB
remaining:         97782 blobs / 13.769 GiB
unused size after prune: 704.515 MiB (5.00% of remaining size)

repacking packs
body not drained, 0 bytes not read
body not drained, 2071024 bytes not read, body: "2\x86\xa5ĕ\xfcM+\xb6\v\xea5Qsf\x96\x1d.\x111..."
body not drained, 0 bytes not read, error: context canceled
body not drained, 501248 bytes not read, body: "AwU\xe0\xca\xfe\x13ɗ<\xa8\xb6\x1e\x06H\x16~\x83\xf1\xd4..."
body not drained, 954368 bytes not read, error: context canceled, body: "\xf6\xccN\r\xb0\r't\x15\xc8\xc32\xc6\xc1\xdcY\xdeEt\xc4..."
[0:05] 3.95%  3 / 76 packs repacked
Fatal: StreamPack: ciphertext verification failed

As check --read-data does not report errors but prune does, let’s dig into the main difference here: check downloads the full file, whereas prune only downloads the relevant section (see the Range header). Apparently that S3 storage is unable to correctly handle the Range header.
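A quick way to probe the Range handling directly (a sketch, assuming the AWS CLI is configured for the provider's endpoint; object name, offsets, and the Range value are taken from the debug log above):

BUCKET=<my-bucket>
KEY=data/00/006078b13af29db699f0923e42f306bd69b2b3e716b9206978c69e14d6574095
EP=https://s3.cubbit.eu   # provider endpoint (assumed)

# Download the whole object, then cut the same byte range out of it locally.
aws --endpoint-url "$EP" s3api get-object --bucket "$BUCKET" --key "$KEY" full.bin
tail -c +865661 full.bin | head -c 2168579 > expected.bin

# Ranged download of the identical span (the exact Range header from the log).
aws --endpoint-url "$EP" s3api get-object --bucket "$BUCKET" --key "$KEY" \
  --range "bytes=865660-3034238" ranged.bin

# Any difference means the ranged request returned wrong data.
cmp expected.bin ranged.bin && echo "Range OK" || echo "Range BROKEN"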

To see whether that suspicion is warranted, please download the whole repository to a local folder and then run prune on that copy. If that works, then this is clearly a problem of the S3 storage.
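For example (a sketch; assumes an rclone remote named remote is configured for the same bucket, and that there is enough local disk space):

rclone sync remote:<my-bucket> /srv/restic-copy
restic -r /srv/restic-copy prune --password-file "/my-secret.txt"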

If that S3 storage is indeed unable to correctly handle the Range header, that would also perfectly explain the errors from the GitHub issue. The pack header is the last part of a file, so restic only retrieves that part. Apparently, the returned data is either the wrong part of the file or just garbage, and hence triggers the "incomplete pack file" error in older restic versions.
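One more probe in the same spirit (a sketch, reusing the variables from the example above; per the S3/HTTP spec, a compliant service must answer a suffix range like bytes=-N with exactly the last N bytes of the object):

# Fetch the last 512 bytes via a suffix Range request...
aws --endpoint-url "$EP" s3api get-object --bucket "$BUCKET" --key "$KEY" \
  --range "bytes=-512" tail.bin

# ...and compare them against the tail of the full download.
tail -c 512 full.bin > expected-tail.bin
cmp expected-tail.bin tail.bin && echo "suffix Range OK" || echo "suffix Range BROKEN"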

Based on the other errors in the GitHub issue, I’d start worrying whether you’ll actually be able to restore your data if you need to.

I have downloaded all the bucket content to a local folder with rclone.
All seems to work fine, the prune command too :upside_down_face:

Can this problem with my provider be resolved in any way?
Does restic have some other way to handle prune?

I think you should open a ticket with your provider. You are paying for an S3-compatible product, and it seems it is not. Maybe it's some misconfiguration on their end, or load/performance issues.

A quick Google search tells me that it is better to stay away from them, e.g.:

So you get what you pay for, I am afraid.


I will open a ticket with Cubbit and figure out how to proceed.
Thank you for now; I will update this topic when I know more.

prune already contains several fallback mechanisms to try to handle limited amounts of corrupted data. But if the storage backend returns garbage, then that is a bug in the storage provider, not in restic. Thus, this can only be fixed by the storage provider. We're not going to introduce workarounds for weirdly broken storage providers.

I agree with you! My intention wasn't to suggest introducing a workaround.
I am sorry; I meant to ask whether the prune command already has some parameter to avoid using this header.

Anyway, I have contacted the provider, and they assure me that the Range header is supported :thinking:
I will try to dig deeper into the matter.

If Range is supported, what else can go wrong?

Well, the problem is not that the Range header is not supported :) It happily accepts the request, but returns wrong data.

I use restic myself with a couple of S3 providers, with no issues at all.

Now, as I understand from your post, it used to work in the past, so something changed. If not the restic version, then something on your provider's side. If you run daily backups, you know when it stopped working; maybe that will give them some clue, as it could be that around that date they updated some software that has a bug.

Hi all,

I have some updates!
I have configured a new repo on a new machine, in a new bucket (still on Cubbit). All the operations work great.

Here you can find a log of a prune operation.

root@services:~# restic -r s3:s3.cubbit.eu/<other-bucket> prune --password-file "/the-secret.txt"
repository e724f8be opened (version 2, compression level auto)
loading indexes...
[0:00] 100.00%  1 / 1 index files loaded
loading all snapshots...
finding data that is still in use for 9 snapshots
[0:00] 100.00%  9 / 9 snapshots
searching used packs...
collecting packs for deletion and repacking
[0:00] 100.00%  57 / 57 packs processed
to repack:          1825 blobs / 1.731 MiB
this removes:         39 blobs / 59.296 KiB
to delete:             0 blobs / 0 B
total prune:          39 blobs / 59.296 KiB
remaining:         10182 blobs / 809.536 MiB
unused size after prune: 59.133 KiB (0.01% of remaining size)
repacking packs
[0:01] 100.00%  3 / 3 packs repacked
rebuilding index
[0:02] 100.00%  56 / 56 packs processed
deleting obsolete index files
[0:00] 100.00%  1 / 1 files deleted
removing 3 old packs
[0:03] 100.00%  3 / 3 files deleted
done
root@services:~#

It worked in my last post because, after I contacted Cubbit support via email, they immediately released a fix for the bug described in this post (I also sent them the link to this thread).
Just yesterday they sent me an email confirming that the bug has been permanently fixed.
I thank everyone for their willingness to support me <3
