"Fatal: repository contains errors" when running `restic check`

I run a nightly backup, forget, prune, and check, and I have logs from several previous nights' runs. Last night's script ran fine, as far as I can tell.
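For context, the nightly job is the usual backup → forget/prune → check sequence. A minimal sketch of that kind of script (the paths shown are illustrative placeholders, not my exact script):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Repo location and credentials come from the environment (.bashrc).
# Paths below are placeholders for the real backup set.
restic backup /mnt/cstor /opt

# Keep the 15 most recent snapshots and prune unreferenced data in one pass.
restic forget --keep-last 15 --prune

# Verify repository structure (metadata only; --read-data would fetch packs too).
restic check
```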

In the logs below I have replaced my name and bucket name using find/replace for posting here.

  • Restic version: 0.16.4
  • Storage backend: Wasabi
  • OS: ZorinOS 17.1 Core and Ubuntu Server 22.04.4 LTS
  • Restic repo and credentials in .bashrc
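For reference, the repo/credential setup in .bashrc amounts to the standard restic S3 environment variables (bucket name and key values below are placeholders):

```shell
# ~/.bashrc -- restic repository and Wasabi (S3-compatible) credentials
export RESTIC_REPOSITORY="s3:https://s3.wasabisys.com/BUCKET-NAME-HERE"
export RESTIC_PASSWORD="..."           # or RESTIC_PASSWORD_FILE=/path/to/file
export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
```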

Log from last night’s backup run (from server):

using parent snapshot c49f891d

Files:        1151 new,  2526 changed, 311558 unmodified
Dirs:            8 new,   315 changed, 135865 unmodified
Added to the repository: 2.811 GiB (2.651 GiB stored)

processed 315235 files, 14.055 TiB in 8:23
snapshot f4ee96f9 saved
Applying Policy: keep 15 latest snapshots
keep 1 snapshots:
ID        Time                 Host          Tags        Reasons        Paths
---------------------------------------------------------------------------------------
a61c96c5  2024-03-19 16:34:04  NAME-desktop              last snapshot  /media/NAME/bak
---------------------------------------------------------------------------------------
1 snapshots

keep 15 snapshots:
ID        Time                 Host        Tags        Reasons        Paths
----------------------------------------------------------------------------------------
97077296  2024-03-16 02:00:02  media                   last snapshot  /mnt/cstor
                                                                      /opt/gitea
                                                                      /opt/mariadb
                                                                      /opt/mealie
                                                                      /opt/microsoft
                                                                      /opt/nextcloud
                                                                      /opt/paperless-ngx
                                                                      /opt/plex
                                                                      /opt/qbittorrent
                                                                      /opt/radarr
                                                                      /opt/scrutiny
                                                                      /opt/sonarr
                                                                      /opt/tautulli

[SNIPPED FOR CHARACTER LIMIT]

----------------------------------------------------------------------------------------
15 snapshots

remove 1 snapshots:
ID        Time                 Host        Tags        Paths
-------------------------------------------------------------------------
1ac7071f  2024-03-15 02:00:02  media                   /mnt/cstor
                                                       /opt/gitea
                                                       /opt/mariadb
                                                       /opt/mealie
                                                       /opt/microsoft
                                                       /opt/nextcloud
                                                       /opt/paperless-ngx
                                                       /opt/plex
                                                       /opt/qbittorrent
                                                       /opt/radarr
                                                       /opt/scrutiny
                                                       /opt/sonarr
                                                       /opt/tautulli
-------------------------------------------------------------------------
1 snapshots

[0:00] 100.00%  1 / 1 files deleted

loading indexes...
loading all snapshots...
finding data that is still in use for 16 snapshots
[0:30] 100.00%  16 / 16 snapshots

searching used packs...
collecting packs for deletion and repacking
[1:19] 100.00%  858946 / 858946 packs processed


to repack:         70192 blobs / 843.935 MiB
this removes:       2834 blobs / 46.004 MiB
to delete:           112 blobs / 168.654 MiB
total prune:        2946 blobs / 214.658 MiB
remaining:      11335090 blobs / 13.830 TiB
unused size after prune: 0 B (0.00% of remaining size)

repacking packs
[3:00] 100.00%  52 / 52 packs repacked

rebuilding index
[1:38] 100.00%  858933 / 858933 packs processed

deleting obsolete index files
[0:01] 100.00%  77 / 77 files deleted

removing 62 old packs
[0:01] 100.00%  62 / 62 files deleted

done
using temporary cache in /tmp/restic-check-cache-2775554048
create exclusive lock for repository
load indexes
check all packs
check snapshots, trees and blobs
[1:42] 100.00%  16 / 16 snapshots

no errors were found

Today, I went to try out the `rewrite` command for some files/paths I didn't want in the repo anymore, but got this when I tried to `ls` a snapshot (the same happens on both desktop and server):

Load(<index/9946adf5da>, 0, 0) returned error, retrying after 439.785859ms: We encountered an internal error.  Please retry the operation again later.
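For what it's worth, the rewrite I was attempting was along these lines (the excluded path is a placeholder, not necessarily what I actually ran):

```shell
# Rewrite existing snapshots without the unwanted path; --forget removes
# the original snapshots once the rewritten replacements are created.
restic rewrite --exclude /opt/qbittorrent --forget
```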

Wondering what was wrong, I tried running `restic check -vvv`, which outputs:

using temporary cache in /tmp/restic-check-cache-4098491763
repository a5e65b45 opened (version 2, compression level auto)
created new cache in /tmp/restic-check-cache-4098491763
create exclusive lock for repository
load indexes
[0:49] 100.00%  76 / 76 index files loaded
check all packs
check snapshots, trees and blobs
Load(<data/703ad7188e>, 0, 0) returned error, retrying after 647.683028ms: We encountered an internal error.  Please retry the operation again later.
Load(<data/703ad7188e>, 0, 0) returned error, retrying after 891.34287ms: We encountered an internal error.  Please retry the operation again later.
Load(<data/703ad7188e>, 0, 0) returned error, retrying after 855.816287ms: We encountered an internal error.  Please retry the operation again later.
Load(<data/703ad7188e>, 0, 0) returned error, retrying after 1.368401966s: We encountered an internal error.  Please retry the operation again later.
Load(<data/703ad7188e>, 0, 0) returned error, retrying after 1.992779823s: We encountered an internal error.  Please retry the operation again later.
Load(<data/703ad7188e>, 0, 0) returned error, retrying after 4.530065223s: We encountered an internal error.  Please retry the operation again later.
Load(<data/703ad7188e>, 0, 0) returned error, retrying after 8.463502006s: We encountered an internal error.  Please retry the operation again later.
Load(<data/703ad7188e>, 0, 0) returned error, retrying after 6.471228426s: We encountered an internal error.  Please retry the operation again later.
Load(<data/703ad7188e>, 0, 0) returned error, retrying after 15.86836708s: We encountered an internal error.  Please retry the operation again later.
Load(<data/703ad7188e>, 0, 0) returned error, retrying after 24.708158932s: We encountered an internal error.  Please retry the operation again later.
error for tree e5b84a44:
  ReadFull(<data/703ad7188e>): We encountered an internal error.  Please retry the operation again later.
Load(<data/703ad7188e>, 0, 0) returned error, retrying after 347.783514ms: We encountered an internal error.  Please retry the operation again later.
Load(<data/703ad7188e>, 0, 0) returned error, retrying after 1.009947897s: We encountered an internal error.  Please retry the operation again later.
Load(<data/703ad7188e>, 0, 0) returned error, retrying after 1.140624791s: We encountered an internal error.  Please retry the operation again later.
Load(<data/703ad7188e>, 0, 0) returned error, retrying after 2.210227039s: We encountered an internal error.  Please retry the operation again later.
Load(<data/703ad7188e>, 0, 0) returned error, retrying after 2.962526218s: We encountered an internal error.  Please retry the operation again later.
Load(<data/703ad7188e>, 0, 0) returned error, retrying after 4.383491461s: We encountered an internal error.  Please retry the operation again later.
Load(<data/703ad7188e>, 0, 0) returned error, retrying after 8.122642682s: We encountered an internal error.  Please retry the operation again later.
Load(<data/703ad7188e>, 0, 0) returned error, retrying after 4.299789339s: We encountered an internal error.  Please retry the operation again later.
Load(<data/703ad7188e>, 0, 0) returned error, retrying after 11.061535066s: We encountered an internal error.  Please retry the operation again later.
Load(<data/703ad7188e>, 0, 0) returned error, retrying after 20.605353483s: We encountered an internal error.  Please retry the operation again later.
error for tree e37e930c:
  ReadFull(<data/703ad7188e>): We encountered an internal error.  Please retry the operation again later.
Load(<data/703ad7188e>, 255, 15990387) returned error, retrying after 261.679309ms: Get "https://s3.wasabisys.com/BUCKET-NAME-HERE/data/70/703ad7188ea08f3461a470401ccd79793803ad8cca85128c99f29cf7e54d7cc1": Connection closed by foreign host https://s3.wasabisys.com/BUCKET-NAME-HERE/data/70/703ad7188ea08f3461a470401ccd79793803ad8cca85128c99f29cf7e54d7cc1. Retry again.
Load(<data/703ad7188e>, 254, 16003632) returned error, retrying after 593.716267ms: Get "https://s3.wasabisys.com/BUCKET-NAME-HERE/data/70/703ad7188ea08f3461a470401ccd79793803ad8cca85128c99f29cf7e54d7cc1": Connection closed by foreign host https://s3.wasabisys.com/BUCKET-NAME-HERE/data/70/703ad7188ea08f3461a470401ccd79793803ad8cca85128c99f29cf7e54d7cc1. Retry again.
Load(<data/703ad7188e>, 255, 16005553) returned error, retrying after 462.649356ms: Get "https://s3.wasabisys.com/BUCKET-NAME-HERE/data/70/703ad7188ea08f3461a470401ccd79793803ad8cca85128c99f29cf7e54d7cc1": Connection closed by foreign host https://s3.wasabisys.com/BUCKET-NAME-HERE/data/70/703ad7188ea08f3461a470401ccd79793803ad8cca85128c99f29cf7e54d7cc1. Retry again.
Load(<data/703ad7188e>, 255, 16008001) returned error, retrying after 271.118326ms: Get "https://s3.wasabisys.com/BUCKET-NAME-HERE/data/70/703ad7188ea08f3461a470401ccd79793803ad8cca85128c99f29cf7e54d7cc1": Connection closed by foreign host https://s3.wasabisys.com/BUCKET-NAME-HERE/data/70/703ad7188ea08f3461a470401ccd79793803ad8cca85128c99f29cf7e54d7cc1. Retry again.
  signal interrupt received, cleaning up
[6:18] 0.00%  0 / 16 snapshots
Fatal: repository contains errors

I’ve gone to the Wasabi console and confirmed that 703ad7188ea08f3461a470401ccd79793803ad8cca85128c99f29cf7e54d7cc1 DOES exist.

I had to just cancel the operation since I wasn’t getting anywhere.

I'm currently running `restic check --read-data`, but as you can see, that will take quite a while (thank god(s) Wasabi doesn't charge for API calls).

Is there a way to get restic to surface the underlying HTTP errors from Wasabi/S3? I can't really tell whether this is a repo problem or whether the Wasabi service is acting up.

Hoping to get some pointers here on ways forward. Hopefully I've included enough info.

Update 1: The `restic check --read-data` has loaded the indexes after a bunch of retries; that portion ends with:

operation successful after 7 retries

And now I'm on to reading packs. I'm thinking Wasabi is the one having issues.

Update 2: Keeping an eye on the `restic check --read-data`, I see this every now and then:

Load(<data/70b759d770>, 17115559, 0) returned error, retrying after 603.169242ms: ReadFull: unexpected EOF

I went as far as setting up the AWS CLI to check some of the objects restic was complaining about, so that I could use the `--debug` switch, and got "404 object does not exist".
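The manual check was a head-object call of roughly this shape (bucket name is a placeholder):

```shell
# Probe the pack object directly, bypassing restic, with full wire logging.
aws s3api head-object \
  --endpoint-url https://s3.wasabisys.com \
  --bucket BUCKET-NAME-HERE \
  --key data/70/703ad7188ea08f3461a470401ccd79793803ad8cca85128c99f29cf7e54d7cc1 \
  --debug
```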

I checked via the Wasabi Console and verified that it did exist.

I used the debug output to open a case, but Wasabi support said they didn't see any 4xx's on their end, only plenty of 5xx's.

I decided to restart my router, after which all problems were gone. The issue was on my end the whole time, though I have no idea exactly what it was.