Fatal: repository contains errors

I’ve been using rclone and OneDrive for my restic repository. After reading the 0.16.1 changelog entry about the max compression bug, I ran restic check --read-data. The result is:

[147:10:25] 100.00%  54155 / 54155 packs
Fatal: repository contains errors

Above this there are messages like:

rclone: 2023/11/13 20:44:21 ERROR : data/0f/0fdd4094***: Didn't finish writing GET request (wrote 14122245/17124111 bytes): read tcp ***:***->***:443: i/o timeout
pack 0fdd4094*** failed to download: StreamPack: ReadFull: unexpected EOF

I was able to manually download all of these ‘failed to download’ packs from OneDrive and verify their SHA256 hashes.
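For reference, a manual check like this can be sketched in Python. restic names each pack file after the SHA-256 hash of its contents, so a downloaded pack is intact when its hash equals its file name. This is only a sketch: the function names are made up for illustration, and it assumes the packs were saved under their original names into one local directory.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_packs(directory):
    """Map each file name in `directory` to True if it matches the file's SHA-256.

    Assumption: every file in the directory is a pack that kept its
    original name from the repository's data/ folder.
    """
    results = {}
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            results[path.name] = sha256_of_file(path) == path.name.lower()
    return results
```

Calling verify_packs on the download folder returns a dictionary of pack names, with False for any file whose content does not match its name.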

Does it mean my repository is healthy? Do I need to do anything else?

Can you please paste the complete output of check --read-data instead of just parts of it?

The head:

using temporary cache in ***\AppData\Local\restic\restic-check-cache-1665341506
rclone: 2023/11/07 01:21:47 ERROR : Failed to save config after 10 tries: failed to move previous config to backup location: rename C:\ProgramData\scoop\apps\rclone\current\rclone.conf C:\ProgramData\scoop\apps\rclone\current\rclone.conf.old9857041777: Access is denied.
enter password for repository:
repository 8c055ea6 opened (version 2, compression level max)
created new cache in ***\AppData\Local\restic\restic-check-cache-1665341506
create exclusive lock for repository
load indexes
[1:23] 100.00%  13 / 13 index files loaded
check all packs
check snapshots, trees and blobs
[4:27] 100.00%  10 / 10 snapshots
read all data

And then the middle part:

rclone: 2023/11/07 03:38:39 ERROR : Failed to save config after 10 tries: failed to move previous config to backup location: rename C:\ProgramData\scoop\apps\rclone\current\rclone.conf C:\ProgramData\scoop\apps\rclone\current\rclone.conf.old3501875982: Access is denied.
rclone: 2023/11/07 03:46:44 ERROR : data/4e/4e9ac0d921daa65edc22d3***: Didn't finish writing GET request (wrote 17252352/17360874 bytes): read tcp ***->***:443: i/o timeout
Load(<data/4e9ac0d921>, 17360874, 0) returned error, retrying after 426.417049ms: ReadFull: unexpected EOF
Load(<data/4e9ac0d921>, 17360874, 0) operation successful after 1 retries
rclone: 2023/11/07 21:35:42 ERROR : data/e4/07c4d0ac3f428125254dd18e7300b733***: Didn't finish writing GET request (wrote 6093554/16944119 bytes): read tcp ***->***:443: i/o timeout
pack 07c4d0ac3f428125254dd18e7300b733a53484*** failed to download: StreamPack: ReadFull: unexpected EOF

The middle part is very long, so I pasted only a small part of it, but that part covers every kind of message: GET requests for some packs timed out; some of them succeeded after retries, and some ultimately failed to download. There are also rclone config-move errors that occurred every once in a while.

Lastly, the final part:

[147:10:25] 100.00%  54155 / 54155 packs
Fatal: repository contains errors

I am a standard user on Windows; rclone was installed globally with scoop by a separate administrator account. I think this kind of rclone error is trivial:

rclone: 2023/11/07 03:38:39 ERROR : Failed to save config after 10 tries: failed to move previous config to backup location: rename C:\ProgramData\scoop\apps\rclone\current\rclone.conf C:\ProgramData\scoop\apps\rclone\current\rclone.conf.old3501875982: Access is denied.

OneDrive can be temperamental and apply throttling seemingly at random, depending on factors only Microsoft knows. Your description suggests that this is the highly probable cause: random download problems when you hit the OneDrive API hard (during restic check), while everything works when you download the same file when it is “quiet”.

The most common issue is that, by default, rclone uses the same app client_id/secret for everyone on the planet accessing OneDrive through rclone (it is hardcoded in the rclone binary). This is a known, documented problem, and a solution exists:

Create your own keys - have a look at rclone docs.

When done, recreate your rclone remote using the freshly minted client_id and secret. It is not enough to just insert them into the rclone.conf file, as they have to be bound to the access token.

That might not be enough: if, for example, you have a fast Internet connection and a low-latency link to a Microsoft data centre, your traffic might hit the throttling threshold regardless. In that case you can throttle rclone by passing additional rclone flags through restic's -o rclone.args option.
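As a rough sketch of what passing rclone flags through restic could look like, the Python helper below builds the command line. Setting rclone.args replaces restic's default arguments (to my knowledge "serve restic --stdio --b2-hard-delete"), so they are repeated before the extra flags. The repository name and the limit values in the usage note are placeholders, not recommendations, and the helper name is made up for illustration.

```python
import shlex

# restic's default value for rclone.args, as far as I know; overriding the
# option replaces it, so it has to be included again.
DEFAULT_RCLONE_ARGS = ["serve", "restic", "--stdio", "--b2-hard-delete"]


def restic_check_command(repo, extra_rclone_flags):
    """Build a `restic check --read-data` command with extra rclone flags."""
    rclone_args = " ".join(DEFAULT_RCLONE_ARGS + list(extra_rclone_flags))
    return [
        "restic", "-r", repo,
        "-o", "rclone.args=" + rclone_args,
        "check", "--read-data",
    ]


def as_shell_string(command):
    """Render the command for copy and paste into a POSIX-style shell."""
    return shlex.join(command)
```

For example, restic_check_command("rclone:onedrive:restic-repo", ["--tpslimit", "4", "--bwlimit", "5M"]) limits rclone to about 4 API transactions per second and 5 MiB/s. The returned list can be handed to subprocess.run, or printed with as_shell_string.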

But I would start with client_id first.

That sounds like the only problem is that some files failed to download, which suggests that the repository is fine. However, there’s a certain risk that restoring data might run into similar problems.
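To see exactly which packs never downloaded, the saved check output can be scanned for the "failed to download" lines. The sketch below assumes the output was redirected to a text file and that the lines look like the ones quoted in this thread; the function name is made up for illustration.

```python
import re

# Matches lines such as:
#   pack 0fdd4094... failed to download: StreamPack: ReadFull: unexpected EOF
FAILED_PACK = re.compile(r"^pack (\S+) failed to download")


def failed_pack_ids(lines):
    """Return the pack IDs reported as failed, in order, without duplicates."""
    ids = []
    for line in lines:
        match = FAILED_PACK.match(line.strip())
        if match and match.group(1) not in ids:
            ids.append(match.group(1))
    return ids
```

With the log in a file, failed_pack_ids(open("check.log", encoding="utf-8")) gives the list of packs to download and verify by hand ("check.log" is a placeholder name).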

Increasing the number of retries might solve the problem, but that would require changes in restic and is currently not supported.