Good morning,
I work in the public sector, and over the past two years we have migrated our backup strategy to restic.
We back up virtual machines (VMs) daily across multiple locations to local NAS systems using rest-server (TLS, append-only, private repos). Inside the VMs, we also perform daily backups of all data to a separate repository. Additionally, we back up all VMs weekly to a central location. All repositories are checked daily using restic check—more on this later.
Everything runs fully automated. Repository passwords, certificates, and other login data for rest-server are securely stored and can only be decrypted and executed by a specific user or Task. We are very satisfied with this system. Restic is efficient, fast, and meets our security requirements. More locations will be added soon. Therefore, I’d first like to say THANK YOU—thank you for this amazing project and your dedication!
Now to our issue:
We perform daily repository checks with restic check. Sporadically, regardless of the location, we encounter the following errors:
Example:
restic.exe : Load(<data/d1afd7ff32>, 18755454, 0) returned error, retrying after 1.054762386s:
readFull: http2: client connection lost
In C:\XXXX\XXXX\restic_check.ps1:40 Zeichen:1
+ & C:\restic\restic.exe check `
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (Load(<data/d1af...connection lost:String) [], RemoteE
xception
+ FullyQualifiedErrorId : NativeCommandError
Load(<data/119dc143ea>, 18007490, 0) returned error, retrying after 1.379831123s: readFull: http2:
client connection lost
Load(<data/51aa92ab30>, 17019660, 0) returned error, retrying after 821.278225ms: readFull: http2:
client connection lost
Load(<data/51aa92ab30>, 17019660, 0) returned error, retrying after 821.278225ms: readFull: http2:
client connection lost
Load(<data/71460607f1>, 17227144, 0) returned error, retrying after 1.142240439s: readFull: http2:
client connection lost
Load(<data/51aa92ab30>, 17019660, 0) returned error, retrying after 821.278225ms: readFull: http2:
client connection lost
Load(<data/71460607f1>, 17227144, 0) returned error, retrying after 1.142240439s: readFull: http2:
client connection lost
Load(<data/d1279e34b3>, 17598027, 0) returned error, retrying after 1.273035571s: readFull: http2:
client connection lost
Load(<data/51aa92ab30>, 17019660, 0) operation successful after 1 retries
Load(<data/d1afd7ff32>, 18755454, 0) operation successful after 1 retries
Load(<data/71460607f1>, 17227144, 0) operation successful after 1 retries
Load(<data/d1279e34b3>, 17598027, 0) operation successful after 1 retries
Load(<data/119dc143ea>, 18007490, 0) operation successful after 1 retries
These “http2: client connection lost” errors occur exclusively during restic check.
In some cases, the process ends successfully with “operation successful after x retries”. However, in many cases, the process is aborted by restic.
Here’s an excerpt of our check script:
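# Log the restic version for reference
& C:\XXX\restic.exe version
# Check the repository and read a random 10% of the pack data; stderr goes to a per-system log
& C:\XXX\restic.exe check `
    --retry-lock 1h `
    --cleanup-cache `
    --stuck-request-timeout 10m `
    --read-data-subset=10% `
    2> C:\XXX\XXX\check_error_log_${BackupSystem}.txt
# A non-zero exit code means the check failed or was aborted
if ($LASTEXITCODE -ne 0) {
    Send-CheckErrorMessage -BackupSystem $BackupSystem
}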
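In case it helps with diagnosis: since stderr ends up in a log file, the retried loads that never report success could be listed with something like the sketch below. This is not part of our script. The function name Get-UnrecoveredLoad is hypothetical, and the sketch assumes each message keeps "Load(<...>)" and its status ("returned error" or "operation successful") on the same line, as in the output above.

```powershell
# Hypothetical helper, not part of restic or of our script.
# Returns the pack IDs that logged a retry but no later
# "operation successful" line.
function Get-UnrecoveredLoad {
    param([string[]]$LogLines)

    $failed    = [System.Collections.Generic.HashSet[string]]::new()
    $recovered = [System.Collections.Generic.HashSet[string]]::new()

    foreach ($line in $LogLines) {
        if ($line -match 'Load\(<(data/[0-9a-f]+)>.*returned error') {
            # A load that failed at least once and was retried
            [void]$failed.Add($Matches[1])
        }
        elseif ($line -match 'Load\(<(data/[0-9a-f]+)>.*operation successful') {
            # A load that eventually succeeded
            [void]$recovered.Add($Matches[1])
        }
    }

    # Keep only the IDs without a success message
    $failed | Where-Object { -not $recovered.Contains($_) }
}
```

It could be called as `Get-UnrecoveredLoad -LogLines (Get-Content $logPath)`, where `$logPath` stands for the error log written by the check script. For the example output above it would return nothing, because all five packs report success after one retry.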
We added the --stuck-request-timeout 10m flag, but it didn’t improve the situation. Similarly, using -o http2.timeout=10m didn’t help either.
Where do these errors come from? Why do they only occur during check, specifically during read operations?
We are using restic 0.17.3 and rest-server 0.11.
Thank you in advance!