Errors restoring with restic on Windows Server + S3

Hello.
I have a Windows Server 2019 machine to back up (a local 1.4 TB .vhdx file created with Windows Server Backup) and an S3 backend.
I run the backup command and it completes without any issue:

restic -r s3:https://s3.somecloudservice.com:443/backup --password-file C:\key.txt backup D:\WindowsImageBackup

Then I run restore and it shows errors during the process:

C:\>restic -r s3:https://s3.somecloudservice.com:443/backup --password-file C:\key.txt restore b2058b61 --target D:\
repository c6c33138 opened (version 2, compression level auto)
[0:09] 100.00%  88 / 88 index files loaded
restoring <Snapshot b2058b61 of [D:\WindowsImageBackup] at 2023-11-02 17:28:17.3508685 +0300 MSK by addomain\user@server> to D:\
Load(<data/9084af745f>, 15123823, 1642536) returned error, retrying after 566.345324ms: ReadFull: read tcp 10.11.12.13:24815->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
Load(<data/9084af745f>, 15123823, 1642536) returned error, retrying after 566.345324ms: ReadFull: read tcp 10.11.12.13:24815->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
Load(<data/eb55b42404>, 16900531, 0) returned error, retrying after 442.352923ms: ReadFull: read tcp 10.11.12.13:24819->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\31845d83-74d5-4bce-a53e-a0356e5bf4f2.vhdx: StreamPack: ReadFull: read tcp 10.11.12.13:24817->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\31845d83-74d5-4bce-a53e-a0356e5bf4f2.vhdx: StreamPack: ReadFull: read tcp 10.11.12.13:37424->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\31845d83-74d5-4bce-a53e-a0356e5bf4f2.vhdx: StreamPack: ReadFull: read tcp 10.11.12.13:24818->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\0d14c486-c3a0-46ea-9f3e-1c07b67894f5_Writerd61d61c8-d73a-4eee-8cdd-f6f9786b7124.xml: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\31845d83-74d5-4bce-a53e-a0356e5bf4f2.vhdx: StreamPack: ReadFull: read tcp 10.11.12.13:37428->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\0d14c486-c3a0-46ea-9f3e-1c07b67894f5_Writere8132975-6f93-4464-a53e-1050253ae220.xml: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\0d14c486-c3a0-46ea-9f3e-1c07b67894f5_Writer0bada1de-01a9-4625-8278-69e735f39dd2.xml: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\0d14c486-c3a0-46ea-9f3e-1c07b67894f5_Writer1072ae1c-e5a7-4ea1-9e4a-6f7964656570.xml: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\0d14c486-c3a0-46ea-9f3e-1c07b67894f5_AdditionalFilesc3b9f3c7-5e52-4d5e-8b20-19adc95a34c7.xml: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\0d14c486-c3a0-46ea-9f3e-1c07b67894f5_Writer76fe1ac4-15f7-4bcd-987e-8e1acb462fb7.xml: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\0d14c486-c3a0-46ea-9f3e-1c07b67894f5_Writerafbab4a2-367d-4d15-a586-71dbb18f8485.xml: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Logs\Backup_Error-01-11-2023_07-10-59.log: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\0d14c486-c3a0-46ea-9f3e-1c07b67894f5_Writer2a40fd15-dfca-4aa8-a654-1f8c654603f6.xml: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Catalog\BackupGlobalCatalog: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\0d14c486-c3a0-46ea-9f3e-1c07b67894f5_Writer542da469-d3e1-473c-9f4f-7847f01fc64f.xml: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\0d14c486-c3a0-46ea-9f3e-1c07b67894f5_Writer59b1f0cf-90ef-465f-9609-6ca8b2938366.xml: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\0d14c486-c3a0-46ea-9f3e-1c07b67894f5_Writera6ad56c2-b509-4e6c-bb19-49d8f43532f0.xml: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\0d14c486-c3a0-46ea-9f3e-1c07b67894f5_Writer41e12264-35d8-479b-8e5c-9b23d1dad37e.xml: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\0d14c486-c3a0-46ea-9f3e-1c07b67894f5_Writer7e47b561-971a-46e6-96b9-696eeaa53b2a.xml: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\31845d83-74d5-4bce-a53e-a0356e5bf4f2.vhdx: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\BackupSpecs.xml: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\MediaId: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\0d14c486-c3a0-46ea-9f3e-1c07b67894f5_Components.xml: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\0d14c486-c3a0-46ea-9f3e-1c07b67894f5_Writer75dfb225-e2e4-4d39-9ac9-ffaff65ddf06.xml: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\0d14c486-c3a0-46ea-9f3e-1c07b67894f5_Writerbe000cbe-11fe-4426-9c58-531aa6355fc4.xml: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\0d14c486-c3a0-46ea-9f3e-1c07b67894f5_RegistryExcludes.xml: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\0d14c486-c3a0-46ea-9f3e-1c07b67894f5_Writer4969d978-be47-48b0-b100-f328f07ac1e0.xml: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Backup 2023-11-01 071059\0d14c486-c3a0-46ea-9f3e-1c07b67894f5_Writer4dc3bdd4-ab48-4d07-adb0-3bee2926fd7f.xml: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
ignoring error for \D\WindowsImageBackup\server\Catalog\GlobalCatalog: StreamPack: ReadFull: read tcp 10.11.12.13:24816->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
Load(<data/8337c383f6>, 5599356, 11202383) returned error, retrying after 502.312451ms: read tcp 10.11.12.13:10117->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
Load(<data/8337c383f6>, 5599356, 11202383) operation successful after 1 retries
Load(<data/f153121586>, 17233225, 0) returned error, retrying after 644.634766ms: ReadFull: read tcp 10.11.12.13:17761->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
Load(<data/b5d7bafb21>, 9956283, 6351043) returned error, retrying after 474.278926ms: read tcp 10.11.12.13:17769->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
Load(<data/5380a41969>, 17823475, 0) returned error, retrying after 655.882519ms: ReadFull: read tcp 10.11.12.13:17775->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
Load(<data/ac1478ab01>, 15550945, 0) returned error, retrying after 371.79632ms: read tcp 10.11.12.13:17780->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
Load(<data/b5d7bafb21>, 9956283, 6351043) operation successful after 1 retries
Load(<data/ac1478ab01>, 15550945, 0) operation successful after 1 retries
Load(<data/f153121586>, 17233225, 0) operation successful after 1 retries
Load(<data/5380a41969>, 17823475, 0) operation successful after 1 retries
Summary: Restored 25 / 32 files/dirs (1.270 TiB / 1.271 TiB) in 3:33:50
Fatal: There were 29 errors

After the restore I run a hash check on the restored file:

certutil -hashfile 31845d83-74d5-4bce-a53e-a0356e5bf4f2.vhdx

and the hash differs from that of the source file I backed up.
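
For completeness, the same check can be run with an explicit algorithm on both the source file and the restored copy (certutil uses SHA1 when no algorithm is given); the paths below are placeholders:

certutil -hashfile "D:\WindowsImageBackup\...\31845d83-74d5-4bce-a53e-a0356e5bf4f2.vhdx" SHA256
certutil -hashfile "D:\restore-target\...\31845d83-74d5-4bce-a53e-a0356e5bf4f2.vhdx" SHA256
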
I tried different combinations of parameters on backup and restore with no success (an example of how they were combined follows the list):

  1. Using rclone to serve restic
  2. --use-fs-snapshot
  3. -o s3.list-objects-v1=true
  4. -o s3.region="provider-specific"
  5. --pack-size 8
  6. --pack-size 32
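
An example of how these were combined (a sketch only; the exact values and combinations varied between runs):

restic -r s3:https://s3.somecloudservice.com:443/backup --password-file C:\key.txt -o s3.list-objects-v1=true -o s3.region="provider-specific" --pack-size 8 backup --use-fs-snapshot D:\WindowsImageBackup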

I also tried creating a new S3 repo, still with no success.
I might suspect something is wrong with my S3 provider, but when I tried Kopia it backed up and restored the file with a full hash match.
Is there anything else I can try to figure out the problem?

My Linux backup command line looks like this. It might be worth seeing what happens when you change the URL pattern. Your way may be completely valid, or it may have found an odd edge case that works for backup and not restore. I don’t think this will solve the problem, but it’s worth a shot because it’s quick and easy.

/opt/restic/restic backup --repo s3:s3.amazonaws.com/bucket-name
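
For example (just a sketch; substitute your actual endpoint, bucket and password file), the same repository can be addressed either with the explicit scheme/port form you're using or with the bare host/bucket form:

restic -r s3:https://s3.somecloudservice.com:443/backup --password-file C:\key.txt snapshots
restic -r s3:s3.somecloudservice.com/backup --password-file C:\key.txt snapshots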

Next, double-check that the repo was created correctly in S3, i.e. that the data / index / keys folders exist and that there's data in the data folder.
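
One quick way to verify this without browsing the bucket directly (a sketch, reusing the repo URL from your post) is to list repository objects through restic itself:

restic -r s3:https://s3.somecloudservice.com:443/backup --password-file C:\key.txt list keys
restic -r s3:https://s3.somecloudservice.com:443/backup --password-file C:\key.txt list index
restic -r s3:https://s3.somecloudservice.com:443/backup --password-file C:\key.txt list packs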

It looks to me like S3 is refusing the connection. Suggest you first check CloudTrail to see if there’s anything interesting in the log, but there probably won’t be.

Next, turn on or create a CloudTrail trail that records S3 data events. Do a backup, wait 10 minutes (only to make it easier to find the logs), then try a restore. Have a look in CloudTrail for one data-store event during the backup and one data-retrieve event during the restore. I'd like to see the CloudTrail events, but they might tell you the answer without you needing to post them.
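
If the bucket is on AWS proper, enabling S3 data events on an existing trail looks roughly like this (a sketch, assuming the AWS CLI, a trail named my-trail, and a placeholder bucket-name; S3-compatible providers have their own access-log mechanisms instead):

aws cloudtrail put-event-selectors --trail-name my-trail --event-selectors '[{"ReadWriteType":"All","IncludeManagementEvents":true,"DataResources":[{"Type":"AWS::S3::Object","Values":["arn:aws:s3:::bucket-name/"]}]}]'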

Please report back what works and what doesn’t work, with as much information and log data as you’re comfortable sharing about what happens when you try the suggestions above.

Hi! I’m running almost the same setup as 6elk, backing up on Windows Server 2019, but instead of using Amazon as a cloud provider I’m using the S3 compatible Backblaze cloud.

The problem seems to be exactly the same, as the backup command runs without any errors and seems to be doing its job (the bucket is being updated regularly). Right now we are running a test restore to confirm restore time and the reliability of the data.

The errors we are getting are the same, and so are the commands.

Any update on this issue?

Thanks!

Which restic version are you using?

The problem seems to be related to

Load(<data/9084af745f>, 15123823, 1642536) returned error, retrying after 566.345324ms: ReadFull: read tcp 10.11.12.13:24815->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
Load(<data/9084af745f>, 15123823, 1642536) returned error, retrying after 566.345324ms: ReadFull: read tcp 10.11.12.13:24815->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.
Load(<data/eb55b42404>, 16900531, 0) returned error, retrying after 442.352923ms: ReadFull: read tcp 10.11.12.13:24819->1.2.3.4:443: wsarecv: An existing connection was forcibly closed by the remote host.

(not sure why the 9084af745f line is duplicated). For those warnings there’s no corresponding operation successful after 1 retries message. My guess is that the wrong file hash is related to those errors, so there might be something wrong with the error handling. But I don’t have time to dig into that right now. I’ll try to have a closer look in the next few days.

@marcosppjr Can you provide a snippet of the error message you’re seeing?

Based on the error messages and the observed symptoms my guess for what’s going on is the following:

As there are no successful retries for 9084af745f and eb55b42404, it means that those pack files took longer than 15 minutes to process. That can happen for two reasons: either the S3 network connection gets stuck for some reason, or a certain blob is contained very often in the disk image (thus taking a lot of time until restore has written all occurrences to disk). My guess is that a combination of both problems causes the network connection to fail from time to time.

As a temporary workaround it should work to patch https://github.com/restic/restic/blob/1a5efcf68077e21805fc439c6349f2c010bf7dfd/internal/repository/repository.go#L31 to const MaxStreamBufferSize = 32 * 1024 * 1024 for --pack-size 16 (the default). This ensures that restore downloads the whole pack file at once and thus should avoid the download error if writing blobs takes too long. (The parameter combination from the first post can’t help here.)

I’m working on a more permanent fix, but implementing that properly requires quite a few changes.

I’ve opened PR #4626 (Improve reliability of large restores), which together with PR #4624 (Improve restorer error reporting) should fix the issue.

Can MaxStreamBufferSize be set via the command line? I tried restore with the command-line parameters

-o MaxStreamBufferSize=32*1024*1024 --pack-size 16

but still see warnings with no corresponding successful retries.

No, MaxStreamBufferSize can only be modified in the source code.

Unfortunately, the problem persists with restic 0.16.3. The hashes of the source and restored files still mismatch. It is a 1.6 TB .vhdx drive image.
Restore log:

C:\>restic -r s3:https://s3.somecloudservice.com:443/backup --password-file C:\key.txt restore c9e59b65:\\server\d$ --target E:\
repository c6c33138 opened (version 2, compression level auto)
found 4 old cache directories in C:\Users\user\AppData\Local\restic, run `restic cache --cleanup` to remove them
[0:14] 100.00%  114 / 114 index files loaded
restoring <Snapshot c9e59b65 of [\\server\d$\WindowsImageBackup] at 2024-01-15 09:26:41.3129425 +0300 MSK by domain\user@server> to E:\
Load(<data/eb55b42404>, 16900531, 0) returned error, retrying after 394.952311ms: ReadFull: unexpected EOF
Load(<data/f3ffe2cc5c>, 11229660, 5845013) returned error, retrying after 383.219333ms: unexpected EOF
Load(<data/7d1a74e24a>, 3987572, 12679094) returned error, retrying after 596.568528ms: ReadFull: unexpected EOF
Load(<data/59450a4b06>, 17621629, 0) returned error, retrying after 273.556461ms: ReadFull: unexpected EOF
Load(<data/f3ffe2cc5c>, 11229660, 5845013) operation successful after 1 retries
Load(<data/7d1a74e24a>, 3987572, 12679094) operation successful after 1 retries
Load(<data/eb55b42404>, 16900531, 0) operation successful after 1 retries
ignoring error for \WindowsImageBackup\server\Backup 2024-01-11 124635\31845d83-74d5-4bce-a53e-a0356e5bf4f2.vhdx: StreamPack: unexpected EOF
ignoring error for \WindowsImageBackup\server\Backup 2024-01-11 124635\31845d83-74d5-4bce-a53e-a0356e5bf4f2.vhdx: StreamPack: ReadFull: unexpected EOF
ignoring error for \WindowsImageBackup\server\Backup 2024-01-11 124635\31845d83-74d5-4bce-a53e-a0356e5bf4f2.vhdx: StreamPack: ReadFull: unexpected EOF
Load(<data/35d1d22211>, 14977489, 0) returned error, retrying after 514.749702ms: unexpected EOF
Load(<data/35d1d22211>, 14977489, 0) returned error, retrying after 514.749702ms: unexpected EOF
Load(<data/2f7965fcb1>, 17288290, 0) returned error, retrying after 441.212871ms: ReadFull: unexpected EOF
Load(<data/8efa7a7031>, 16648477, 0) returned error, retrying after 687.260424ms: unexpected EOF
Load(<data/8efa7a7031>, 16648477, 0) returned error, retrying after 687.260424ms: unexpected EOF
Load(<data/ec481c8e2d>, 9578408, 7471499) returned error, retrying after 274.019978ms: unexpected EOF
Load(<data/ec481c8e2d>, 9578408, 7471499) operation successful after 1 retries
Load(<data/35d1d22211>, 14977489, 0) returned error, retrying after 627.648967ms: unexpected EOF
Load(<data/8efa7a7031>, 16648477, 0) operation successful after 1 retries
Load(<data/2e9178910e>, 4321302, 5127999) returned error, retrying after 501.787366ms: unexpected EOF
Load(<data/2f7965fcb1>, 17288290, 0) returned error, retrying after 508.060233ms: ReadFull: unexpected EOF
Load(<data/2e9178910e>, 4321302, 5127999) operation successful after 1 retries
Load(<data/35d1d22211>, 14977489, 0) operation successful after 2 retries
Load(<data/2f7965fcb1>, 17288290, 0) operation successful after 2 retries
Load(<data/45065bbc36>, 15254562, 0) returned error, retrying after 616.866126ms: ReadFull: unexpected EOF
Load(<data/dade5e7f32>, 6678410, 2519674) returned error, retrying after 376.62474ms: unexpected EOF
Load(<data/45065bbc36>, 15254562, 0) returned error, retrying after 589.051692ms: ReadFull: unexpected EOF
Load(<data/dade5e7f32>, 6678410, 2519674) operation successful after 1 retries
Load(<data/45065bbc36>, 15254562, 0) operation successful after 2 retries
Load(<data/a0ead00d10>, 12461047, 1794176) returned error, retrying after 555.620073ms: ReadFull: unexpected EOF
Load(<data/e4c6a0fc4d>, 14730395, 1197805) returned error, retrying after 381.755729ms: ReadFull: unexpected EOF
Load(<data/5edfa6b1ce>, 2203230, 14670286) returned error, retrying after 644.23286ms: ReadFull: unexpected EOF
Load(<data/a0ead00d10>, 12461047, 1794176) operation successful after 1 retries
Load(<data/5edfa6b1ce>, 2203230, 14670286) operation successful after 1 retries
Load(<data/e0c721c612>, 3686556, 1305739) returned error, retrying after 573.156842ms: unexpected EOF
Load(<data/e4c6a0fc4d>, 14730395, 1197805) operation successful after 1 retries
Load(<data/d072b76ba5>, 16477545, 0) returned error, retrying after 365.975865ms: ReadFull: unexpected EOF
Load(<data/e0c721c612>, 3686556, 1305739) operation successful after 1 retries
Load(<data/a80c516e45>, 5112680, 11823397) returned error, retrying after 287.479812ms: ReadFull: unexpected EOF
Load(<data/d072b76ba5>, 16477545, 0) returned error, retrying after 894.425187ms: ReadFull: unexpected EOF
Load(<data/a80c516e45>, 5112680, 11823397) returned error, retrying after 971.809337ms: ReadFull: unexpected EOF
Load(<data/a80c516e45>, 5112680, 11823397) operation successful after 2 retries
Load(<data/98c99556f7>, 17288920, 0) returned error, retrying after 462.82403ms: unexpected EOF
Load(<data/d7084a5499>, 16877468, 0) returned error, retrying after 651.776768ms: ReadFull: unexpected EOF
Load(<data/d072b76ba5>, 16477545, 0) returned error, retrying after 596.837829ms: unexpected EOF
Load(<data/d7084a5499>, 16877468, 0) operation successful after 1 retries
Load(<data/d072b76ba5>, 16477545, 0) operation successful after 3 retries
Load(<data/98c99556f7>, 17288920, 0) operation successful after 1 retries
Load(<data/45ebdde8da>, 16943811, 0) returned error, retrying after 351.860784ms: ReadFull: unexpected EOF
Load(<data/8dfbfda820>, 15056852, 0) returned error, retrying after 334.740261ms: ReadFull: unexpected EOF
Load(<data/45ebdde8da>, 16943811, 0) returned error, retrying after 624.241461ms: ReadFull: unexpected EOF
Load(<data/2c30545aeb>, 17215489, 0) returned error, retrying after 320.058853ms: ReadFull: unexpected EOF
Load(<data/4eaddfd23f>, 3035937, 4724367) returned error, retrying after 528.556072ms: ReadFull: unexpected EOF
Load(<data/2c30545aeb>, 17215489, 0) operation successful after 1 retries
Load(<data/8dfbfda820>, 15056852, 0) returned error, retrying after 1.004031569s: ReadFull: unexpected EOF
Load(<data/45ebdde8da>, 16943811, 0) returned error, retrying after 1.654457721s: ReadFull: unexpected EOF
Load(<data/45ebdde8da>, 16943811, 0) operation successful after 3 retries
Load(<data/4eaddfd23f>, 3035937, 4724367) operation successful after 1 retries
Load(<data/ad3708b258>, 5661690, 699917) returned error, retrying after 365.759779ms: unexpected EOF
Load(<data/ad3708b258>, 5661690, 699917) operation successful after 1 retries
Load(<data/e54a525a60>, 5811125, 1637095) returned error, retrying after 465.148631ms: ReadFull: unexpected EOF
Load(<data/ad3708b258>, 6090621, 10956080) returned error, retrying after 460.146439ms: ReadFull: unexpected EOF
Load(<data/8dfbfda820>, 15056852, 0) operation successful after 2 retries
Load(<data/ad3708b258>, 6090621, 10956080) operation successful after 1 retries
Load(<data/e54a525a60>, 5811125, 1637095) operation successful after 1 retries
Load(<data/ec860f2810>, 13917630, 0) returned error, retrying after 637.376012ms: unexpected EOF
Load(<data/ec860f2810>, 13917630, 0) operation successful after 1 retries
Load(<data/9fdb6d04ea>, 19652737, 0) returned error, retrying after 313.773852ms: unexpected EOF
Load(<data/47810b1a70>, 15183100, 0) returned error, retrying after 642.446302ms: unexpected EOF
Load(<data/20732516d5>, 16430193, 702853) returned error, retrying after 593.014692ms: ReadFull: unexpected EOF
Load(<data/47810b1a70>, 15183100, 0) operation successful after 1 retries
Load(<data/d70509d7cf>, 17138182, 0) returned error, retrying after 648.161506ms: ReadFull: unexpected EOF
Load(<data/d70509d7cf>, 17138182, 0) operation successful after 1 retries
Load(<data/9fdb6d04ea>, 19652737, 0) operation successful after 1 retries
Load(<data/20732516d5>, 16430193, 702853) operation successful after 1 retries
Load(<data/15690cb14a>, 16942709, 0) returned error, retrying after 508.704933ms: ReadFull: unexpected EOF
Load(<data/24b7e874ea>, 18541652, 544987) returned error, retrying after 529.226996ms: ReadFull: unexpected EOF
Load(<data/15690cb14a>, 16942709, 0) operation successful after 1 retries
Load(<data/422c23c4f4>, 17625276, 0) returned error, retrying after 266.40091ms: ReadFull: unexpected EOF
Load(<data/24b7e874ea>, 18541652, 544987) operation successful after 1 retries
Load(<data/422c23c4f4>, 17625276, 0) operation successful after 1 retries
Summary: Restored 30 / 31 files/dirs (1.493 TiB / 1.493 TiB) in 5:13:02
Fatal: There were 3 errors

Hmm, it looks like we’re heading in the right direction, but we’re not quite there yet. Apparently, we also need to resolve the other two items discussed in issue #4627 (Improve reliability of restore). This will unfortunately take quite a bit of time to implement and get into a released version; the changes are too invasive to include in a patch release.

The underlying problem is that a significant number of the network connections seem to be interrupted for some reason external to restic, which triggers problems with the request retry handling in restic.

As a temporary workaround you could patch restic using the following patch for restic 0.16.3 (this is the previously suggested change, plus an additional timeout change): (Feel free to ask if you need help compiling; it’s not too complicated with Go, even on Windows)

diff --git a/internal/backend/retry/backend_retry.go b/internal/backend/retry/backend_retry.go
index 9c51efedc..b10697559 100644
--- a/internal/backend/retry/backend_retry.go
+++ b/internal/backend/retry/backend_retry.go
@@ -73,6 +73,7 @@ func (be *Backend) retry(ctx context.Context, msg string, f func() error) error
                // speed up integration tests
                bo.InitialInterval = 1 * time.Millisecond
        }
+       bo.MaxElapsedTime = 60 * time.Minute
 
        err := retryNotifyErrorWithSuccess(f,
                backoff.WithContext(backoff.WithMaxRetries(bo, uint64(be.MaxTries)), ctx),
diff --git a/internal/repository/repository.go b/internal/repository/repository.go
index 0b50382b8..7f05e06f4 100644
--- a/internal/repository/repository.go
+++ b/internal/repository/repository.go
@@ -28,7 +28,7 @@ import (
        "golang.org/x/sync/errgroup"
 )
 
-const MaxStreamBufferSize = 4 * 1024 * 1024
+const MaxStreamBufferSize = 32 * 1024 * 1024
 
 const MinPackSize = 4 * 1024 * 1024
 const DefaultPackSize = 16 * 1024 * 1024
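
To apply it (a rough sketch, assuming Go is installed and the restic 0.16.3 source is checked out; the patch file name is just an example), save the diff to a file and build a patched binary from the source root:

git apply restore-workaround.patch
go build ./cmd/restic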

Alternatively, you could try whether setting the environment variable GODEBUG=http2client=0 changes anything.
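
On Windows, that can be set for the current cmd.exe session before running restic, for example:

set GODEBUG=http2client=0
restic -r s3:https://s3.somecloudservice.com:443/backup --password-file C:\key.txt restore c9e59b65 --target E:\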

I’m having significant problems trying to copy large snapshots between an S3 cloud service and a local repo. I’m using restic 0.16.4 on Ubuntu.

Load(<data/af2cd12f5d>, 16867901, 0) returned error, retrying after 919.631755ms: Get "https://s3.somecloudservice.com/bucket/data/af/af2cd12f5dbd16d5a074854df4d61f9d7db398df8f97fa3f0e7db2bbb05e71cc": Connection closed by foreign host https://s3.somecloudservice.com/bucket/data/af/af2cd12f5dbd16d5a074854df4d61f9d7db398df8f97fa3f0e7db2bbb05e71cc. Retry again.
Load(<data/9fbfcd04f0>, 19270402, 0) returned error, retrying after 519.79983ms: Get "https://s3.somecloudservice.com/bucket/data/9f/9fbfcd04f050c9aff31c76f92b8792dc5cffc90290efcda41d4bb7bae3901694": Connection closed by foreign host https://s3.somecloudservice.com/bucket/data/9f/9fbfcd04f050c9aff31c76f92b8792dc5cffc90290efcda41d4bb7bae3901694. Retry again.
Load(<data/9fbfcd04f0>, 19270402, 0) returned error, retrying after 914.866669ms: Get "https://s3.somecloudservice.com/bucket/data/9f/9fbfcd04f050c9aff31c76f92b8792dc5cffc90290efcda41d4bb7bae3901694": Connection closed by foreign host https://s3.somecloudservice.com/bucket/data/9f/9fbfcd04f050c9aff31c76f92b8792dc5cffc90290efcda41d4bb7bae3901694. Retry again.
Load(<data/9fbfcd04f0>, 19270402, 0) returned error, retrying after 1.380435756s: Get "https://s3.somecloudservice.com/bucket/data/9f/9fbfcd04f050c9aff31c76f92b8792dc5cffc90290efcda41d4bb7bae3901694": Connection closed by foreign host https://s3.somecloudservice.com/bucket/data/9f/9fbfcd04f050c9aff31c76f92b8792dc5cffc90290efcda41d4bb7bae3901694. Retry again.
Load(<data/9fbfcd04f0>, 19270402, 0) returned error, retrying after 1.318798524s: Get "https://s3.somecloudservice.com/bucket/data/9f/9fbfcd04f050c9aff31c76f92b8792dc5cffc90290efcda41d4bb7bae3901694": Connection closed by foreign host https://s3.somecloudservice.com/bucket/data/9f/9fbfcd04f050c9aff31c76f92b8792dc5cffc90290efcda41d4bb7bae3901694. Retry again.
Load(<data/9fbfcd04f0>, 19270402, 0) returned error, retrying after 1.786073249s: Get "https://s3.somecloudservice.com/bucket/data/9f/9fbfcd04f050c9aff31c76f92b8792dc5cffc90290efcda41d4bb7bae3901694": Connection closed by foreign host https://s3.somecloudservice.com/bucket/data/9f/9fbfcd04f050c9aff31c76f92b8792dc5cffc90290efcda41d4bb7bae3901694. Retry again.
Load(<data/9fbfcd04f0>, 19270402, 0) returned error, retrying after 3.838571678s: Get "https://s3.somecloudservice.com/bucket/data/9f/9fbfcd04f050c9aff31c76f92b8792dc5cffc90290efcda41d4bb7bae3901694": Connection closed by foreign host https://s3.somecloudservice.com/bucket/data/9f/9fbfcd04f050c9aff31c76f92b8792dc5cffc90290efcda41d4bb7bae3901694. Retry again.
Load(<data/9fbfcd04f0>, 19270402, 0) returned error, retrying after 5.318800934s: Get "https://s3.somecloudservice.com/bucket/data/9f/9fbfcd04f050c9aff31c76f92b8792dc5cffc90290efcda41d4bb7bae3901694": Connection closed by foreign host https://s3.somecloudservice.com/bucket/data/9f/9fbfcd04f050c9aff31c76f92b8792dc5cffc90290efcda41d4bb7bae3901694. Retry again.
Load(<data/9fbfcd04f0>, 19270402, 0) returned error, retrying after 4.354976446s: Get "https://s3.somecloudservice.com/bucket/data/9f/9fbfcd04f050c9aff31c76f92b8792dc5cffc90290efcda41d4bb7bae3901694": Connection closed by foreign host https://s3.somecloudservice.com/bucket/data/9f/9fbfcd04f050c9aff31c76f92b8792dc5cffc90290efcda41d4bb7bae3901694. Retry again.
Load(<data/9fbfcd04f0>, 19270402, 0) returned error, retrying after 10.458154634s: Get "https://s3.somecloudservice.com/bucket/data/9f/9fbfcd04f050c9aff31c76f92b8792dc5cffc90290efcda41d4bb7bae3901694": Connection closed by foreign host https://s3.somecloudservice.com/bucket/data/9f/9fbfcd04f050c9aff31c76f92b8792dc5cffc90290efcda41d4bb7bae3901694. Retry again.
Load(<data/9fbfcd04f0>, 19270402, 0) returned error, retrying after 11.76452645s: Get "https://s3.somecloudservice.com/bucket/data/9f/9fbfcd04f050c9aff31c76f92b8792dc5cffc90290efcda41d4bb7bae3901694": Connection closed by foreign host https://s3.somecloudservice.com/bucket/data/9f/9fbfcd04f050c9aff31c76f92b8792dc5cffc90290efcda41d4bb7bae3901694. Retry again.
[9:51:36] 91.03%  90895 / 99853 packs copied
could not load snapshots: context canceled
Fatal: StreamPack: Get "https://s3.somecloudservice.com/bucket/data/9f/9fbfcd04f050c9aff31c76f92b8792dc5cffc90290efcda41d4bb7bae3901694": Connection closed by foreign host https://s3.somecloudservice.com/bucket/data/9f/9fbfcd04f050c9aff31c76f92b8792dc5cffc90290efcda41d4bb7bae3901694. Retry again.

@MichaelEischer is this related to Issue #4627 ?

Or am I experiencing a different problem entirely? The error messages seem to occur at random when using restic copy. Sometimes they happen early in the transfer, other times much later. At this point I’m considering using something like rclone to copy the entire repo from the S3 service to a local repo. Would that be recommended in a situation like this?

Thanks.

Issue #4627 is basically only relevant for the restore command, the copy command shouldn’t be affected.

Are you able to manually download the file that restic is having trouble with? It looks like the network connection was closed several times in a row which is rather unusual.

That would work as a workaround, although it’s probably easier to just restart copy a few times; the command will continue at approximately the point at which the previous run failed.
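
A minimal sketch of such a re-run (the local repo path and password files are placeholders; snapshots and blobs that already exist in the destination are skipped):

restic -r /srv/restic-local --password-file /root/local-key.txt copy --from-repo s3:https://s3.somecloudservice.com/bucket --from-password-file /root/s3-key.txt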