Dear all,
I have seen some strange behaviour with rest-server over the last few days and would like to hear other opinions on it.
The setup is simple: we have one machine I will call “source” and another I will call “target”. They are connected via a VPN; unfortunately, bandwidth is limited. “source” backs up to “target” every day at 4 in the morning. “target” has rest-server running, with no TLS and no authentication. The amount of data being backed up is not huge, about 3GB of changed data per day. “target” saves to an ext4 file system on a RAID 1. The restic version on “source” is 0.16.2, the rest-server version on “target” is 0.12.1, and the restic version on “target” was 0.15. This has been running without a hitch for almost three months.
Last Sunday I ran “restic prune” on “target”, which freed about 30GB of storage space and showed no errors. I accessed the repo directly via the file system rather than through rest-server (which should not make a difference, right?). Afterwards, the daily backup on “source” started to show lots of 500 errors like this:
load index files
Load(<index/af39686255>, 0, 0) returned error, retrying after 743.3945ms: unexpected HTTP response (500): 500 Internal Server Error
Load(<index/69984be909>, 0, 0) returned error, retrying after 262.301053ms: unexpected HTTP response (500): 500 Internal Server Error
Load(<index/7167170f17>, 0, 0) returned error, retrying after 664.075176ms: unexpected HTTP response (500): 500 Internal Server Error
Load(<index/77bfd28ac1>, 0, 0) returned error, retrying after 373.363871ms: unexpected HTTP response (500): 500 Internal Server Error
Load(<index/c2ea6623b4>, 0, 0) returned error, retrying after 583.165966ms: unexpected HTTP response (500): 500 Internal Server Error
Load(<index/03709b8fb7>, 0, 0) returned error, retrying after 276.517677ms: unexpected HTTP response (500): 500 Internal Server Error
Load(<index/1805b56007>, 0, 0) returned error, retrying after 297.219922ms: unexpected HTTP response (500): 500 Internal Server Error
Load(<index/dc91f94f0b>, 0, 0) returned error, retrying after 334.475061ms: unexpected HTTP response (500): 500 Internal Server Error
Load(<index/ca9db29d01>, 0, 0) returned error, retrying after 485.407709ms: unexpected HTTP response (500): 500 Internal Server Error
Load(<index/69984be909>, 0, 0) returned error, retrying after 1.000897011s: unexpected HTTP response (500): 500 Internal Server Error
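To see at a glance which index files were affected and how often each one failed, the retry lines can be tallied with a small script. This is only a rough sketch: the regular expression assumes the exact line format shown above, and the function name `summarize_load_errors` is made up for illustration, not part of restic or rest-server.

```python
import re
from collections import Counter

# Matches restic retry lines in the format shown in the log above, e.g.
# Load(<index/af39686255>, 0, 0) returned error, ...: unexpected HTTP response (500): ...
# Assumption: other restic versions may format these lines differently.
LINE_RE = re.compile(
    r"Load\(<(?P<type>\w+)/(?P<id>[0-9a-f]+)>.*?"
    r"unexpected HTTP response \((?P<status>\d{3})\)"
)


def summarize_load_errors(lines):
    """Count failed Load attempts per (file type, short ID, HTTP status)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            key = (match["type"], match["id"], int(match["status"]))
            counts[key] += 1
    return counts


# Two lines copied from the log above; 69984be909 is retried twice there.
sample = [
    "Load(<index/69984be909>, 0, 0) returned error, retrying after 262.301053ms: "
    "unexpected HTTP response (500): 500 Internal Server Error",
    "Load(<index/69984be909>, 0, 0) returned error, retrying after 1.000897011s: "
    "unexpected HTTP response (500): 500 Internal Server Error",
]

for (file_type, short_id, status), attempts in summarize_load_errors(sample).most_common():
    print(f"{file_type}/{short_id}: HTTP {status}, {attempts} failed attempt(s)")
```

Feeding it the complete backup output instead of `sample` (for example the lines of a saved log file) would show whether the errors hit only a handful of index files or all of them.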
After noticing the errors I logged into “target” to have a look around. I ran “restic repair index”, “restic repair packs” and finally “restic check --read-data”. No errors, everything fine. I then went back to “source” to retry the backup via rest-server. Same thing as before, lots of 500 errors. I then changed the backup location to sftp and tried again. No errors, the backup went smoothly. After that, the backup via rest-server also worked again without errors …
Possibly the rest-server errors were due to bandwidth problems, which could be hard to pin down. Or maybe they were caused by the older restic version (0.15) that was used for the prune on “target”? Any other ideas?
One more thing: I have now enabled the --debug flag on “target”. Would that produce a stack trace on a 500 error? That would be very helpful.
best,