Error while unlocking: rclone stdio connection already closed

Hello, and thank you for restic. I have been using it for about 1.5 years now, with rclone and Jottacloud as the destination. It works pretty well.

The last few days, however, I get the same error every time I run my script. The script does a backup, a diff, restores a random file for testing, runs a forget/prune and finally a check.
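
Roughly, in restic terms the script runs these commands in order (paths and retention flags omitted here, placeholders in angle brackets):

restic backup <source>
restic diff <parent-snapshot> <latest-snapshot>
restic restore latest --target <tmp-dir> --include <random-file>
restic forget ... --prune
restic check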

During the rebuilding of indexes after the prune, this error appears:

error while unlocking: rclone stdio connection already closed
Fatal: rclone stdio connection already closed
error in cleanup handler: rclone stdio connection already closed

Using this command:

flock -x /data/home/upback/restic/restic-lock /data/home/upback/restic/restic --cache-dir /data/home/upback/restic/cache --cleanup-cache -o rclone.program=/data/home/upback/restic/rclone -o 'rclone.args=serve restic --stdio --b2-hard-delete --drive-use-trash=false --config=/data/home/upback/restic/rclone.conf --jottacloud-hard-delete' -r rclone:jotta:restic/nas3 -p /data/home/upback/restic/restic-repopassword --verbose forget --group-by host --keep-within 21d --keep-daily 60 --keep-weekly 52 --keep-monthly 60 --keep-yearly 10 --prune

After that, the repo stays locked and I have to do a manual unlock, which succeeds.
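
For reference, the unlock is just the same invocation with the unlock command, roughly:

/data/home/upback/restic/restic -o rclone.program=/data/home/upback/restic/rclone -o 'rclone.args=serve restic --stdio --config=/data/home/upback/restic/rclone.conf --jottacloud-hard-delete' -r rclone:jotta:restic/nas3 -p /data/home/upback/restic/restic-repopassword unlock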

Does any kind person know what is going on here / what triggers this error?

Thank you.

Since it has been working without problems for a long time, and all of a sudden started having problems, something must have changed. So, what changed? :slight_smile:

I like that automated random file restore test of yours :eyes: :smiley:

Yep, something has changed, but what? I was hoping for more insight into what triggers this error, in order to pinpoint the problem. The last changes to the restic and rclone binaries, as well as to my script, were made a while ago, so the change we are looking for is not that obvious.

That automated test restore is nice, but it is on my wishlist to improve. With a large repo and only a small daily delta, perhaps 90% of the time the randomly selected file is one that has not changed since the very first backup.
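
Conceptually it is something like the following sketch (shuf and the /tmp/restore-test target are just examples, the repository options are shortened from the command above, and one would also want to filter out directories, since restic ls lists those too):

path=$(restic -r rclone:jotta:restic/nas3 -p /data/home/upback/restic/restic-repopassword ls latest | shuf -n 1)
restic -r rclone:jotta:restic/nas3 -p /data/home/upback/restic/restic-repopassword restore latest --target /tmp/restore-test --include "$path"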

Does the error only show up while unlocking? To debug this, I’d first try to let rclone be more verbose by adding -v to the rclone args. There must be some reason why rclone exits too early.
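
That would mean changing the rclone.args option to something like this (only -v added, the rest unchanged):

-o 'rclone.args=serve restic -v --stdio --b2-hard-delete --drive-use-trash=false --config=/data/home/upback/restic/rclone.conf --jottacloud-hard-delete'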

Did you check whether the kernel log contains anything suspicious (dmesg on Linux), like a terminated process?
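
Something like this should show whether the OOM killer was involved:

dmesg | grep -iE 'out of memory|killed process'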

Yes, the error only shows up while unlocking, more specifically: when unlocking after a forget/prune. I have added the rclone -v flag now, starting with the next run tonight.

Thank you for pointing to the kernel logs, I hadn’t thought of them; I should have checked earlier, because they show:
Out of memory: Kill process 22230 (restic) score 678 or sacrifice child
Killed process 22242 (rclone) total-vm:841860kB, anon-rss:143528kB, file-rss:0kB

This is on a NAS with 2 GB of memory, a repo size of 2 TB and 117 snapshots. Am I asking too much of those 2 GB of internal memory? Does the recent compression (repo format 2) use much more internal memory than without compression (repo format 1)?

That sounds strange; I’d rather expect the error to already trigger while updating the repository index.

It’s rather hard to make generic statements on that. The compressor will require additional memory, but the data to upload to the repository is smaller, which saves memory.

Either try running the Go garbage collector more frequently by setting the environment variable GOGC=50, or reduce the number of CPU cores restic is allowed to use with GOMAXPROCS=2.
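
In the script that is just a matter of setting the variables before the restic invocation, for example:

export GOGC=50
export GOMAXPROCS=2
# then run the existing flock/restic forget --prune command as before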

Well, updating the index during the forget/prune run certainly does not go well (though it shows no error), because after the unlock error restic complains and I have to rebuild the index, which gives a few “adding pack file to index” notifications.
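
The rebuild here means running the rebuild-index command with the same repository options as above, roughly:

/data/home/upback/restic/restic -o rclone.program=/data/home/upback/restic/rclone -o 'rclone.args=serve restic --stdio --config=/data/home/upback/restic/rclone.conf --jottacloud-hard-delete' -r rclone:jotta:restic/nas3 -p /data/home/upback/restic/restic-repopassword rebuild-index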

Thank you, will try.

Thank you, GOGC=50 seems to do the trick.