Local Synology Repository (sftp)

I’ve been running restic to back up my Mac laptop to S3 for a couple of years now. I also used to have a repository on a local Drobo mounted via SMB. The Drobo died, and I’ve replaced it with a Synology NAS. Since I’ve got to rework my scripts/configuration for the Synology, I thought I’d try to use a service rather than a mounted share, as the mounting/unmounting on macOS can’t be done securely from the command line (as near as I can tell).

SFTP seemed like an easy answer, but I am having tremendous difficulty with getting the first big backup done. The command
restic -r sftp:restic-backup:/restic/repo backup --exclude-caches --exclude-file /var/tmp/restic.17099 --exclude-file /opt/etc/restic-exclusion-paths --exclude-file /opt/var/tm-standard-exclusion-paths --limit-upload 500 --one-file-system .

fails with:

Fatal: failed to refresh lock in time
error: no result
error: no result
error: no result
subprocess ssh: Connection to <redacted-host> closed by remote host.B, 3 errors ETA 53:02:26
subprocess ssh: client_loop: send disconnect: Broken pipe
Fatal: unable to save snapshot: context canceled

This is despite having the ssh config:

Host restic-backup
User restic
Port <other-port>
Hostname <redacted-host>
ServerAliveInterval 120
ServerAliveCountMax 360

I started with the documented 60/240 ServerAlive options, but tried increasing them to no avail. I’m open to other protocols, but this is a local connection and seems like it should just work. Each execution seems to get through about 1% to 2% of my local storage.

restic 0.15.1 compiled with go1.19.5 on darwin/arm64

I know I’m out of date, but would prefer to solve the problem first before upgrading, if possible.

So a 2+ year old restic version (there have been 10+ releases since then) has some sftp issue. Why do you think anybody would be interested in such an archaeology exercise? Especially since even a brief look at the release notes for those 10+ versions shows that there were many sftp-related fixes and improvements.

There is a big chance that you are trying to reinvent the wheel, working on something that was already fixed. Good luck with that if you have time to waste :)

Restic is a single binary without any dependencies. It does not require special installation. You can download it from the restic website (no need for a package manager, which is very often terribly outdated) and run it from some tmp folder for tests. The only reason not to use the latest one is, in most cases, plain laziness.

Given the number of fixes/enhancements that have been made to the sftp backend since 0.15.1, as Kapitainsky suggests, upgrading would seem to be the prudent course of action.
For what it’s worth, your errors seem to me to indicate the ssh server is terminating the connection, whether because of something restic is/isn’t doing, or because of the configuration of the ssh server. Have you thought about testing other long-running tasks over ssh to see if they are similarly disconnected?
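One rough way to run that kind of test is sketched below. It assumes the restic account on the NAS is allowed to run a shell command such as sleep, which may not be true for an sftp-only account; the function name ssh_soak is made up for illustration.

```shell
#!/bin/sh
# ssh_soak HOST SECONDS
# Hold an otherwise idle ssh session open for SECONDS and report how it ended.
# Assumption: the remote account may execute "sleep" (not true for sftp-only users).
ssh_soak() {
    host=$1
    secs=$2
    start=$(date +%s)
    if ssh "$host" "sleep $secs"; then
        result="survived"
    else
        result="dropped"
    fi
    end=$(date +%s)
    echo "$host: $result after $(( end - start ))s (asked for ${secs}s)"
}
```

Something like `ssh_soak restic-backup 7200` would hold the session for two hours. An idle session mostly exercises the keepalive settings, so copying a large file with sftp or scp to the same host would be a useful second test of a connection under load.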

You probably want to go the other way actually, as in decrease the first number at least - ServerAliveInterval is how often the client will send a null packet to keep the connection alive, so if sending a packet every 60s isn’t enough, you might try lowering it to 30.
ServerAliveCountMax is the maximum number of keepalive messages the client will send without getting a reply before it gives up and disconnects. I’m not sure increasing this provides you any benefit, apart from meaning it takes much longer for the connection to time out after it drops. 240*60 is 4 hours, assuming I’m understanding the ssh_config manpage correctly.
This Unix & Linux Stack Exchange answer seems to have a decent summary of what both options do: https://unix.stackexchange.com/questions/3026/what-do-options-serveraliveinterval-and-clientaliveinterval-in-sshd-config-d
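To make the arithmetic concrete, here is a small sketch that multiplies the two settings to get how long the client keeps waiting on a dead connection before it gives up. The function names are just for illustration; the count of 3 in the last line is the OpenSSH default for ServerAliveCountMax.

```shell
#!/bin/sh
# Seconds of unanswered keepalives before the ssh client disconnects:
# ServerAliveInterval * ServerAliveCountMax
keepalive_timeout() {
    echo $(( $1 * $2 ))
}

# Print the timeout for one interval/count pair in seconds and in hours/minutes.
report() {
    secs=$(keepalive_timeout "$1" "$2")
    echo "interval=$1 count=$2 -> ${secs}s ($(( secs / 3600 ))h $(( secs % 3600 / 60 ))m)"
}

report 60 240    # the documented starting values
report 120 360   # the values in the posted ssh config
report 30 3      # a lower interval with the default count
```

With the posted 120/360 settings the client would wait 12 hours before declaring the connection dead, so raising these numbers only delays the timeout and does nothing to keep the session alive.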

Anyway, if you’re adamant about not upgrading the restic binary (I can’t personally think of any reason you wouldn’t upgrade, but it’s your system), one alternative you could try is using rclone to serve the sftp back-end to restic. That way you’d be using rclone’s sftp implementation, rather than the implementation in the older restic version. If nothing else, this could let you eliminate restic as the source of the issue.
See the restic docs for details of how to set this up:
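Roughly, and only as a sketch: the remote name "synology" and the key path below are made up for illustration, and the host and port stay as the placeholders from the original post. It assumes rclone is installed and that key-based login already works.

```shell
#!/bin/sh
# Hypothetical names: the rclone remote "synology" and the key file path.
REMOTE="synology"
REPO_PATH="/restic/repo"

# create_remote HOST PORT
# Define an rclone remote of type sftp that logs in as "restic" with a key file.
create_remote() {
    rclone config create "$REMOTE" sftp \
        host "$1" user restic port "$2" key_file "$HOME/.ssh/id_ed25519"
}

# Point restic at the same repository through its rclone backend
# instead of its built-in sftp backend.
list_snapshots() {
    restic -r "rclone:${REMOTE}:${REPO_PATH}" snapshots
}
```

After running create_remote once with the real host and port, list_snapshots is a cheap check that restic can reach the repository via rclone; the backup command itself would stay the same apart from the -r value.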

As well as the advice to drag yourself kicking and screaming to a modern version of restic, make sure you’re not doing sftp user/pass, as that won’t work; you need to be using public key authentication for the SFTP backend.

This is fixed in restic 0.16.0: "Restic can fail with `Fatal: failed to refresh lock in time` on slow connections" (Issue #4199, restic/restic on GitHub)

`--limit-upload 500` limits the upload to 500 KiB/s, which is insanely slow for a local backup target.
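For a sense of scale, here is a quick back-of-the-envelope calculation of how long a first backup takes at that rate. The 100 GiB figure is a made-up example, not the actual size of the laptop's data, and the function name is just for illustration.

```shell
#!/bin/sh
# upload_seconds SIZE_GIB RATE_KIB_PER_S
# Time to push SIZE_GIB through a link capped at RATE_KIB_PER_S (integer math).
upload_seconds() {
    size_kib=$(( $1 * 1024 * 1024 ))
    echo $(( size_kib / $2 ))
}

secs=$(upload_seconds 100 500)   # hypothetical 100 GiB at --limit-upload 500
echo "100 GiB at 500 KiB/s: ${secs}s, about $(( secs / 3600 )) hours"
```

That works out to more than two days for 100 GiB, which is in the same range as the 53-hour ETA in the posted output, so dropping or raising the limit for a LAN target should shorten the window in which the connection can be lost.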