I’m using restic with resticprofile to back up my stuff to a Hetzner Storage Box via SFTP. The restic client sits in a residential network whose connection is force-reconnected every night at 4am to acquire a new IPv4 address (the backups themselves run over IPv6). During the forced disconnection, the connection is down for 2 to 5 minutes.
My restic call (as called from resticprofile): restic backup --password-file=password.txt --repo=rclone:hetzner-backup:restic/bc-server --tag=auto --tag=system --verbose /home /root /boot /etc /var/www /var/dockervols
My problem is that restic fails at 4am every night. I’ve tried sftp directly (which apparently has no retry mechanism, as mentioned in sftp backend does not reconnect · Issue #353 · restic/restic · GitHub). I’ve also tried sftp via rclone, but this fails as well, with this message:
rclone: 2024/06/23 04:18:07 ERROR : sftp://redacted@redacted.your-storagebox.de:23/restic/bc-server: Discarding closed SSH connection: read tcp [redacted]:49396->[redacted]:23: read: connection timed out
rclone: 2024/06/23 04:18:07 ERROR : sftp://redacted@redacted.your-storagebox.de:23/restic/bc-server: Discarding closed SSH connection: read tcp [redacted]:34286->[redacted]:23: read: connection timed out
rclone: 2024/06/23 04:18:07 ERROR : sftp://redacted@redacted.your-storagebox.de:23/restic/bc-server: Discarding closed SSH connection: read tcp [redacted]:32836->[redacted]:23: read: network is unreachable
rclone: 2024/06/23 04:18:08 ERROR : data/b9/b94137e03d0173dced345378e6eab2f50d9a9eefe7eb7a1372c2f55feeb9f042: Post request put error: Update ReadFrom failed: connection lost
rclone: 2024/06/23 04:18:08 ERROR : data/b9/b94137e03d0173dced345378e6eab2f50d9a9eefe7eb7a1372c2f55feeb9f042: Post request rcat error: Update ReadFrom failed: connection lost
rclone: 2024/06/23 04:18:08 ERROR : data/e0/e0271416bd38b650d6656bf6fa03f9ba4ef90e1b827635c97641ef145423ecc9: Post request put error: Update ReadFrom failed: connection lost
rclone: 2024/06/23 04:18:08 ERROR : data/e0/e0271416bd38b650d6656bf6fa03f9ba4ef90e1b827635c97641ef145423ecc9: Post request rcat error: Update ReadFrom failed: connection lost
rclone: 2024/06/23 04:22:35 ERROR : locks/90a0c1b687cd068939d31a04a73c8714be86571b61151a1dad0631f18f346f5c: Post request put error: Put mkParentDir failed: mkdir dirExists failed: dirExists stat failed: connection lost
rclone: 2024/06/23 04:22:35 ERROR : locks/90a0c1b687cd068939d31a04a73c8714be86571b61151a1dad0631f18f346f5c: Post request rcat error: Put mkParentDir failed: mkdir dirExists failed: dirExists stat failed: connection lost
unable to refresh lock: server response unexpected: 500 Internal Server Error (500)
Fatal: unable to save snapshot: server response unexpected: 500 Internal Server Error (500)
There is also this PR, Rework backend retries by MichaelEischer · Pull Request #4784 · restic/restic · GitHub, but I don’t know how it helps or how to configure the retry count.
Some more info:
Restic version: restic 0.16.4 compiled with go1.21.6 on linux/amd64
My .ssh/config:
Host redacted.your-storagebox.de
  Port 23
  User redacted
  IdentityFile ~/.ssh/id_ed25519
  ServerAliveInterval 60
  ServerAliveCountMax 240
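As a rough sanity check on the keepalive values above: assuming the OpenSSH client is what reads this file, and going by the ssh_config rule that the client gives up after roughly ServerAliveInterval × ServerAliveCountMax seconds without a reply, the arithmetic can be sketched like this. The function name is made up for illustration.

```python
# Rough estimate of how long the OpenSSH client tolerates an unresponsive
# server, based on the ServerAliveInterval / ServerAliveCountMax values above.
# Assumption: the client disconnects after about interval * count_max seconds.

def keepalive_tolerance_seconds(interval_s: int, count_max: int) -> int:
    """Approximate seconds without a server reply before ssh gives up."""
    return interval_s * count_max


tolerance = keepalive_tolerance_seconds(60, 240)  # values from .ssh/config
outage_max = 5 * 60  # longest outage described in this post: 5 minutes

print(f"keepalive tolerance: {tolerance} s ({tolerance / 3600:.1f} h)")
print(f"longer than a {outage_max} s outage: {tolerance > outage_max}")
```

So, at least for the direct sftp attempt, the keepalive settings by themselves don’t look like the limiting factor for an outage of 2 to 5 minutes.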
My .config/rclone/rclone:
[hetzner-backup]
type = sftp
host = redacted.your-storagebox.de
user = redacted
port = 23
key_file = /root/.ssh/id_ed25519
key_use_agent = false
idle_timeout = 0
Do you know how I can get this to run without having to restart restic by hand every night? A wrapper script wouldn’t be suitable for me, as the scanning process on every restic start takes very long, which I want to avoid. I don’t understand why restic does not retry indefinitely by default, as it should be designed to run for a long time.
Any ideas? Thanks in advance.
