First snapshot stuck in a loop

Hello. I have a repository with 18 million files occupying 1.4 TB. Reading the files is slow. Whenever the Internet connection is unstable and the backup process has to be restarted, restic starts scanning all the files again, so the backup is stuck in a loop and I can’t finish the first snapshot. The Internet link is not the limiting factor, as it has 100 Mb of upload bandwidth; reading the files is very slow because of other processes running on the server. Is there any way for restic to resume the backup without doing a full scan of the files again? Thank you so much.

Example:
2021-05-11 05:24:07 new /e/Deptos/ALFA/35200826520188000192550010002293211192597968-nfe.xml
2021-05-13 20:51:16 new /e/Deptos/ALFA/35200826520188000192550010002293211192597968-nfe.xml
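
To see how many files are being picked up again after a restart, the log can be checked for paths that were reported as "new" more than once. This is only a rough sketch: it assumes every line in file.log has the same layout as the two example lines above (date, time, status, path), and `repeated_new` is a made-up helper name, not part of restic.

```shell
# repeated_new (hypothetical helper): print "<count> <path>" for every path
# that appears with the status "new" more than once in the given log file(s).
# Assumed line layout: DATE TIME STATUS PATH (as in the example lines above).
repeated_new() {
  awk '$3 == "new" {
         path = $0
         sub(/^[^ ]+ +[^ ]+ +new +/, "", path)   # strip date, time and status
         count[path]++
       }
       END { for (p in count) if (count[p] > 1) print count[p], p }' "$@"
}

# Usage (not run here): repeated_new file.log
```

The path is taken as everything after the status field, so file names containing spaces are kept whole. The order of the output lines is not defined.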

Please paste the complete output of the backup run, including the command and any relevant environment variables, where it starts the rescan you think shouldn’t happen. And if possible, why not add the -vv option as well when running it.

SOURCE=("e:\Deptos" "d:\DATA")
export RESTIC_PASSWORD=passwd

$ nice -n19 restic.exe -vvv --cleanup-cache --cache-dir e:.cache -r sftp::bkp -o "sftp.command=ssh user@host -p 222 -Txq -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o BatchMode=yes -s sftp" --exclude-file excludes.txt "${SOURCE[@]}" 2>> error.log | tee -a file.log

$ restic.exe version
restic 0.12.0 compiled with go1.15.8 on windows/amd64

$ free -m
total used free shared buff/cache available
Mem: 32768 4751 28016 0 0 28016
Swap: 49664 12645 37018

$ uname.exe -a
CYGWIN_NT-6.1 server 3.1.7(0.340/5/3) 2020-08-22 17:48 x86_64 Cygwin

$ cat /proc/cpuinfo
Intel(R) Xeon(R) CPU E5-2420 v2 @ 2.20GHz

error.log

error while unlocking: ssh command exited: exit status 255Fatal: unable to save snapshot: ssh command exited: exit status 255
error in cleanup handler: ssh command exited: exit status 255
Save(<data/e42593c109>) returned error, retrying after 552.330144ms: wrote 229376 bytes instead of the expected 4780361 bytes
Save(<data/a1f630d3ee>) returned error, retrying after 593.411537ms: Write: failed to send packet: write |1: file already closed
Save(<data/1e44ee6ceb>) returned error, retrying after 400.45593ms: Write: failed to send packet: write |1: file already closed
Save(<data/a7b53b1b0b>) returned error, retrying after 468.857094ms: Write: failed to send packet: write |1: file already closed
Save(<data/12dc50fcd6>) returned error, retrying after 282.818509ms: Write: failed to send packet: write |1: file already closed
Save(<data/03d923aa9e>) returned error, retrying after 462.318748ms: Write: failed to send packet: write |1: file already closed
Save(<data/b0d06b0dc6>) returned error, retrying after 656.819981ms: Write: failed to send packet: write |1: file already closed
Save(<data/78a8425832>) returned error, retrying after 507.606314ms: Write: failed to send packet: write |1: file already closed
Save(<data/1afe793164>) returned error, retrying after 328.259627ms: Write: failed to send packet: write |1: file already closed
Save(<data/d28f54fe4b>) returned error, retrying after 582.280027ms: wrote 229376 bytes instead of the expected 4912940 bytes
Save(<data/6e8df2e709>) returned error, retrying after 298.484759ms: Write: failed to send packet: write |1: file already closed
Save(<data/022404e245>) returned error, retrying after 720.254544ms: Write: failed to send packet: write |1: file already closed
error while unlocking: ssh command exited: exit status 255Fatal: unable to save snapshot: ssh command exited: exit status 255
error in cleanup handler: ssh command exited: exit status 255

Need to send more information?

Unless I’m missing something, you don’t have an actual command in your restic run, and the output you pasted is not the “complete output of the backup run”. Please correct the provided information so it includes what was requested, thanks.

I did not understand your request. Can you give an example of what information you want?

No, not yet. See Quicker interrupted backup resumption

In short, there is this open PR:

If you are willing to use experimental code for your initial backup, you can compile this PR by yourself and use it until your initial backup is finished. Then, I would advise you to use an official release.

I really don’t know how to be more specific. I asked that you please paste the following:

  • The complete command you use when you run restic.
  • The complete output from the command when you run restic - all of it, and preferably with the -vv option applied.
  • Any relevant environment variables, which I think you did provide, thank you.
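
One possible way to collect the first two items in a single file is a small wrapper that writes the command line to a log and then appends everything the command prints. This is a sketch under assumptions: `log_run` is a hypothetical helper (not a restic feature), and the file name in the usage comment is only an example.

```shell
# log_run (hypothetical helper): append the exact command line to a log file,
# then run the command with stdout and stderr captured together in that file
# while still showing the output on the terminal.
log_run() {
  logfile="$1"
  shift
  printf '$ %s\n' "$*" >> "$logfile"   # record the command as it was invoked
  "$@" 2>&1 | tee -a "$logfile"        # record all output, errors included
}

# Usage (not run here): log_run backup.log restic.exe -vv <your usual arguments>
```

Only the command line is recorded, so a password exported through RESTIC_PASSWORD does not end up in the log. The function returns the exit status of tee, not of the wrapped command.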

Alexweiss, thank you very much for the clarification. I don’t have a development environment and I don’t know how to compile it. Is there any binary with this code?

@wendell77
To compile it, you need to install a Go compiler, then run:

  • git clone https://github.com/aawsome/restic.git
  • cd restic
  • git checkout backup-resume
  • (as an alternative to the git commands above, download the source code from the GitHub webpage)
  • go run build.go

You should then get the executable.
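
The steps above could be collected into a small script, roughly like the sketch below. It assumes git and a Go toolchain are already on the PATH. The repository URL and branch name are the ones given above; `run`, `build_restic_branch` and the `DRY_RUN` switch are names invented for this example.

```shell
REPO_URL="https://github.com/aawsome/restic.git"   # repository named above
BRANCH="backup-resume"                              # branch named above

# run (hypothetical helper): execute a command, or only print it if DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# build_restic_branch (hypothetical helper): clone the branch and build it.
build_restic_branch() {
  run git clone --branch "$BRANCH" "$REPO_URL" restic &&
  run cd restic &&
  run go run build.go
}

# Usage (not run here):
#   DRY_RUN=1 build_restic_branch   # only print the commands
#   build_restic_branch             # clone and build for real
```

Here `git clone --branch` checks out the branch directly, which replaces the separate checkout step. The dry-run mode is there so the commands can be reviewed before anything is downloaded.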

As far as I understand, @rawtaz asked you to provide the contents of your output file, file.log, as well.

@rawtaz, it seems to me that these are the environment variables and the command:

@alexweiss
Thank you very much for the instructions.

I started running the system as he indicated. It’s still running. The file.log is already larger than 2 GB.

I really appreciate everyone’s attention. I will test the compiled version.