So I backed up 943 GB of data to B2. This took several weeks on my 5 Mbit/s upload line. Once it finished, I immediately ran a backup again, with no changes to the data whatsoever, just to find out how long it takes restic to work out whether there are any differences to upload. It turned out to take just short of 13 hours on my i3 (3.1 GHz) CPU.
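As a rough sanity check on the "several weeks" figure, here is the back-of-the-envelope arithmetic, assuming decimal gigabytes and a fully saturated 5 Mbit/s uplink with no protocol overhead (both assumptions are mine, not stated in the thread):

```python
# Estimate the minimum time for the initial 943 GB upload at 5 Mbit/s.
size_bytes = 943 * 10**9      # 943 GB, decimal units assumed
uplink_bps = 5 * 10**6        # 5 Mbit/s uplink
seconds = size_bytes * 8 / uplink_bps
days = seconds / 86400
print(f"{days:.1f} days")     # about 17.5 days, i.e. two and a half weeks
```

Real-world overhead (TLS, HTTP, retries, the uplink not being idle) only pushes this higher, so "several weeks" is consistent.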
The number of concurrent connections to the B2 service can be set with the -o b2.connections=10 option. By default, at most five parallel connections are established.
Well, since it’s not actually uploading anything, merely checking for diffs, do you think changing this parameter would significantly speed things up?
Could you please tell us which version of restic you’re using (run restic version and paste the output here) and how exactly you run restic? Please post the complete command line, especially the list of files/dirs to save. Then we can try to figure out what’s going on here. Also, which version of restic did you first create the repo with?
Since 0.8.0, restic has a local metadata cache which makes incremental backups very fast, provided a few preconditions are met.
Hm, that sounds reasonable. There’s no easy way to debug what’s going on (yet, working on that), but can you have a look at what restic does when it takes so long? Does it use a lot of CPU maybe?
I see that there is a conservative default of 5, the example in the documentation is 10, but I’ve seen users reporting setting this value as high as 500.
What influences the optimal value for this parameter? Is it just guess and check? Is there a practical upper bound?
There is a practical upper bound of about 20 or so, because that’s the maximum concurrency that is used during backup. In the new archiver code (I’m currently working on that) it will be changed.
In the meantime, I’ve been uploading another repo. It has finished the initial upload, and it’s only about 45 GB of data, so I thought I’d try the suggestions above with this one: it’d take a lot less time, but I’d still be able to observe whether restic uses a lot of CPU, as fd0 suggested, or whether increasing b2.connections improves performance, as moritzdietz suggested.
Here’s the output:
using parent snapshot abcdefg
scan [/some/dir]
scanned 728 directories, 11701 files in 0:00
[0:10] 100.00% 4.251 GiB/s 42.513 GiB / 42.513 GiB 12434 / 12434 items 0 errors ETA 0:00
duration: 0:10, 4135.88MiB/s
snapshot 411b2ea4 saved
password is correct
unable to create lock in backend: repository is already locked by PID 27112 on user by user (UID 1000, GID 1000)
lock was created at 2018-02-17 11:08:42 (26h58m20.196785292s ago)
storage ID 224afba4
10 seconds! I’m still using 0.8.1, and a script identical to the one above (only paths and repo details change). I didn’t increase b2.connections, and obviously didn’t have the chance to monitor CPU usage!
Oh, nice! That’s more in line with my expectations.
What’s different in the two repos (besides the data size)? Did the previous run of restic (which took so long) print a line using parent snapshot? If not, I think I know what’s going on…
FYI, stale locks can be removed with restic unlock.
which I guess is much more in line with your expectations!
The only difference between the repos I can think of is the medium the original data is stored on. To sum it up:
repo A: 943 GB, 13 hours, external USB disk. My mobo only has USB 2.0 connectors, and I connected the disk via the case’s front-panel USB port, which might be even slower, so I’ll try again with a connector at the back (directly on the mobo).
Unfortunately I didn’t take note of whether that line was there or not, fd0. I’ll give it another go (connected to the back-panel USB this time) and report back!
Considering the symptoms only show for the repo where you back up from a USB drive, I’d be inclined to think it’s related to that.
If it was my gear I’d probably take the disk from the USB enclosure out and put it on a SATA connection, and then run the test again to see how it fares.
OK, another suggestion then: do the other two tests (start with B, the 42 GB one) the same way as before, but with the source on the USB disk instead (the same one you back up repo A from). See how involving the USB disk affects the numbers for that same test. It will probably take much longer.
EDIT: I mean with the same source files, having copied them to the disk first. To make the test as similar as possible.
So, if it takes half an hour for 42 GB, then 943 GB would take about 11 or so hours… So I guess the disk is the problem. That, or the fact that this disk is encrypted using VeraCrypt.
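The extrapolation above is just a linear scale-up of the observed time (my own framing of the poster’s arithmetic, assuming read speed is the bottleneck and scales linearly with data size):

```python
# Linear extrapolation: ~30 minutes observed for 42 GB, scaled to 943 GB.
small_gb, small_minutes = 42, 30
big_gb = 943
est_hours = big_gb / small_gb * small_minutes / 60
print(f"~{est_hours:.1f} hours")  # about 11.2 hours
```

That lands close to the 13 hours observed for repo A, which is why the slow USB read path looks like a plausible culprit.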
Other than that, it surprised me that backing up the same data, to the same destination, but from a different path, didn’t use a parent snapshot (no using parent snapshot abc... line was printed). Actually, when the backup finished, it printed two separate lists of snapshots, like so:
snapshots for (host [username], paths [/original/path/mydata]):
keep 3 snapshots:
ID Date Host Tags Directory
----------------------------------------------------------------------
abcdefg 2018-02-16 15:13:55 username /original/path/mydata
oiuhjui 2018-02-17 14:11:12 username /original/path/mydata
mnjhuio 2018-02-17 14:12:44 username /original/path/mydata
----------------------------------------------------------------------
3 snapshots
snapshots for (host [username], paths [/usbdisk/mydata]):
keep 1 snapshots:
ID Date Host Tags Directory
----------------------------------------------------------------------
sdfredf 2018-01-12 19:03:02 pc /usbdisk/mydata
----------------------------------------------------------------------
1 snapshots
I wouldn’t have expected the data to be treated differently, and I guess it isn’t: restic is just reporting that it came from a different source, while still doing all the deduplication work etc., right?
The source paths are different, so since restic cannot find a parent snapshot for the same path, it doesn’t use one. As you say, though, deduplication still works.
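A toy sketch of why the source path doesn’t matter for deduplication. This uses naive fixed-size chunks and SHA-256 purely as an illustration (restic actually uses content-defined chunking, and the paths and contents here are made up): chunks are addressed by their content hash, so identical data under different paths maps to identical chunk IDs.

```python
import hashlib

def chunk_ids(data: bytes, chunk_size: int = 4) -> set:
    """Split data into fixed-size chunks and return their content hashes."""
    return {
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    }

# The same file content under two different (hypothetical) paths...
files = {
    "/original/path/mydata/file": b"hello world!",
    "/usbdisk/mydata/file": b"hello world!",
}

ids = [chunk_ids(content) for content in files.values()]
# ...produces identical chunk IDs, so the data is stored only once.
print(ids[0] == ids[1])  # True
```

The parent snapshot is a separate mechanism: it only speeds up deciding *which files to read* by comparing metadata against the previous snapshot for the same path, which is why a new path gets no parent but still deduplicates.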