So I backed up 943 GB of data to B2. This took several weeks on my 5 Mbit/s upload line. Once it finished, I immediately ran a backup again, with no changes to the data whatsoever, just to find out how long it takes restic to work out whether there are any differences to upload. It turned out to take just short of 13 hours on my i3 (3.1 GHz) CPU.
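As a rough sanity check on the "several weeks" figure, here is the back-of-the-envelope arithmetic, assuming decimal gigabytes and a fully saturated 5 Mbit/s uplink with no protocol overhead (both assumptions are mine, not stated in the thread):

```python
# Estimate the minimum time for the initial 943 GB upload at 5 Mbit/s.
size_bytes = 943 * 10**9      # 943 GB, decimal units assumed
uplink_bps = 5 * 10**6        # 5 Mbit/s uplink
seconds = size_bytes * 8 / uplink_bps
days = seconds / 86400
print(f"{days:.1f} days")     # about 17.5 days, i.e. two and a half weeks
```

Real-world overhead (TLS, HTTP, retries, the uplink not being idle) only pushes this higher, so "several weeks" is consistent.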
The number of concurrent connections to the B2 service can be set with the -o b2.connections=10 option. By default, at most five parallel connections are established.
Well, since it’s not actually uploading anything, merely checking for diffs, do you think changing this parameter would significantly speed things up?
Could you please tell us which version of restic you’re using (run restic version and paste the output here) and how exactly you run restic? Please post the complete command line, especially the list of files/dirs to save. Then we can try to figure out what’s going on here. Also, which version of restic did you first create the repo with?
Since 0.8.0, restic has a local metadata cache which makes incremental backups very fast, provided a few preconditions are met.
Hm, that sounds reasonable. There’s no easy way to debug what’s going on (yet, working on that), but can you have a look at what restic does when it takes so long? Does it use a lot of CPU maybe?
I see that there is a conservative default of 5, the example in the documentation is 10, but I’ve seen users reporting setting this value as high as 500.
What influences the optimal value for this parameter? Is it just guess and check? Is there a practical upper bound?
There is a practical upper bound of about 20 or so, because that’s the maximum concurrency that is used during backup. In the new archiver code (I’m currently working on that) it will be changed.
In the meantime, I’ve been uploading another repo. It has finished the initial upload, and it’s only about 45 GB of data, so I thought I’d try the suggestions above with this one: it’d take a lot less time, but I’d still be able to observe whether restic uses a lot of CPU, as fd0 suggested, or whether increasing b2.connections improves performance, as moritzdietz suggested.
Here’s the output:
using parent snapshot abcdefg
scan [/some/dir]
scanned 728 directories, 11701 files in 0:00
[0:10] 100.00% 4.251 GiB/s 42.513 GiB / 42.513 GiB 12434 / 12434 items 0 errors ETA 0:00
duration: 0:10, 4135.88MiB/s
snapshot 411b2ea4 saved
password is correct
unable to create lock in backend: repository is already locked by PID 27112 on user by user (UID 1000, GID 1000)
lock was created at 2018-02-17 11:08:42 (26h58m20.196785292s ago)
storage ID 224afba4
10 seconds! I’m still using 0.8.1, and a script identical to the one above (only paths and repo details change). I didn’t increase b2.connections, and obviously didn’t have the chance to monitor CPU usage!
Oh, nice! That’s more in line with my expectations.
What’s different in the two repos (besides the data size)? Did the previous run of restic (which took so long) print a line using parent snapshot? If not, I think I know what’s going on…
FYI, stale locks can be removed with restic unlock.
which I guess is much more in line with your expectations!
The only difference between the repos I can think of is the medium the original data is stored on. To sum it up:
repo A: 943 GB, 13 hours, external USB disk. My mobo only has USB 2.0 connectors, and I connected the disk via the case’s front-panel USB port, which might be even slower, so I’ll try again with a connector at the back (directly on the mobo).
Unfortunately I didn’t take note of whether that line was there or not, fd0. I’ll give it another go (connected to the back-panel USB this time) and report back!
Considering the symptoms only show for the repo where you back up from a USB drive, I’d be inclined to think it’s related to that.
If it was my gear I’d probably take the disk from the USB enclosure out and put it on a SATA connection, and then run the test again to see how it fares.
OK, another suggestion then: do the other two tests (start with B, the 42 GB one) the same way as before, but with the source on the USB disk instead (the same one you back up repo A from). See how involving the USB disk affects the numbers for that same test. It will probably take much longer.
EDIT: I mean with the same source files, having copied them to the disk first. To make the test as similar as possible.
So, if it takes half an hour for 42 GB, then 943 GB would take about 11 or so hours… So I guess the disk is the problem. That, or the fact that this disk is encrypted using VeraCrypt.
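The extrapolation above is just a linear scale-up of the observed time (my own framing of the poster’s arithmetic, assuming read speed is the bottleneck and scales linearly with data size):

```python
# Linear extrapolation: ~30 minutes observed for 42 GB, scaled to 943 GB.
small_gb, small_minutes = 42, 30
big_gb = 943
est_hours = big_gb / small_gb * small_minutes / 60
print(f"~{est_hours:.1f} hours")  # about 11.2 hours
```

That lands close to the 13 hours observed for repo A, which is why the slow USB read path looks like a plausible culprit.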
Other than that, it surprised me that backing up the same data, to the same destination, but from a different path, didn’t use a parent snapshot (no using parent snapshot abc... line was printed). Actually, when the backup finished, it printed two separate lists of snapshots, like so:
snapshots for (host [username], paths [/original/path/mydata]):
keep 3 snapshots:
ID Date Host Tags Directory
----------------------------------------------------------------------
abcdefg 2018-02-16 15:13:55 username /original/path/mydata
oiuhjui 2018-02-17 14:11:12 username /original/path/mydata
mnjhuio 2018-02-17 14:12:44 username /original/path/mydata
----------------------------------------------------------------------
3 snapshots
snapshots for (host [username], paths [/usbdisk/mydata]):
keep 1 snapshots:
ID Date Host Tags Directory
----------------------------------------------------------------------
sdfredf 2018-01-12 19:03:02 pc /usbdisk/mydata
----------------------------------------------------------------------
1 snapshots
I wouldn’t have expected the data to be treated differently, and I guess it isn’t: restic is just reporting that it came from a different source, while still doing all the deduplication work etc., right?
The source paths are different, so since restic cannot find a parent snapshot for the same path, it doesn’t use one. As you say, though, deduplication still works.
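A toy sketch of why the source path doesn’t matter for deduplication. This uses naive fixed-size chunks and SHA-256 purely as an illustration (restic actually uses content-defined chunking, and the paths and contents here are made up): chunks are addressed by their content hash, so identical data under different paths maps to identical chunk IDs.

```python
import hashlib

def chunk_ids(data: bytes, chunk_size: int = 4) -> set:
    """Split data into fixed-size chunks and return their content hashes."""
    return {
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    }

# The same file content under two different (hypothetical) paths...
files = {
    "/original/path/mydata/file": b"hello world!",
    "/usbdisk/mydata/file": b"hello world!",
}

ids = [chunk_ids(content) for content in files.values()]
# ...produces identical chunk IDs, so the data is stored only once.
print(ids[0] == ids[1])  # True
```

The parent snapshot is a separate mechanism: it only speeds up deciding *which files to read* by comparing metadata against the previous snapshot for the same path, which is why a new path gets no parent but still deduplicates.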