-o b2.connections=N seems to have no impact


#1

Hi,

(latest restic: 0.9.3, on debian 9, sufficient RAM and disk space)

As I’m in the EU and the round-trip time to Backblaze is about 180ms, I need concurrent connections.
Having tried settings of 10, 50, 100 and 500, I see no difference: the number of connections actually open to Backblaze (as reported by netstat -an | grep EST | grep 443, after filtering out the non-Backblaze connections) is mostly 3 and sometimes goes up to 5, irrespective of -o b2.connections=...
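For anyone who wants to repeat this count without eyeballing grep output, here is a minimal sketch. It assumes the usual Linux `netstat -an` column layout (protocol, two queue columns, local address, foreign address, state). The function name, the sample text and the address prefixes are made up for illustration; the prefixes are documentation ranges, not Backblaze’s real addresses.

```python
def count_established(netstat_output, remote_prefixes, port=443):
    """Count ESTABLISHED TCP connections to the given remote prefixes and port."""
    prefixes = tuple(remote_prefixes)
    count = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, UDP and UNIX socket lines.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        remote, state = fields[4], fields[5]
        host, _, remote_port = remote.rpartition(":")
        if (state == "ESTABLISHED"
                and remote_port == str(port)
                and host.startswith(prefixes)):
            count += 1
    return count


# Made-up sample output; all addresses are placeholders.
SAMPLE = """\
tcp        0      0 192.0.2.10:51234        203.0.113.7:443         ESTABLISHED
tcp        0      0 192.0.2.10:51236        203.0.113.8:443         ESTABLISHED
tcp        0      0 192.0.2.10:51240        198.51.100.4:443        ESTABLISHED
tcp        0      0 192.0.2.10:443          198.51.100.9:40022      ESTABLISHED
tcp        0      0 192.0.2.10:51250        203.0.113.7:443         TIME_WAIT
"""

if __name__ == "__main__":
    print(count_established(SAMPLE, ["203.0.113."]))
```

Unlike a plain `grep 443`, this checks the remote port only, so a local service listening on 443 or a port such as 44300 is not counted by accident.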

Also (but that is not the issue, my question is about b2.connections), my upload speed is nowhere near my potential upload speed: 30Mbps instead of 300Mbps (this is a regular consumer fibre connection). Again, irrespective of the setting -o b2.connections=...

I confirmed that the option is passed correctly by using a debug build, whose log shows:
2018/12/27 10:24:14 b2/b2.go:44 b2.Open 1 cfg b2.Config{AccountID:"********", Key:"**********", Bucket:"******", Prefix:"", Connections:0x32}

0x32 is 50 in decimal, which matches my setting of -o b2.connections=50.

I know that one can have multiple logical connections over the same open socket. I am not sure if the connection to Backblaze can make use of that. But still, that is not my issue. This question is not about upload speed (that is what I want in the end, but let’s go one step at a time here).

My question:

  • what is the relation between b2.connections and the number of open TCP connections to backblaze? There seems to be none.
  • or is there a bug?

#2

What is the write speed of the disk where temporary packs are stored?

My thought is that restic probably won’t hold a connection open if there isn’t a pack waiting to be saved. If it can’t assemble packs fast enough due to poor disk I/O performance, there is no reason for it to hold idle connections open.

Do you see similar backup speeds when backing up to a local repository on the same disk as your temporary directory?


#3

The local scratch disk is on NVMe (>2GB/sec), so no bandwidth problems there. The data store I am backing up handles 200MB/sec with ease, and shows no significant wait time in tools like atop. And I have loads of RAM and CPU available.

Just did a test to a local repository, from the data disk to my scratch disk: processed 553 files, 2.425 GiB in 0:32. Fast enough for me.

Disk speeds are not the issue, and neither are other local resources here.

I have another idea:
The files that are uploaded are about 5MB in size each. That is very small. In my situation, for one file, the latency induced by the HTTPS connection setup is BIGGER than the time it takes to transfer the file itself.
But I understand that the file size is actively maintained to be max 8MB. Hence the average of 4-5MB. And changing that limit looks like a major effort. But I may be wrong here.
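The latency argument above can be put into rough numbers. The sketch below is a back-of-envelope model, not a measurement. The 180ms round trip, the roughly 5MB file size and the 300Mbps link come from this thread. The count of three round trips for TCP plus TLS setup is an assumption, and the model supposes that every upload opens a fresh connection, ignoring TCP slow start and connection reuse.

```python
def transfer_seconds(file_mb, link_mbps):
    """Time to push one file at full link speed (decimal MB, megabits/s)."""
    return file_mb * 8 / link_mbps


def setup_seconds(rtt_s, round_trips=3):
    """Connection setup cost; round_trips=3 is an assumed TCP + TLS handshake."""
    return rtt_s * round_trips


def effective_mbps(file_mb, link_mbps, rtt_s, parallel=1, round_trips=3):
    """Estimated throughput when every file pays the full setup cost."""
    per_file = setup_seconds(rtt_s, round_trips) + transfer_seconds(file_mb, link_mbps)
    # Parallel uploads overlap their waiting time, but never exceed the link.
    return min(link_mbps, parallel * file_mb * 8 / per_file)


if __name__ == "__main__":
    for n in (1, 2, 8, 32):
        print(n, round(effective_mbps(5, 300, 0.18, parallel=n), 1))
```

Under these assumptions the transfer itself takes about 0.13s while setup takes about 0.54s, so the estimate scales with the number of parallel uploads until the link is full.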

Maybe restic handles only X files at a time? That would explain the low number of open connections and the “horrible” throughput.

If my suspicions are right, how can I increase the number of files handled by restic? Or would increasing the max file size limit be possible? Or should I move to the rclone backend, and let it handle B2?


#4

I’ve seen some suspicion that restic doesn’t respond well to a connection count increase because it’s hard-coded to read only 2 files at a time. If your data store handles concurrent reads well, you might try adjusting o.FileReadConcurrency. As @cdhowie suggests, though, it needs to create a full pack before uploading it to the object store. Have you checked for other bottlenecks? Is the backup CPU or memory bound (swap is slow, really really slow)?


#5

Will try o.FileReadConcurrency.
Just tried rclone (through the crypt remote), blazing fast with --transfers 32. Fills up my link.

No swap on my side, loads of RAM on that box, as I stated before. And as rclone shows: also no problems on backblaze’s side.


#6

You’re right: at the moment the backup operation is limited to at most two concurrent file uploads to B2. You can lower this with the command-line parameter -o b2.connections, but you cannot raise the number of connections above two. This is an architectural limitation: the archiver, which reads at most two files concurrently, is also responsible for uploading data. Eventually we’ll split uploading out into its own worker pool, but that’s not done yet.
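To illustrate why a larger connection limit has no effect here, this is a toy model in Python, not restic’s actual code. It assumes the behaviour described above: a fixed number of workers each read and then upload, and the connection limit is a semaphore the workers must acquire. All names (`peak_uploads`, `workers`, `connections`, `packs`) are hypothetical.

```python
import queue
import threading
import time


def peak_uploads(workers, connections, packs=20, upload_seconds=0.01):
    """Return the highest number of simulated uploads that ran at once."""
    slots = threading.Semaphore(connections)  # stands in for b2.connections
    lock = threading.Lock()
    active = 0
    peak = 0
    todo = queue.Queue()
    for i in range(packs):
        todo.put(i)

    def worker():
        nonlocal active, peak
        while True:
            try:
                todo.get_nowait()
            except queue.Empty:
                return
            # The same worker that picked up the pack also uploads it.
            with slots:
                with lock:
                    active += 1
                    peak = max(peak, active)
                time.sleep(upload_seconds)  # simulated upload
                with lock:
                    active -= 1

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak


if __name__ == "__main__":
    for limit in (1, 2, 50, 500):
        print(limit, peak_uploads(workers=2, connections=limit))
```

With two workers, the peak never exceeds two no matter how large the limit is; setting the limit to 1 does lower it. A separate upload pool would decouple the two numbers.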

The file size of about 4-5MiB for each file was originally selected as a compromise for local/sftp storage; the cloud-based backends were added only afterwards. At some point we’ll probably replace the static settings with a dynamic approach. You can play around with the constants in the source code if you like.