-o b2.connections=N seems to have no impact


#1

Hi,

(latest restic: 0.9.3, on debian 9, sufficient RAM and disk space)

as I’m in EU and ping delay to backblaze is about 180ms, I need concurrent connections.
Having tried settings of 10, 50, 100 and 500, I see no difference: the number of connections actually open to Backblaze (as reported by netstat -an | grep EST | grep 443, after filtering out the non-Backblaze connections) is mostly 3, and sometimes goes to 5, irrespective of -o b2.connections=...

Also (but that is not the issue, my question is about b2.connections), my upload speed is nowhere near my potential upload speed: 30Mbps instead of 300Mbps (this is a regular consumer fibre connection). Again, irrespective of the setting -o b2.connections=...

I confirmed, through a debug build (and its log), that the option is passed through correctly; there I see:
2018/12/27 10:24:14 b2/b2.go:44 b2.Open 1 cfg b2.Config{AccountID:"********", Key:"**********", Bucket:"******", Prefix:"", Connections:0x32}

0x32 = 50 decimal, when I set -o b2.connections=50

I know that one can have multiple logical connections over the same open socket. I am not sure if the connection to backblaze can make use of that. But still, that is not my issue. This question is not about upload speed. (although that is what I want in the end, but let’s go one step at a time here)

My question:

  • what is the relation between b2.connections and the number of open TCP connections to backblaze? There seems to be none.
  • or is there a bug?

#2

What is the write speed of the disk where temporary packs are stored?

My thought is that restic probably won’t hold a connection open if there isn’t a pack waiting to be saved. If it can’t assemble packs fast enough due to poor disk I/O performance, there is no reason for it to hold idle connections open.

Do you see similar backup speeds when backing up to a local repository on the same disk as your temporary directory?


#3

The local scratch disk is on NVMe (>2GB/sec), so no bandwidth problems there. The data store I am backing up handles 200MB/sec at ease, and shows no significant wait time when using tools like atop. And I have loads of RAM and CPU available.

Just did a test to a local repository, from the data disk to my scratch disk: processed 553 files, 2.425 GiB in 0:32. Fast enough for me.

Disk speeds are not the issue, and neither are other local resources here.

I have another idea:
the files that are uploaded are about 5MB in size each. That is very small. In my situation, for one file, the latency induced by the HTTPS connection setup is bigger than the transfer time of the file itself.
But I understand that the file size is actively kept below a maximum of 8MB, hence the average of 4-5MB, and changing that limit looks like a major effort. But I may be wrong here.
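To see why setup latency dominates for packs this small, here is a back-of-the-envelope calculation using the numbers from this thread (180ms RTT, ~5MB packs, 300Mbps upstream); the assumption of roughly 4 round trips per fresh HTTPS upload (TCP + TLS handshakes plus request/response) is my own rough estimate, not a measured value:

```go
package main

import "fmt"

func main() {
	const rtt = 0.180         // seconds, measured RTT to Backblaze (from this thread)
	const packBytes = 5e6     // ~5 MB average pack size
	const upstreamBps = 300e6 // 300 Mbps line, in bits per second

	// Rough assumption: ~4 round trips per fresh HTTPS upload
	// (TCP handshake, TLS handshake, request/response).
	setup := 4 * rtt
	transfer := packBytes * 8 / upstreamBps

	fmt.Printf("setup ≈ %.2fs, transfer ≈ %.2fs\n", setup, transfer)
}
```

Under these assumptions the connection setup (~0.72s) takes several times longer than actually sending the pack (~0.13s), which is exactly why many concurrent connections, not a faster link, are what this workload needs.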

Maybe restic handles only X files at a time? That would explain the low number of open connections and the “horrible” throughput.

If my suspicions are right, how can I increase the number of files handled by restic? Or would increasing the max file size limit be possible? Or should I move to the rclone backend, and let it handle B2?


#4

I’ve seen some suspicion that restic doesn’t respond well to a connection count increase because it’s hard coded to only read 2 files at a time. If your data store handles concurrent reads well, you might try adjusting o.FileReadConcurrency. As @cdhowie suggests, though, it needs to create a full pack before uploading it to the object store. Have you checked for other bottlenecks? Is the backup CPU or memory bound (swap is slow, really really slow)?


#5

Will try o.FileReadConcurrency.
Just tried rclone (through the crypt remote), blazing fast with --transfers 32. Fills up my link.

No swap on my side, loads of RAM on that box, as I stated before. And as rclone shows: also no problems on backblaze’s side.


#6

You’re right, at the moment the backup operation is limited to at most two concurrent file uploads to B2. You can restrict this further with the command-line parameter -o b2.connections, but you cannot force the number of connections above two. It’s an architectural limitation: the archiver (which reads at most two files concurrently) is also responsible for uploading data. Eventually we’ll split this out into its own worker pool, but that’s not done yet.
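The planned split described above could be sketched roughly like this; the names, channel layout, and pool size are illustrative and are not restic’s actual code:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// pack stands in for an assembled pack file that is ready for upload.
type pack struct{ id int }

// uploadAll drains n packs through a pool of `workers` uploader
// goroutines and returns how many packs were uploaded. The point is
// that upload parallelism no longer depends on how many file-reader
// goroutines produce the packs.
func uploadAll(n, workers int) int {
	packs := make(chan pack)
	var done int64
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range packs {
				// In a real backend this would be the B2 upload call.
				atomic.AddInt64(&done, 1)
			}
		}()
	}

	// Producers (the file readers) hand finished packs to the pool.
	for i := 0; i < n; i++ {
		packs <- pack{id: i}
	}
	close(packs)
	wg.Wait()
	return int(done)
}

func main() {
	fmt.Println(uploadAll(20, 8))
}
```

With a dedicated pool like this, the uploader count (and hence the number of open connections) becomes an independent tuning knob instead of being tied to the two file readers.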

The pack size of about 4-5MiB was originally selected as a compromise for local/sftp storage; the cloud-based backends were only added afterwards. At some point we’ll probably exchange the static settings for a dynamic approach. You can play around with the constants in the source code if you like.


#7

Why does the parameter exist if more than 2 connections are not possible? B2’s own documentation even gives an example of setting it to 10.

I arrived here because my upload to B2 is only using a couple of connections, at an overall transfer speed of 8-10 Mbps. My upload link is 100 Mbps, and I’ve tested that it is capable of that.


#8

I don’t know why, but I’ll take an educated guess: Why not?

It was probably a control offered by the library used when B2 support was added, and although today restic doesn’t feed B2 with more than two files at a time, that could change.

Personally I added some space to a local server, have restic clients all upload there, and then rclone mirrors the content out to B2 utilizing the full bandwidth of two connections.


#9

You got it almost right: The setting -o b2.connections=N is there to limit the number of connections, it’s just an upper bound. When we introduced it, restic used way too much concurrency, so for some users it was best to limit the number of outgoing connections in order to not congest the (small) upstream bandwidth. At the moment, the actual upload is done by the file worker threads in the archiver code, of which there are two. The long-term plan is to decouple the upload process from the file reading process, so we can have more uploads going on.
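The “upper bound, not a floor” behaviour described here can be demonstrated with a small semaphore sketch (a buffered channel, which is my own illustration, not restic’s implementation): with only 2 producers, a limit of 50 never yields more than 2 concurrent uploads.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// maxInFlight runs `producers` goroutines that each perform
// `uploadsEach` uploads guarded by a semaphore of size `limit`,
// and returns the peak number of simultaneous uploads observed.
func maxInFlight(producers, limit, uploadsEach int) int32 {
	sem := make(chan struct{}, limit)
	var active, peak int32
	var wg sync.WaitGroup

	for p := 0; p < producers; p++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < uploadsEach; i++ {
				sem <- struct{}{} // acquire a connection slot
				n := atomic.AddInt32(&active, 1)
				for { // record the peak concurrency seen
					old := atomic.LoadInt32(&peak)
					if n <= old || atomic.CompareAndSwapInt32(&peak, old, n) {
						break
					}
				}
				time.Sleep(time.Millisecond) // simulated upload
				atomic.AddInt32(&active, -1)
				<-sem // release the slot
			}
		}()
	}
	wg.Wait()
	return atomic.LoadInt32(&peak)
}

func main() {
	// 2 producers, limit 50: peak is at most 2, the cap never helps.
	fmt.Println(maxInFlight(2, 50, 10))
}
```

Concurrency is always min(producers, limit), which is why raising b2.connections past 2 changes nothing today, while lowering it below 2 still works as a throttle.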


#10

That would be great @fd0! These days managing bandwidth is seldom the issue, but managing latency is. We’re mostly using inter-cloud network bandwidth and even my home upstream bandwidth is 1Gbit. With current restic parallelism and options, I can’t even saturate 1% of my home upstream bandwidth to high-latency repos :dizzy_face:.

And if I move back to NZ, where they are getting ready for 10Gbit consumer connections? As you’d guess, the latency from NZ to anywhere in the world is huge, so restic’s max-parallelism speed from NZ to B2 would be ~0.1% of available bandwidth :slight_smile:

It would be great to have configuration options to re-enable some of that original parallelism, even if the default options stay conservative. An important modernisation.