Restore using rclone GDrive backend is slow


#1

rclone v1.44
restic 0.9.3

Running on Debian 9

I have a 150/150 FTTP connection and I’m ~5ms away from the nearest Google CDN

Doing a “standard” upload of a single 5 GB file to my Google Drive with rclone runs at about 120 Mbps, while a standard download maxes out my connection.

And I remember that an upload using restic+rclone wasn’t consistent, but was often over 100 Mbps (using RCLONE_BWLIMIT=10M to avoid GDrive limits).

But when I try to restore using restic, the speed is very slow, often not going above 30 Mbps.

Any suggestions?

Thanks


#2

Yep: We’ve only recently (after the 0.9.3 release) merged the new concurrent restore code by @ifedorenko. It’ll be in the next release. If you want, you can try it out though: either compile the code in the master branch yourself, or use one of the beta builds.

The restore code in 0.9.3 (and before) is single-threaded and rather dumb.


#3

Built Restic from source: restic 0.9.3 (v0.9.3-53-g920727dd) compiled with go1.10.4 on linux/amd64

Copying a 5 GB file from GDrive to my server using rclone is fast:

Transferred: 5G / 5 GBytes, 100%, 19.822 MBytes/s, ETA 0s

Restic restoring a single 3 GB file from GDrive to my server is slower, and is “bursty”; watching the restore of a file in real time shows only about 2-4 MB/s. But restoring an entire directory is faster (about 50-100 Mbps).

Does restoring a single file only use one thread?

I mean, this is my second backup solution, for the unlikely event that something really bad happens to my main NAS and my offsite backup drive(s) at the same time. But a good backup should be easy to restore, which is why I’m testing now.


#4

It wouldn’t surprise me if this was a limitation in Google Drive… :thinking: but I am not sure.


#5

I know the IOPS of Google Drive is quite limited (it can only create 2-3 objects per second), so I wouldn’t be surprised if there were also limits on accessing a large number of objects. I know that for sequential transfers people have gotten over a gigabit on servers with 10 Gbps WAN connections.


#6

It’s likely that the limitation is indeed in GDrive. The restic repository consists of many smallish files (around 4-16 MiB in size), which need to be accessed individually. That’s quite different from downloading a single large file.
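The request-count difference is easy to put a rough number on. A back-of-the-envelope sketch (the 100 ms per-request setup cost is an assumed figure for illustration, not a measurement):

```python
# Rough estimate of per-request overhead when a restore fetches many
# small pack files instead of one large object.

PACK_SIZE = 4 * 1024**2   # 4 MiB, the low end of restic's pack size
FILE_SIZE = 5 * 1024**3   # the 5 GiB test file from the posts above
LATENCY = 0.1             # assumed per-request setup cost, in seconds

requests = FILE_SIZE // PACK_SIZE   # number of pack downloads needed
overhead = requests * LATENCY       # latency paid if fetched serially

print(requests)   # 1280 pack downloads for one 5 GiB file
print(overhead)   # 128.0 seconds of pure latency, one request at a time
```

Fetching those packs concurrently hides most of that latency, which is consistent with whole-directory restores being much faster than single-file restores here.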


#7

restic single-file restore is indeed single-threaded, but multiple files are restored on multiple threads, up to 8, if I am not mistaken. I have some ideas on how to parallelize single-file restore, but I want the multi-file case stabilized first (I also think non-blocking prune is a more important use case than single-file restore).
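In rough pseudocode terms, the multi-file path behaves like a bounded worker pool over pack downloads. A minimal sketch of that pattern (the 8-worker cap mirrors the figure above; `download_pack` is a hypothetical placeholder, not restic's actual internals):

```python
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 8  # assumed cap, mirroring "up to 8 threads" above

def download_pack(pack_id):
    # Placeholder for fetching one pack file from the backend.
    return f"data-for-{pack_id}"

def restore_packs(pack_ids):
    # Packs belonging to different files can be fetched concurrently;
    # a single file's packs get no such parallelism in the old code.
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        return list(pool.map(download_pack, pack_ids))

print(restore_packs(["p1", "p2", "p3"]))
```

This is why restoring a directory tree can saturate the link while a single large file stays slow: only the multi-file case keeps several downloads in flight at once.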

I am not 100% sure why restic single-file restore is slower than rclone, but my guess is that it’s because restic stores data in the repository as ~4 MB “data pack files”, which means more download overhead compared to a single 5 GB download.


#8

Just what I suspected. Thanks for the insight!