Slow recovery speed

Hi, I’m using restic to perform block device backups.

I run the following command to perform the backup:

cat /dev/drbd/by-res/backup-one-vm-8947-disk-2/0 | \
  restic backup --tag=one-vm-8947-disk-2 --stdin --stdin-filename /dev/drbd/by-res/one-vm-8947-disk-2/0

Right now the repository already uses 37TB across 1299 snapshots, stored on a software ZFS dRAID of 45x6TB disks.

Backup speed is pretty good and fully utilizes the 10Gb/s line:

https://asciinema.org/a/gg8zH0WtQwcKyR13bR3gj7jqf

However, recovery from the backup is quite slow, even when running locally:

# restic -r /data/backup --verbose dump 46a169a5 /dev/drbd/by-res/one-vm-7994-disk-2/0 | dd of=/dev/null status=progress
2960962048 bytes (3.0 GB) copied, 222.868152 s, 13.3 MB/s

and ends with:

41943719+13209 records in
41950384+0 records out
21478596608 bytes (21 GB) copied, 833.371 s, 25.8 MB/s

So the recovery speed usually does not go above 26MB/s.

CPU usage isn’t so high:

# top
top - 17:03:48 up 1 day, 21:21,  6 users,  load average: 4.29, 3.51, 3.82
Tasks: 530 total,   1 running, 380 sleeping,   0 stopped,   0 zombie
%Cpu(s): 16.5 us,  0.5 sy,  0.0 ni, 75.1 id,  7.9 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 74190032 total, 23056312 free, 47609356 used,  3524368 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 25678432 avail Mem 
# top | grep restic
14709 root      20   0   12.6g  12.3g  13320 S  77.8 17.3  11:51.01 restic                                                                            
14709 root      20   0   12.6g  12.3g  13320 S  75.8 17.3  11:53.30 restic                                                                            
14709 root      20   0   12.6g  12.3g  13320 S  71.0 17.3  11:55.45 restic                                                                            
14709 root      20   0   12.6g  12.3g  13320 S  71.2 17.3  11:57.60 restic                                                                            
14709 root      20   0   12.6g  12.3g  13320 S  63.9 17.3  11:59.53 restic                                                                            
14709 root      20   0   12.6g  12.3g  13320 S  58.9 17.3  12:01.31 restic   
# htop
  1  [|||   16.7%]   5  [       0.0%]   9  [||||||47.5%]   13 [||    17.9%]
  2  [||     4.0%]   6  [|||   14.4%]   10 [||||  33.5%]   14 [|      7.0%]
  3  [|      0.5%]   7  [||||||51.0%]   11 [||||  25.4%]   15 [       0.0%]
  4  [       0.0%]   8  [||    12.9%]   12 [||||||45.0%]   16 [||     2.0%]
  Mem[|||||||||||||||||||49.0G/70.8G]   Tasks: 195, 864 thr; 5 running
  Swp[                         0K/0K]   Load average: 4.30 3.60 3.84 
                                        Uptime: 1 day, 21:22:17

Memory is sufficient:

# free -h
             total        used        free      shared  buff/cache   available
Mem:            70G         49G         19G        244M        2.0G         20G
Swap:            0B          0B          0B

Disk utilization fluctuates between 0 and 17%:

# iostat -x 2
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           6.54    0.00    2.01    0.91    0.00   90.53

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    9.00    0.50   170.00     8.00    37.47     0.06    5.74    6.06    0.00   5.79   5.50
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdas              0.00     0.00    5.50    0.00   484.00     0.00   176.00     0.06   10.45   10.45    0.00   7.55   4.15
sdt               0.00     0.00    6.00    0.00   504.00     0.00   168.00     0.04    7.25    7.25    0.00   5.42   3.25
sdaj              0.00     0.00    4.50    0.00   342.00     0.00   152.00     0.04    8.44    8.44    0.00   7.67   3.45
sdaa              0.00     0.00    5.00    0.00   404.00     0.00   161.60     0.04    7.30    7.30    0.00   6.50   3.25
sdar              0.00     0.00    0.50    0.00    42.00     0.00   168.00     0.00    2.00    2.00    0.00   2.00   0.10
sdac              0.00     0.00    5.50    0.00   462.00     0.00   168.00     0.03    5.18    5.18    0.00   4.00   2.20
sdp               0.00     0.00    0.50    0.00    44.00     0.00   176.00     0.00    0.00    0.00    0.00   0.00   0.00
sdy               0.00     0.00    4.00    0.00   336.00     0.00   168.00     0.02    4.75    4.75    0.00   4.62   1.85
sdat              0.00     0.00    4.00    0.00   352.00     0.00   176.00     0.02    4.25    4.25    0.00   4.38   1.75
sdw               0.00     0.00    4.50    0.00   360.00     0.00   160.00     0.05   10.22   10.22    0.00   9.33   4.20
sdi               0.00     0.00    0.50    0.00    44.00     0.00   176.00     0.00    7.00    7.00    0.00   7.00   0.35
sdc               0.00     0.00    5.50    0.00   462.00     0.00   168.00     0.05    9.36    9.36    0.00   5.73   3.15
sdn               0.00     0.00    4.00    0.00   352.00     0.00   176.00     0.02    5.12    5.12    0.00   5.00   2.00
sdal              0.00     0.00    5.50    0.00   462.00     0.00   168.00     0.04    6.45    6.45    0.00   5.45   3.00
sdx               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdah              0.00     0.00    4.50    0.00   378.00     0.00   168.00     0.02    3.89    3.89    0.00   3.78   1.70
sdag              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdl               0.00     0.00    4.00    0.00   336.00     0.00   168.00     0.01    3.38    3.38    0.00   3.25   1.30
sdu               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sde               0.00     0.00    6.00    0.00   528.00     0.00   176.00     0.06    9.58    9.58    0.00   6.08   3.65
sdad              0.00     0.00    6.50    0.00   402.00     0.00   123.69     0.03    4.69    4.69    0.00   4.62   3.00
sdq               0.00     0.00    4.50    0.00   354.00     0.00   157.33     0.03    6.00    6.00    0.00   6.00   2.70
sdam              0.00     0.00    5.50    0.00   462.00     0.00   168.00     0.09   15.27   15.27    0.00   9.45   5.20
sdr               0.00     0.00    4.50    0.00   360.00     0.00   160.00     0.03    6.78    6.78    0.00   6.11   2.75
sdap              0.00     0.00    4.50    0.00   378.00     0.00   168.00     0.02    4.33    4.33    0.00   4.00   1.80
sdz               0.00     0.00    5.00    0.00   402.00     0.00   160.80     0.03    5.40    5.40    0.00   4.80   2.40
sdk               0.00     0.00    5.50    0.00   484.00     0.00   176.00     0.06   11.27   11.27    0.00   7.55   4.15
sdaq              0.00     0.00    5.50    0.00   462.00     0.00   168.00     0.04    7.55    7.55    0.00   5.64   3.10
sdau              0.00     0.00    5.00    0.00   402.00     0.00   160.80     0.04    7.70    7.70    0.00   7.00   3.50
sdh               0.00     0.00    5.50    0.00   462.00     0.00   168.00     0.07   12.18   12.18    0.00   8.36   4.60
sdae              0.00     0.00    8.00    0.00   374.00     0.00    93.50     0.08   10.62   10.62    0.00  10.31   8.25
sdo               0.00     0.00    4.00    0.00   336.00     0.00   168.00     0.02    4.25    4.25    0.00   4.12   1.65
sdg               0.00     0.00    4.50    0.00   342.00     0.00   152.00     0.04    7.89    7.89    0.00   7.56   3.40
sdak              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdao              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdai              0.00     0.00    4.50    0.00   360.00     0.00   160.00     0.02    3.78    3.78    0.00   3.00   1.35
sds               0.00     0.00    5.50    0.00   484.00     0.00   176.00     0.04    6.82    6.82    0.00   5.18   2.85
sdv               0.00     0.00    4.50    0.00   360.00     0.00   160.00     0.04    8.44    8.44    0.00   7.56   3.40
sdm               0.00     0.00    5.50    0.00   462.00     0.00   168.00     0.07   12.45   12.45    0.00   8.64   4.75
sdd               0.00     0.00    4.50    0.00   342.00     0.00   152.00     0.04    8.78    8.78    0.00   7.89   3.55
sdj               0.00     0.00    4.00    0.00   336.00     0.00   168.00     0.02    5.25    5.25    0.00   5.00   2.00
sdf               0.00     0.00    4.50    0.00   342.00     0.00   152.00     0.05   10.44   10.44    0.00   9.44   4.25
sdaf              0.00     0.00    4.50    0.00   360.00     0.00   160.00     0.04    8.78    8.78    0.00   8.22   3.70
sdan              0.00     0.00    4.00    0.00   336.00     0.00   168.00     0.02    5.75    5.75    0.00   5.88   2.35
sdab              0.00     0.00    5.50    0.00   382.00     0.00   138.91     0.02    3.73    3.73    0.00   3.45   1.90

Any ideas?

I love the system you have there. Receiving restic backups at a rate of 1 gigabyte per second, without breaking a sweat and with rest-server at not even 5% CPU, is just lovely. Can you share more details about this server, and how the ZFS is configured/set up (both hardware and settings)?

Which restic version were you running these tests on? You can get it by running restic version.

I believe this merged PR significantly improves restore performance. If you haven’t tried it yet, you can try the latest master (which includes this PR) by downloading it from beta.restic.net, unless you want to build it yourself from source.
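
Building it from source is roughly this (assuming a recent Go toolchain is installed):

git clone https://github.com/restic/restic
cd restic
go run build.go    # produces a ./restic binary in the current directory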

I use dRAID, which is not a released ZFS feature yet:

https://openzfs.github.io/openzfs-docs/Basics%20concepts/dRAID%20Howto.html

My configuration is draid2 with 3 groups, 2 distributed spares and 45 disks:

# zpool status data
  pool: data
 state: ONLINE
  scan: none requested
config:

	NAME                 STATE     READ WRITE CKSUM
	data                 ONLINE       0     0     0
	  draid2:3g:2s-0     ONLINE       0     0     0
	    sdy              ONLINE       0     0     0
	    sdam             ONLINE       0     0     0
	    sdf              ONLINE       0     0     0
	    sdau             ONLINE       0     0     0
	    sdab             ONLINE       0     0     0
	    sdo              ONLINE       0     0     0
	    sdw              ONLINE       0     0     0
	    sdak             ONLINE       0     0     0
	    sdd              ONLINE       0     0     0
	    sdas             ONLINE       0     0     0
	    sdm              ONLINE       0     0     0
	    sdu              ONLINE       0     0     0
	    sdai             ONLINE       0     0     0
	    sdaq             ONLINE       0     0     0
	    sdk              ONLINE       0     0     0
	    sds              ONLINE       0     0     0
	    sdag             ONLINE       0     0     0
	    sdi              ONLINE       0     0     0
	    sdq              ONLINE       0     0     0
	    sdae             ONLINE       0     0     0
	    sdz              ONLINE       0     0     0
	    sdan             ONLINE       0     0     0
	    sdg              ONLINE       0     0     0
	    sdac             ONLINE       0     0     0
	    sdx              ONLINE       0     0     0
	    sdal             ONLINE       0     0     0
	    sde              ONLINE       0     0     0
	    sdat             ONLINE       0     0     0
	    sdaa             ONLINE       0     0     0
	    sdn              ONLINE       0     0     0
	    sdv              ONLINE       0     0     0
	    sdaj             ONLINE       0     0     0
	    sdc              ONLINE       0     0     0
	    sdar             ONLINE       0     0     0
	    sdl              ONLINE       0     0     0
	    sdt              ONLINE       0     0     0
	    sdah             ONLINE       0     0     0
	    sdap             ONLINE       0     0     0
	    sdj              ONLINE       0     0     0
	    sdr              ONLINE       0     0     0
	    sdaf             ONLINE       0     0     0
	    sdao             ONLINE       0     0     0
	    sdh              ONLINE       0     0     0
	    sdp              ONLINE       0     0     0
	    sdad             ONLINE       0     0     0
	spares
	  s0-draid2:3g:2s-0  AVAIL   
	  s1-draid2:3g:2s-0  AVAIL   

errors: No known data errors

ZFS is tuned a bit, like this:

zfs set recordsize=1M xattr=sa atime=off data/backup
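
To double-check that the settings took effect, something like this works (assuming the dataset is data/backup as above):

# zfs get recordsize,xattr,atime data/backup
NAME         PROPERTY    VALUE  SOURCE
data/backup  recordsize  1M     local
data/backup  xattr       sa     local
data/backup  atime       off    local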

Previously I ran a large benchmark of various RAID levels and erasure-coding configurations with MinIO, and found dRAID very promising.

As for the hardware:
The server is a Fujitsu Primergy RX300 S7 with an Intel® Xeon® E5-2650L 0 @ 1.80GHz processor, 9x Samsung DDR3-1333 8GB PC3L-10600R ECC Registered (M393B1K70DH0-YH9) RAM modules, and 45x Seagate ST6000NM0115-1YZ110 6TB disks.

As for versions, I’ve tried both my current one and the latest master:

restic 0.9.6 compiled with go1.13.4 on linux/amd64
restic 0.9.6 (v0.9.6-207-g84475aa3) compiled with go1.14.3 on linux/amd64

The results are the same:
When I run the restore command, restic thinks for 5 minutes and only after that starts the recovery process; the speed is still not very high:

# /tmp/restic -r /data/backup --verbose dump 46a169a5 /dev/drbd/by-res/one-vm-7994-disk-2/0 | dd of=/dev/null bs=64k status=progress
21475073287 bytes (21 GB) copied, 896.290172 s, 24.0 MB/s
321266+13230 records in
321266+13230 records out
21478596608 bytes (21 GB) copied, 896.813 s, 23.9 MB/s

Any other clues?


@ifedorenko Ping :slight_smile: Do you have any ideas on this one?

I just found his answer in a similar topic:

So I tried to compare; indeed, the restore operation is about twice as fast as dump:

Now restic thinks for only two minutes instead of five before starting the recovery process, and finishes the recovery in 5-6 minutes:

# time /tmp/restic -r /data/backup/ restore 70db3250 --verbose --target /tmp/some
repository 94affe3c opened successfully, password is correct
restoring <Snapshot 70db3250 of [/dev/drbd/by-res/one-vm-5525-disk-0/0] at 2020-05-24 13:46:03.840012085 +0000 UTC by root@m15c45> to /tmp/some

real	6m44.219s
user	13m38.107s
sys	0m47.877s
# time /tmp/restic -r /data/backup --verbose dump 46a169a5 /dev/drbd/by-res/one-vm-7994-disk-2/0 | dd of=/dev/null bs=64k status=progress
21457734025 bytes (21 GB) copied, 779.736037 s, 27.5 MB/s
321266+13230 records in
321266+13230 records out
21478596608 bytes (21 GB) copied, 780.566 s, 27.5 MB/s

real	13m0.569s
user	14m6.415s
sys	0m56.296s

But my sample is only 21G, and still not all of the node’s resources are utilized, both in the first stage (before the actual copying) and after it. What is happening in these stages? Unfortunately the --verbose option is not telling me anything useful here.

Is there any chance the dump operation will get faster in the foreseeable future?

Regards

There is an initialization step that downloads the repository index, builds a blob->(file,offset) map, and creates all directories. Although I don’t see how this could take 2 minutes for a single file worth 21G of data, nothing else happens before the actual restore starts.

The actual restore loop is explained in https://github.com/restic/restic/blob/master/internal/restorer/doc.go. The current implementation uses 8 concurrent workers, which all write different parts of the file.
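
To illustrate the idea (a minimal sketch only, not restic’s actual code; the file name, chunk size and channel feeding are made up):

// Sketch of the restore pattern described above: a fixed pool of workers,
// each writing a different part of the target file with WriteAt.
package main

import (
	"log"
	"os"
	"sync"
)

// chunk is one decrypted blob plus the offset it belongs at in the file.
type chunk struct {
	offset int64
	data   []byte
}

func restoreFile(path string, chunks <-chan chunk, workers int) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for c := range chunks {
				// Each chunk targets a disjoint range, so no locking is needed.
				if _, err := f.WriteAt(c.data, c.offset); err != nil {
					log.Println("write failed:", err)
				}
			}
		}()
	}
	wg.Wait()
	return nil
}

func main() {
	// Feed a few fake 1 MiB chunks to 8 workers, mirroring the worker count
	// mentioned above.
	chunks := make(chan chunk)
	go func() {
		for i := int64(0); i < 16; i++ {
			chunks <- chunk{offset: i << 20, data: make([]byte, 1<<20)}
		}
		close(chunks)
	}()
	if err := restoreFile("/tmp/restore-sketch.bin", chunks, 8); err != nil {
		log.Fatal(err)
	}
}

Because every worker writes to its own byte range via WriteAt, no locking is needed and several I/Os can be in flight at once.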

Dump is a fundamentally sequential operation; I would not spend time optimizing it. I think direct support for backup and restore of block devices will be much simpler to implement.

FWIW, I only perf-tested restore with APFS locally and XFS in AWS (could have been EXT4, don’t recall for sure) and in both cases restore was able to get close to max network speed.

@ifedorenko, thanks for the explanation.

The number of workers (8) is a hardcoded value, right?

Well, I think the first phase is long because of the amount of backups I have: a repository with 39T of used space across 1467 different snapshots. Was restic tested with such amounts?

I’ve tried creating a separate repository with just a single snapshot (21G) and recovering from it:

# time restic -r /data/restic2/ restore 09603f81 --target /tmp/some2
repository 262160ee opened successfully, password is correct
restoring <Snapshot 09603f81 of [/tmp/some/dev/drbd/by-res/one-vm-5525-disk-0/0] at 2020-05-27 11:47:43.210701609 +0200 CEST by root@redacted> to /tmp/some2
real	11m28.562s
user	11m5.410s
sys	0m24.508s

The first phase passes immediately, but the second one took 11.5 minutes to finish the recovery.

In my understanding this is not an acceptable recovery speed; can I somehow debug this behavior?

Yes, it’s hardcoded, see restic/internal/restorer/filerestorer.go at master · restic/restic · GitHub
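
If you want to experiment with a higher worker count, the only option right now is to patch that constant and rebuild restic yourself. Roughly like this (check the file for the exact constant name, it may differ between versions):

// internal/restorer/filerestorer.go
const (
	workerCount = 16 // default is 8; rebuild with go run build.go after changing
)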

Just to confirm, this is 11.5 minutes to restore a single 21G file, from a repository on a local filesystem to a target directory on another local filesystem, right? If this is correct, then yes, that is definitely not an acceptable restore speed.

First, can you confirm you are using the latest master during restore? 0.9.6 has a bug that specifically affects large file restore performance, which would explain the behaviour you observe.

Beyond that, you need to profile restore and see what it does for 11.5 minutes. Go has decent profiler capabilities, so hopefully it’ll be obvious what’s going on from the profiler graphs. If you can share a test repository I can use to reproduce the problem locally, I may be able to find time to have a look too (no promise, though).
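
The generic way to get a CPU profile out of a Go program (this is the standard net/http/pprof approach, not something restic exposes as a flag) looks roughly like this:

// Sketch: expose Go's built-in profiler in a long-running program, then pull
// a CPU profile with go tool pprof while the slow operation runs.
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof handlers
	"time"
)

func main() {
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()

	// ... the slow work you want to profile runs here ...
	time.Sleep(60 * time.Second)
}

// While it runs, in another shell:
//   go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30

The resulting graph should make it clear whether the time goes into CPU work (hashing, decryption) or into waiting on I/O.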

You’re right, the last test (the one where recovery took 18 minutes) was running v0.9.6 from the release page.
I repeated the test on v0.9.6-207-g84475aa3 from master and now recovery seems OK:

[screenshot of the restore output omitted]

I’m going to solve the problem with the slow initialization step by splitting the single big repository into multiple small repositories, one per backup target.
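
Roughly what I have in mind (paths are just examples):

# one repository per backup target
restic -r /data/backup/one-vm-8947-disk-2 init
cat /dev/drbd/by-res/one-vm-8947-disk-2/0 | \
  restic -r /data/backup/one-vm-8947-disk-2 backup --tag=one-vm-8947-disk-2 --stdin \
    --stdin-filename /dev/drbd/by-res/one-vm-8947-disk-2/0

That way each repository only carries the index and snapshots of a single disk, so the initialization step should stay small.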
