Server unresponsive during restic backups

Restic incremental backups take a long time (17 hours). While it is running, my server is unresponsive to requests (primarily web requests to apache).

Config

  • Ubuntu Linux 22.04 (5.15.0-56 kernel)
  • Restic 0.14.0
  • 24 GB RAM, two mirrored 1TB drives (SATA HDs not SSDs), btrfs filesystem
  • backup destination is Amazon S3
  • in a colo with a synchronous, fast connection via gigabit ethernet

I have a cron job that runs incremental backups once a week like this:
restic backup --exclude-file=/usr/local/restic/exclude / >> /var/log/restic.log

The exclude file contains:

/dev
/proc
/tmp
/mnt
/swapfile
/sys
/run
/var/lib/lxcfs

From the log:

Files:       68511 new, 149658 changed, 2547481 unmodified
Dirs:         9892 new, 26049 changed, 204325 unmodified
Added to the repository: 5.352 GiB (5.355 GiB stored)

processed 2765650 files, 328.790 GiB in 17:08:49

I have tried ionice, nice and both (i.e. ionice -c2 nice -n 19) to no effect.

How do I determine the cause of this slowdown/unresponsiveness so I can fix it? (Aside: to my untrained eye, 149658 seems like a high change count for one week.)

My best guess would be that it uses a lot of memory. I read once (maybe somebody here can confirm/correct this?) that memory usage is to some degree proportional to the number of files.

I would suggest checking the following things:

  1. Does the log contain anything interesting?
  2. Does the file change detection work? Could it be that the inodes/ctimes change, and thus all files are rescanned?

But we shouldn’t forget that reading 150k files of 1 kB each takes significantly longer than reading one 150 MB file.
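To put rough numbers on that (both figures below are ballpark assumptions, not measurements: ~10 ms average seek for a 5400 RPM HDD, ~100 MB/s sequential throughput):

```shell
# Back-of-envelope: many small reads are seek-bound, one big read is not.
awk 'BEGIN {
  files = 150000; seek_ms = 10          # assumed avg seek on a 5400 RPM HDD
  printf "150k x 1 kB files, seek-bound: ~%d s\n", files * seek_ms / 1000
  printf "one 150 MB file, sequential:   ~%.1f s\n", 150 / 100
}'
# → 150k x 1 kB files, seek-bound: ~1500 s
# → one 150 MB file, sequential:   ~1.5 s
```

Three orders of magnitude apart, under those assumptions.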

Did you try htop and iotop to see what the machine is doing? I have never backed up to S3, but maybe try backing up to a local repo just for testing?

@nicnab I had only tried top. Nothing unusual was going on with restic itself, but apache had spawned dozens of processes instead of the usual 12. Will try htop and iotop next week. I don’t have enough disk space to back up to a local destination.

@MelcomX The “log” file is just stdout and doesn’t show any errors or details. You can review the last few months of log data.

I tried using DEBUG_LOG and the amount of output was immense (a few GB). Given more direction, I am happy to re-enable debug logging and try again.

I suspect file change detection works, since there are 2M unmodified files and 150K changed files. If change detection did not work, I would expect there to be only “modified” files.

With that number of changes, I’d expect the backup to take between 5 and 15 minutes. The pastebin log shows that restic is able to prune the repository in a few minutes. That is somewhat curious, as prune requires much more memory than backup. That is, we can probably rule out the case that the server is swapping itself to death.

150k changed files are indeed a lot. You could use restic diff snapshot-a snapshot-b to get a list of files which have changed between the snapshots.

It looks like there is some serious problem with the IO performance. restic takes about a quarter of a second to read each file. Judging from the amount of data added by the backup, that is far too slow, even for an HDD. Are there any warnings shown in the kernel log by dmesg? Is the RAID in a healthy state?
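The quarter-second figure falls straight out of the numbers in the log earlier in the thread (17:08:49 runtime, 68511 new + 149658 changed files):

```shell
awk 'BEGIN {
  secs  = 17*3600 + 8*60 + 49    # backup duration, 17:08:49
  files = 68511 + 149658         # new + changed files from the log
  printf "~%.2f s per file read\n", secs / files
}'
# → ~0.28 s per file read
```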

@MichaelEischer Excellent hint!

Here’s the summary of the diff output:

Files:       68511 new, 94583 removed,  1071 changed
Dirs:         9892 new,  9413 removed
Others:       2149 new,  2139 removed
Data Blobs:   8529 new, 37864 removed
Tree Blobs:  35340 new, 34885 removed
  Added:   5.358 GiB
  Removed: 8.534 GiB

Looks like a lot of it (133k) is in /snap, which I have now added to my exclude list.

# grep -c /snap restic-diff.txt 
133705
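For a fuller picture than a single grep, the diff output can be tallied per top-level directory. This is only a sketch: it assumes each `restic diff` line starts with a change marker followed by the path, and uses sample lines in place of the real saved output:

```shell
# Count changed paths per top-level directory. In practice, pipe the saved
# `restic diff` output through the awk instead of this sample data.
printf '%s\n' \
  '+    /snap/core/123/etc/hosts' \
  '-    /snap/core/100/etc/hosts' \
  'M    /var/log/syslog' |
awk '{ split($2, p, "/"); n["/" p[2]]++ }
     END { for (d in n) printf "%6d %s\n", n[d], d }' | sort -rn
# →      2 /snap
# →      1 /var
```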

Output of smartctl:

I suspect excluding /snap will help a lot. Curious as to what you think of the smartmontools output.

@MichaelEischer Here are screenshots of htop (filtered) and iotop:


Here’s some strace output from one of the child processes:

# strace -p 549951
strace: Process 549951 attached
restart_syscall(<... resuming interrupted read ...>) = -1 ETIMEDOUT (Connection timed out)
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
futex(0x18c7298, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=793301858}) = -1 ETIMEDOUT (Connection timed out)
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
epoll_pwait(3, [], 128, 0, NULL, 0)     = 0
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
getpid()                                = 549950
tgkill(549950, 549959, SIGURG)          = 0
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
futex(0x18c7298, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=996592347}) = -1 ETIMEDOUT (Connection timed out)
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
futex(0x18c7298, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=999518218}) = -1 ETIMEDOUT (Connection timed out)
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
futex(0x18c7298, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=999612353}) = -1 ETIMEDOUT (Connection timed out)
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
futex(0x18c7298, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=999292713}) = -1 ETIMEDOUT (Connection timed out)
nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=40000}, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=80000}, NULL) = 0
futex(0x18c7298, FUTEX_WAIT_PRIVATE, 0, {tv_sec=0, tv_nsec=999107358}) = -1 ETIMEDOUT (Connection timed out)

Is the sleep/timeout combo expected?

Judging from the output, the disk is still in good shape, although probably not the fastest.

Go issues a lot of these syscalls, so yes, that output is not too unexpected.

The following command might produce some more useful output:

strace -e trace=%file -tt -T -e 'signal=!sigurg' -p pid-of-restic

That will focus on file-related syscalls, ignore the SIGURGs used by the Go runtime, and also print timestamps and durations for each syscall.


restic is running again now.

  • Good news is with the reduced diff set, it hasn’t overwhelmed the CPU — I’m not getting alerts about the web server being unresponsive. (I also tested manually and was able to load web pages).
  • Bad news:
    • the backup is still taking a long time. Started at 7am; now 10:55am
    • strace with file operations produces very little output. See below.
  • Aside: is --verbose 3 useless? Do I need --verbose=3? I would expect more output.

top shows only 35 minutes of CPU time:

top - 10:42:59 up 5 days, 53 min,  4 users,  load average: 6.94, 5.28, 4.69
Tasks: 236 total,   1 running, 235 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.0 us,  1.8 sy,  0.0 ni,  1.0 id, 95.2 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  24036.2 total,   1010.3 free,   3143.1 used,  19882.7 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  20355.8 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                             
 484918 root      20   0 1491536 273508   5876 S   3.7   1.1  35:09.09 restic                                              

iotop

Total DISK READ:      1315.08 K/s | Total DISK WRITE:       737.39 K/s
Current DISK READ:    1325.27 K/s | Current DISK WRITE:    1699.06 K/s
    TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                    
    316 be/4 root        0.00 B/s  679.63 K/s  ?unavailable?  [btrfs-transacti]
 224797 be/4 mysql       0.00 B/s   40.78 K/s  ?unavailable?  mysqld [ib_pg_flush_co]
 484927 be/4 root     1315.08 K/s    0.00 B/s  ?unavailable?  restic backup --verbose 3 --ex~ile=/usr/local/restic/exclude /
 499973 be/4 root        0.00 B/s   13.59 K/s  ?unavailable?  [kworker/u8:5-btrfs-endio]
 501998 be/4 www-data    0.00 B/s    3.40 K/s  ?unavailable?  apache2 -k start

htop:

root@green:/home/paul# htop
    0[||||||                                          9.6%]   Tasks: 145, 189 thr; 1 running
    1[||||||||                                       12.1%]   Load average: 7.82 6.27 5.17 
  Mem[|||||||||||||||||||||||||||||||||||||||||3.22G/23.5G]   Uptime: 5 days, 00:56:36
  Swp[                                               0K/0K]

    PID USER      PRI  NI  VIRT   RES   SHR S CPU%▽MEM%   TIME+  Command
 484918 root       20   0 1456M  267M  5876 S  4.3  1.1 35:18.38 restic backup --verbose 3 --exclude-file=/usr/local/restic/
 484925 root       20   0 1456M  267M  5876 D  2.1  1.1  5:47.13 restic backup --verbose 3 --exclude-file=/usr/local/restic/
 484927 root       20   0 1456M  267M  5876 S  0.5  1.1  5:17.08 restic backup --verbose 3 --exclude-file=/usr/local/restic/
 484917 root       20   0  7304  2368  2120 S  0.0  0.0  0:00.00 /bin/bash /etc/cron.weekly/restic-backup
 484919 root       20   0 1456M  267M  5876 S  0.0  1.1  0:42.40 restic backup --verbose 3 --exclude-file=/usr/local/restic/
 484920 root       20   0 1456M  267M  5876 S  0.0  1.1  0:00.00 restic backup --verbose 3 --exclude-file=/usr/local/restic/
 484921 root       20   0 1456M  267M  5876 S  0.0  1.1  0:00.00 restic backup --verbose 3 --exclude-file=/usr/local/restic/
 484922 root       20   0 1456M  267M  5876 S  0.0  1.1  0:00.00 restic backup --verbose 3 --exclude-file=/usr/local/restic/
 484923 root       20   0 1456M  267M  5876 S  0.0  1.1  4:25.76 restic backup --verbose 3 --exclude-file=/usr/local/restic/
 484924 root       20   0 1456M  267M  5876 S  0.0  1.1  0:43.62 restic backup --verbose 3 --exclude-file=/usr/local/restic/
 484926 root       20   0 1456M  267M  5876 S  0.0  1.1  4:34.27 restic backup --verbose 3 --exclude-file=/usr/local/restic/
 484928 root       20   0 1456M  267M  5876 S  0.0  1.1  0:00.01 restic backup --verbose 3 --exclude-file=/usr/local/restic/
 484929 root       20   0 1456M  267M  5876 S  0.0  1.1  5:12.51 restic backup --verbose 3 --exclude-file=/usr/local/restic/
 484930 root       20   0 1456M  267M  5876 S  0.0  1.1  0:00.00 restic backup --verbose 3 --exclude-file=/usr/local/restic/
 484931 root       20   0 1456M  267M  5876 S  0.0  1.1  5:15.75 restic backup --verbose 3 --exclude-file=/usr/local/restic/

strace is really slow going

~ $ sudo strace -e trace=%file -tt -T -e 'signal=!sigurg' -p 484918
strace: Process 484918 attached
10:27:03.199632 newfstatat(AT_FDCWD, "/etc/hosts", {st_mode=S_IFREG|0644, st_size=249, ...}, 0) = 0 <0.000028>
10:27:03.200258 newfstatat(AT_FDCWD, "/etc/resolv.conf", {st_mode=S_IFREG|0644, st_size=303, ...}, 0) = 0 <0.000134>

This was copied at 10:55, 22 minutes after the last line of output.

I’m now logging strace output to a file and will come back and read it later:
sudo nohup 2>&1 strace -e trace=%file -tt -T -e 'signal=!sigurg' -p 484918 &

The backup finished in 5 hours. Much faster than 17 hours, but still slow:

scan finished in 2161.426s: 2619024 files, 251.726 GiB

Files:        6087 new, 34192 changed, 2578779 unmodified
Dirs:          711 new,  5673 changed, 207430 unmodified
Data Blobs:   9791 new
Tree Blobs:   6014 new
Added to the repository: 8.224 GiB (8.225 GiB stored)

processed 2619058 files, 251.748 GiB in 5:10:52
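For what it’s worth, the scan and backup rates from that run work out to:

```shell
awk 'BEGIN {
  printf "scan:        ~%d files/s (metadata only)\n", 2619024 / 2161.426
  total = 5*3600 + 10*60 + 52           # 5:10:52 total runtime
  printf "full backup: ~%d files/s\n", 2619058 / total
}'
# → scan:        ~1211 files/s (metadata only)
# → full backup: ~140 files/s
```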

The strace output from nohup didn’t give me anything useful:

# cat nohup.out 
strace: Process 484918 attached
11:57:55.495994 +++ exited with 0 +++

Okay, so from my standpoint the CPUs look pretty busy according to htop. The load of 5 suggests you could probably use a couple more CPUs at that time. What’s the load average without restic running?

I/O-wise you’re reading 1.3 MB/s (not a lot, but also not too shabby), and you added 8.2 GiB to the repo, which takes a net time of 1.8 hours at that speed. 5 hours gross doesn’t sound too unrealistic to me under these circumstances.
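That net-time estimate checks out arithmetically (8.2 GiB at the ~1315 KiB/s read rate iotop showed):

```shell
awk 'BEGIN {
  kib = 8.2 * 1024 * 1024      # 8.2 GiB expressed in KiB
  printf "~%.1f h\n", kib / 1315 / 3600
}'
# → ~1.8 h
```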

htop is showing a lot less CPU usage right now:

    0[|||                                             3.6%]   Tasks: 110, 176 thr; 1 running
    1[||||||||                                       13.5%]   Load average: 0.43 0.33 0.40 
  Mem[|||||||||||||||||||||||||||||||||||||||||2.94G/23.5G]   Uptime: 5 days, 06:25:12
  Swp[                                               0K/0K]

Let me know if it’s helpful to gather CPU load data over an extended time period.

Lastly: was the --verbose 3 argument ignored? Treated as --verbose? Or treated as verbosity level 3?

It’s interpreted as --verbose and backup path 3. --verbose=3 or -vvv would be correct. Although, the highest verbosity level is actually only 2.

That is a lot of IO load for only 2 CPU cores. I don’t see an obvious reason why the IO should be that slow, but then again I have no idea how fast the HDDs are. Can you run an HDD benchmark, e.g. bonnie++?

Can you provide more information on which CPU you’re using? 2 CPU cores with 24 GB RAM is a somewhat unusual combination.

Oh, I’ve just noticed that trace=%file might also need quotes; once you add them you should see a lot of syscalls.

Maybe that’s just sampling bias, but for me those syscalls take 13 microseconds, not 130. My system uses an NVMe disk for storage, but that should only matter for the first access to a directory. And /etc/resolv.conf should be in memory anyway.

It’s interpreted as --verbose and backup path 3. --verbose=3 or -vvv would be correct. Although, the highest verbosity level is actually only 2.

The man page shows:

  -v, --verbose n      be verbose (specify multiple times or a level using --verbose=n, max level/times is 3)

Which (a) says the max is 3 and (b) is why I got confused about the parameter value.

That is a lot of IO load for only 2 CPU cores. I don’t see an obvious reason why the IO should be that slow, but then again I have no idea how fast the HDDs are. Can you run an HDD benchmark, e.g. bonnie++?

Drives are WD Red 1TB NAS (5400 RPM SATA 6 Gb/s 64MB Cache / WD10EFRX).

Here’s the bonnie++ results:

# bonnie++ -d /tmp -u paul
Using uid:1000, gid:1000.
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  2.00       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Name:Size etc        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
green           47G  163k  90  177m  31 48.6m  18  693k  79 83.0m  17 229.5  39
Latency             88440us     162ms    1169ms   70554us     381ms     323ms
Version  2.00       ------Sequential Create------ --------Random Create--------
green               -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  1113   6 +++++ +++ 10507  55 15252  58 +++++ +++ 13695  74
Latency              6757us     432us    6628us    8143us    4065us   32320us
1.98,2.00,green,1,1673192452,47G,,8192,5,163,90,181277,31,49770,18,693,79,84965,17,229.5,39,16,,,,,1113,6,+++++,+++,10507,55,15252,58,+++++,+++,13695,74,88440us,162ms,1169ms,70554us,381ms,323ms,6757us,432us,6628us,8143us,4065us,32320us
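One number in there stands out: 229.5 random seeks/s. As a sanity check, at that rate merely seeking to the ~218k changed/new files of the original 17-hour run would cost:

```shell
awk 'BEGIN {
  files = 68511 + 149658       # new + changed files from the 17 h run
  printf "~%d s of pure seek time\n", files / 229.5
}'
# → ~950 s of pure seek time
```

So raw seek capacity alone doesn’t come close to explaining a 17-hour run.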

Can you provide more information on which CPU you’re using? 2 CPU cores with 24 GB RAM is a somewhat unusual combination.

The CPUs are Core 2 Duo 2.13GHz.

@MichaelEischer Any suggestions based on the above results?

I ran strace with quotes around %file. This produced a lot of output:

There are a few instances where there are long pauses (4 minutes in this case) before opening a pack file:

09:28:29.703487 listxattr("/var/lib/mysql/yyyyyy/wp_options.ibd", 0x18f54b8, 0) = 0 <0.000045>
09:28:32.307470 listxattr("/var/lib/mysql/xxxxxx/wp_foundation_options.ibd", 0x18f54b8, 0) = 0 <0.000044>
09:32:53.288579 openat(AT_FDCWD, "/tmp/restic-temp-pack-3035356112", O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0600) = 9 <0.001008>
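Those pauses can be pulled out of the log mechanically. A sketch that flags gaps over 60 s between consecutive -tt timestamps (sample lines shown here; in practice point the awk at the real strace log, and note it ignores midnight wrap-around):

```shell
printf '%s\n' \
  '09:28:32.307470 listxattr(...) = 0' \
  '09:32:53.288579 openat(...) = 9' |
awk '{ split($1, t, ":"); now = t[1]*3600 + t[2]*60 + t[3]
       if (NR > 1 && now - prev > 60)
         printf "gap of ~%d s before %s\n", now - prev, $1
       prev = now }'
# → gap of ~260 s before 09:32:53.288579
```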

The --verbose=3 also produces a lot of output — 269MB and counting.

Sorry, I can’t say much regarding the strace results, but I think the backup speed seems okay in this setup. Regarding the two CPUs I can only guess: restic 0.14 compresses by default and always reads two files at once, which probably keeps the system pretty busy. Can you try without compression and see if it gets better? One of the next versions will also allow limiting reads to one file at a time, which probably makes sense in a lot of situations as well.

@nicnab I have an older repository using the v1 format. Compression isn’t enabled in v1.

I have upgraded to v2 and will see what that does with compression off and auto.


I still don’t understand why strace reports syscall durations that are about a factor of 10 higher than what I see on my system. Similarly, unlinking restic-temp-pack-2812182330 shouldn’t take more than one second…
Besides that, I haven’t noticed anything suspicious.