Restic check hangs system

I have an Odroid HC2 (arm32) that I recently upgraded to an HC4 (arm64). It’s running Armbian / OMV5.

When I run:

restic check --read-data-subset=1/128

The system grinds to a halt at this point:

using temporary cache in /tmp/restic-check-cache-082606718
created new cache in /tmp/restic-check-cache-082606718
create exclusive lock for repository
load indexes
check all packs
check snapshots, trees and blobs

The whole system becomes unresponsive and grinds to a halt about 2-3 minutes after the "check snapshots, trees and blobs" step starts. Previously, this command ran without problems and took about 30 minutes.

At this point I (with difficulty) kill the restic task, unlock the repository and keep going.

I have successfully run:

restic backup
restic rebuild-index
restic forget --prune --keep-daily 7 --keep-weekly 6 --keep-monthly 18

The last two have been slow compared to how they used to run, but they do complete. This is surprising given that the HC4 is considerably more powerful and has more RAM than the HC2. All the other tools I use on the system are running without a hitch.

In upgrading the system I did a complete reinstall of the OS, and migrated to the arm64 version of restic.

Any thoughts or suggestions for further diagnosing?

Thanks,

Gabe

The usual suspect would be that restic uses enough memory to start swapping, although that doesn’t sound too likely when the new system has more RAM than the old one, and forget --prune should require a lot more memory to run than check does. Does the kernel log any problems? Which restic version are you using? check creates a new temporary cache - maybe it is now located on a tmpfs?
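One way to check whether /tmp is backed by a tmpfs (a generic check, either command will do):

findmnt -T /tmp
df -hT /tmp

If the filesystem type column shows tmpfs, the temporary check cache lives in RAM and competes with restic itself for memory.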

Michael - thanks for the tips.

I’m using restic 0.11.0 compiled with go1.15.3 on linux/arm64.

I tried moving the cache directory using the TMPDIR variable (pointing it at both the microSD card with the OS and the spinning HDD with my data), and it crapped out at “load indexes” in both cases.
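For reference, this is roughly what I ran (the HDD mount point below is a placeholder, not my actual path):

export TMPDIR=/srv/dev-disk-by-label-data/restic-tmp   # placeholder path on the data HDD
mkdir -p "$TMPDIR"
restic check --read-data-subset=1/128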

The kernel log reports the errors below (I’m not sure how to interpret them).

Any suggestions?

Feb 6 14:28:08 nas kernel: [ 6197.076509] systemd invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
Feb 6 14:28:09 nas kernel: [ 6197.076521] CPU: 1 PID: 1 Comm: systemd Tainted: G C 5.10.12-meson64 #21.02.1
Feb 6 14:28:09 nas kernel: [ 6197.076524] Hardware name: Hardkernel ODROID-HC4 (DT)
Feb 6 14:28:09 nas kernel: [ 6197.076527] Call trace:
Feb 6 14:28:09 nas kernel: [ 6197.076538] dump_backtrace+0x0/0x200
Feb 6 14:28:09 nas kernel: [ 6197.076543] show_stack+0x18/0x68
Feb 6 14:28:09 nas kernel: [ 6197.076548] dump_stack+0xd8/0x134
Feb 6 14:28:09 nas kernel: [ 6197.076553] dump_header+0x44/0x1ec
Feb 6 14:28:09 nas kernel: [ 6197.076559] oom_kill_process+0x1f0/0x1f8
Feb 6 14:28:09 nas kernel: [ 6197.076561] out_of_memory+0x180/0x550
Feb 6 14:28:09 nas kernel: [ 6197.076567] __alloc_pages_slowpath.constprop.119+0x950/0x9f0
Feb 6 14:28:09 nas kernel: [ 6197.076570] __alloc_pages_nodemask+0x23c/0x298
Feb 6 14:28:09 nas kernel: [ 6197.076574] alloc_pages_current+0x84/0xf8
Feb 6 14:28:09 nas kernel: [ 6197.076578] __page_cache_alloc+0x88/0xb0
Feb 6 14:28:09 nas kernel: [ 6197.076581] pagecache_get_page+0x148/0x300
Feb 6 14:28:09 nas kernel: [ 6197.076583] filemap_fault+0x504/0x9a8
Feb 6 14:28:09 nas kernel: [ 6197.076589] ext4_filemap_fault+0x34/0x808
Feb 6 14:28:09 nas kernel: [ 6197.076594] __do_fault+0x3c/0x170
Feb 6 14:28:09 nas kernel: [ 6197.076598] handle_mm_fault+0xd10/0x1060
Feb 6 14:28:09 nas kernel: [ 6197.076603] do_page_fault+0x130/0x3a8
Feb 6 14:28:09 nas kernel: [ 6197.076606] do_translation_fault+0x50/0x60
Feb 6 14:28:09 nas kernel: [ 6197.076609] do_mem_abort+0x40/0xa0
Feb 6 14:28:09 nas kernel: [ 6197.076614] el0_ia+0x64/0xb8
Feb 6 14:28:09 nas kernel: [ 6197.076617] el0_sync_handler+0x78/0xb8
Feb 6 14:28:09 nas kernel: [ 6197.076620] el0_sync+0x158/0x180
Feb 6 14:28:09 nas kernel: [ 6197.076623] Mem-Info:
Feb 6 14:28:09 nas kernel: [ 6197.076633] active_anon:270625 inactive_anon:622412 isolated_anon:0
Feb 6 14:28:09 nas kernel: [ 6197.076633] active_file:273 inactive_file:1715 isolated_file:90
Feb 6 14:28:09 nas kernel: [ 6197.076633] unevictable:488 dirty:0 writeback:0
Feb 6 14:28:09 nas kernel: [ 6197.076633] slab_reclaimable:13472 slab_unreclaimable:15497
Feb 6 14:28:09 nas kernel: [ 6197.076633] mapped:2038 shmem:398861 pagetables:4053 bounce:0
Feb 6 14:28:09 nas kernel: [ 6197.076633] free:13885 free_pcp:1186 free_cma:92
Feb 6 14:28:09 nas kernel: [ 6197.076640] Node 0 active_anon:1082500kB inactive_anon:2489648kB active_file:1092kB inactive_file:6860kB unevictable:1952kB isolated(anon):0kB isolated(file):360kB mapped:8152kB dirty:0kB writeback:0kB shmem:1595444kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 679936kB writeback_tmp:0kB kernel_stack:12864kB all_unreclaimable? yes
Feb 6 14:28:09 nas kernel: [ 6197.076642] Node 0 DMA free:21872kB min:11252kB low:14064kB high:16876kB reserved_highatomic:0KB active_anon:255888kB inactive_anon:661552kB active_file:32kB inactive_file:1020kB unevictable:0kB writepending:0kB present:1045504kB managed:957836kB mlocked:0kB pagetables:3336kB bounce:0kB free_pcp:1812kB local_pcp:160kB free_cma:0kB
Feb 6 14:28:09 nas kernel: [ 6197.076652] lowmem_reserve: 0 2815 2815 2815
Feb 6 14:28:09 nas kernel: [ 6197.076660] Node 0 DMA32 free:33668kB min:33800kB low:42248kB high:50696kB reserved_highatomic:0KB active_anon:826612kB inactive_anon:1828096kB active_file:1492kB inactive_file:5856kB unevictable:1952kB writepending:0kB present:2963820kB managed:2883068kB mlocked:1952kB pagetables:12876kB bounce:0kB free_pcp:2932kB local_pcp:364kB free_cma:368kB
Feb 6 14:28:09 nas kernel: [ 6197.076667] lowmem_reserve: 0 0 0 0
Feb 6 14:28:09 nas kernel: [ 6197.076675] Node 0 DMA: 10*4kB (U) 131*8kB (UME) 160*16kB (UE) 44*32kB (UE) 53*64kB (UME) 11*128kB (UME) 7*256kB (UME) 2*512kB (ME) 9*1024kB (UM) 0*2048kB 0*4096kB = 21888kB
Feb 6 14:28:09 nas kernel: [ 6197.076709] Node 0 DMA32: 2995*4kB (UEC) 1395*8kB (UEC) 456*16kB (UEC) 92*32kB (UEC) 5*64kB (E) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 33700kB
Feb 6 14:28:09 nas kernel: [ 6197.076738] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Feb 6 14:28:09 nas kernel: [ 6197.076741] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=32768kB
Feb 6 14:28:09 nas kernel: [ 6197.076744] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Feb 6 14:28:09 nas kernel: [ 6197.076746] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=64kB
Feb 6 14:28:09 nas kernel: [ 6197.076748] 401450 total pagecache pages
Feb 6 14:28:09 nas kernel: [ 6197.076750] 0 pages in swap cache
Feb 6 14:28:09 nas kernel: [ 6197.076753] Swap cache stats: add 0, delete 0, find 0/0
Feb 6 14:28:09 nas kernel: [ 6197.076755] Free swap = 0kB
Feb 6 14:28:09 nas kernel: [ 6197.076757] Total swap = 0kB
Feb 6 14:28:09 nas kernel: [ 6197.076759] 1002331 pages RAM
Feb 6 14:28:09 nas kernel: [ 6197.076761] 0 pages HighMem/MovableOnly
Feb 6 14:28:09 nas kernel: [ 6197.076763] 42105 pages reserved
Feb 6 14:28:09 nas kernel: [ 6197.076764] 229376 pages cma reserved
Feb 6 14:28:09 nas kernel: [ 6197.076766] 0 pages hwpoisoned
Feb 6 14:28:09 nas kernel: [ 6197.076768] Tasks state (memory values in pages):
Feb 6 14:28:09 nas kernel: [ 6197.076770] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
Feb 6 14:28:09 nas kernel: [ 6197.076804] [ 1571] 0 1571 1212 50 45056 0 0 blkmapd
Feb 6 14:28:09 nas kernel: [ 6197.076809] [ 1580] 0 1580 4973 613 61440 0 -1000 systemd-udevd
Feb 6 14:28:09 nas kernel: [ 6197.076815] [ 1610] 0 1610 5492 1031 118784 0 0 systemd-journal
Feb 6 14:28:09 nas kernel: [ 6197.076821] [ 1611] 0 1611 724 43 45056 0 0 rpc.idmapd
Feb 6 14:28:09 nas kernel: [ 6197.076827] [ 1623] 102 1623 6192 457 69632 0 0 systemd-network
Feb 6 14:28:09 nas kernel: [ 6197.076832] [ 1703] 103 1703 5372 621 81920 0 0 systemd-resolve
Feb 6 14:28:09 nas kernel: [ 6197.076838] [ 1706] 109 1706 1776 360 49152 0 0 rpcbind
Feb 6 14:28:09 nas kernel: [ 6197.076847] [ 2411] 0 2411 1752 369 53248 0 0 smartd
Feb 6 14:28:09 nas kernel: [ 6197.076853] [ 2412] 0 2412 1089 366 45056 0 0 alsactl
Feb 6 14:28:09 nas kernel: [ 6197.076858] [ 2415] 104 2415 1705 224 49152 0 -900 dbus-daemon
Feb 6 14:28:09 nas kernel: [ 6197.076863] [ 2416] 113 2416 1565 320 49152 0 0 avahi-daemon
Feb 6 14:28:09 nas kernel: [ 6197.076868] [ 2417] 0 2417 3854 448 69632 0 0 systemd-logind
Feb 6 14:28:09 nas kernel: [ 6197.076874] [ 2418] 0 2418 3102 269 61440 0 0 wpa_supplicant
Feb 6 14:28:09 nas kernel: [ 6197.076879] [ 2420] 0 2420 55034 835 77824 0 0 rsyslogd
Feb 6 14:28:09 nas kernel: [ 6197.076884] [ 2421] 0 2421 8002 2225 98304 0 0 networkd-dispat
Feb 6 14:28:09 nas kernel: [ 6197.076890] [ 2422] 113 2422 1468 72 49152 0 0 avahi-daemon
Feb 6 14:28:09 nas kernel: [ 6197.076895] [ 2438] 0 2438 50846 2092 147456 0 0 php-fpm7.3
Feb 6 14:28:09 nas kernel: [ 6197.076900] [ 2451] 0 2451 27680 2259 106496 0 0 unattended-upgr
Feb 6 14:28:09 nas kernel: [ 6197.076905] [ 2453] 0 2453 369523 8931 331776 0 -999 containerd
Feb 6 14:28:09 nas kernel: [ 6197.076910] [ 2458] 0 2458 8057 6283 102400 0 0 rpc.mountd
Feb 6 14:28:09 nas kernel: [ 6197.076915] [ 2464] 105 2464 3033 273 49152 0 0 chronyd
Feb 6 14:28:09 nas kernel: [ 6197.076921] [ 2488] 105 2488 1099 380 49152 0 0 chronyd
Feb 6 14:28:09 nas kernel: [ 6197.076927] [ 2517] 0 2517 3025 362 61440 0 -1000 sshd
Feb 6 14:28:09 nas kernel: [ 6197.076933] [ 2592] 0 2592 18841 2723 122880 0 0 omv-engined
Feb 6 14:28:09 nas kernel: [ 6197.076938] [ 2593] 33 2593 50838 1451 122880 0 0 php-fpm7.3
Feb 6 14:28:09 nas kernel: [ 6197.076943] [ 2594] 33 2594 50838 1451 122880 0 0 php-fpm7.3
Feb 6 14:28:09 nas kernel: [ 6197.076952] [ 2634] 0 2634 2681 330 61440 0 0 cron
Feb 6 14:28:09 nas kernel: [ 6197.076958] [ 2642] 0 2642 23067 29 53248 0 0 rngd
Feb 6 14:28:09 nas kernel: [ 6197.076963] [ 2672] 0 2672 3996 212 57344 0 0 monit
Feb 6 14:28:09 nas kernel: [ 6197.076968] [ 2774] 65534 2774 7527 2894 94208 0 0 python3
Feb 6 14:28:09 nas kernel: [ 6197.076973] [ 2776] 0 2776 9379 709 114688 0 0 nmbd
Feb 6 14:28:09 nas kernel: [ 6197.076978] [ 2783] 0 2783 635466 15580 581632 0 -500 dockerd
Feb 6 14:28:09 nas kernel: [ 6197.076983] [ 2784] 0 2784 1957 271 45056 0 0 agetty
Feb 6 14:28:09 nas kernel: [ 6197.076989] [ 2785] 0 2785 2337 325 49152 0 0 agetty
Feb 6 14:28:09 nas kernel: [ 6197.076994] [ 2845] 0 2845 14136 1360 147456 0 0 smbd
Feb 6 14:28:09 nas kernel: [ 6197.076999] [ 2907] 0 2907 10655 416 73728 0 0 master
Feb 6 14:28:09 nas kernel: [ 6197.077004] [ 2909] 108 2909 10672 235 77824 0 0 pickup
Feb 6 14:28:09 nas kernel: [ 6197.077009] [ 2910] 108 2910 10684 488 77824 0 0 qmgr
Feb 6 14:28:09 nas kernel: [ 6197.077014] [ 2912] 0 2912 13309 822 131072 0 0 smbd-notifyd
Feb 6 14:28:09 nas kernel: [ 6197.077020] [ 2913] 0 2913 13311 871 131072 0 0 cleanupd
Feb 6 14:28:09 nas kernel: [ 6197.077025] [ 3145] 0 3145 137118 1813 118784 0 -500 docker-proxy
Feb 6 14:28:09 nas kernel: [ 6197.077030] [ 3189] 0 3189 137118 1817 118784 0 -500 docker-proxy
Feb 6 14:28:09 nas kernel: [ 6197.077036] [ 3215] 0 3215 173984 2327 139264 0 -500 docker-proxy
Feb 6 14:28:09 nas kernel: [ 6197.077041] [ 3249] 0 3249 192769 2825 155648 0 -500 docker-proxy
Feb 6 14:28:09 nas kernel: [ 6197.077046] [ 3276] 0 3276 27869 2006 73728 0 -998 containerd-shim
Feb 6 14:28:09 nas kernel: [ 6197.077051] [ 3277] 0 3277 28205 1973 73728 0 -998 containerd-shim
Feb 6 14:28:09 nas kernel: [ 6197.077056] [ 3317] 0 3317 155551 2327 122880 0 -500 docker-proxy
Feb 6 14:28:09 nas kernel: [ 6197.077062] [ 3369] 0 3369 27853 1489 69632 0 -998 containerd-shim
Feb 6 14:28:09 nas kernel: [ 6197.077067] [ 3383] 0 3383 155615 2848 126976 0 -500 docker-proxy
Feb 6 14:28:09 nas kernel: [ 6197.077072] [ 3427] 0 3427 174464 3346 147456 0 -500 docker-proxy
Feb 6 14:28:09 nas kernel: [ 6197.077077] [ 3471] 1000 3471 183728 2607 106496 0 0 filebrowser
Feb 6 14:28:09 nas kernel: [ 6197.077083] [ 3472] 0 3472 28205 1560 73728 0 -998 containerd-shim
Feb 6 14:28:09 nas kernel: [ 6197.077088] [ 3490] 0 3490 34955 2735 135168 0 0 portainer
Feb 6 14:28:09 nas kernel: [ 6197.077094] [ 3503] 0 3503 119037 2903 118784 0 -500 docker-proxy
Feb 6 14:28:09 nas kernel: [ 6197.077099] [ 3525] 1000 3525 201 1 32768 0 0 darkhttpd
Feb 6 14:28:09 nas kernel: [ 6197.077105] [ 3531] 0 3531 27853 2019 73728 0 -998 containerd-shim
Feb 6 14:28:09 nas kernel: [ 6197.077110] [ 3556] 0 3556 27725 1546 73728 0 -998 containerd-shim
Feb 6 14:28:09 nas kernel: [ 6197.077115] [ 3587] 0 3587 28205 1978 73728 0 -998 containerd-shim
Feb 6 14:28:09 nas kernel: [ 6197.077120] [ 3620] 0 3620 155551 1837 122880 0 -500 docker-proxy
Feb 6 14:28:09 nas kernel: [ 6197.077125] [ 3646] 0 3646 49 1 36864 0 0 s6-svscan
Feb 6 14:28:09 nas kernel: [ 6197.077131] [ 3654] 0 3654 49 1 36864 0 0 s6-svscan
Feb 6 14:28:09 nas kernel: [ 6197.077136] [ 3683] 0 3683 118685 2330 118784 0 -500 docker-proxy
Feb 6 14:28:09 nas kernel: [ 6197.077141] [ 3693] 0 3693 49 1 36864 0 0 s6-svscan
Feb 6 14:28:09 nas kernel: [ 6197.077146] [ 3721] 0 3721 155615 2323 135168 0 -500 docker-proxy
Feb 6 14:28:09 nas kernel: [ 6197.077152] [ 3726] 0 3726 49 1 36864 0 0 s6-svscan
Feb 6 14:28:09 nas kernel: [ 6197.077157] [ 3755] 0 3755 28205 1581 73728 0 -998 containerd-shim
Feb 6 14:28:09 nas kernel: [ 6197.077162] [ 3764] 0 3764 155967 2324 126976 0 -500 docker-proxy
Feb 6 14:28:09 nas kernel: [ 6197.077167] [ 3772] 0 3772 27853 2013 73728 0 -998 containerd-shim
Feb 6 14:28:09 nas kernel: [ 6197.077173] [ 3837] 0 3837 49 1 36864 0 0 s6-svscan
Feb 6 14:28:09 nas kernel: [ 6197.077179] [ 3846] 0 3846 118685 2355 114688 0 -500 docker-proxy
Feb 6 14:28:09 nas kernel: [ 6197.077184] [ 3892] 0 3892 49 1 36864 0 0 s6-svscan
Feb 6 14:28:09 nas kernel: [ 6197.077189] [ 3915] 0 3915 118685 2375 110592 0 -500 docker-proxy
Feb 6 14:28:09 nas kernel: [ 6197.077194] [ 3976] 0 3976 28141 1976 73728 0 -998 containerd-shim
Feb 6 14:28:09 nas kernel: [ 6197.077199] [ 4011] 0 4011 49 1 40960 0 0 s6-svscan
Feb 6 14:28:09 nas kernel: [ 6197.077205] [ 4018] 0 4018 155551 1815 131072 0 -500 docker-proxy
Feb 6 14:28:09 nas kernel: [ 6197.077210] [ 4135] 0 4135 118685 1816 114688 0 -500 docker-proxy
Feb 6 14:28:09 nas kernel: [ 6197.077215] [ 4172] 0 4172 49 1 36864 0 0 s6-supervise
Feb 6 14:28:09 nas kernel: [ 6197.077220] [ 4239] 0 4239 49 1 36864 0 0 s6-supervise
Feb 6 14:28:09 nas kernel: [ 6197.077226] [ 4385] 0 4385 155551 1316 122880 0 -500 docker-proxy
Feb 6 14:28:09 nas kernel: [ 6197.077231] [ 4606] 0 4606 28205 1510 73728 0 -998 containerd-shim
Feb 6 14:28:09 nas kernel: [ 6197.077236] [ 4648] 0 4648 118685 878 110592 0 -500 docker-proxy
Feb 6 14:28:09 nas kernel: [ 6197.077241] [ 4660] 0 4660 49 1 36864 0 0 s6-supervise
Feb 6 14:28:09 nas kernel: [ 6197.077247] [ 4782] 0 4782 49 1 36864 0 0 s6-supervise
Feb 6 14:28:09 nas kernel: [ 6197.077252] [ 4784] 0 4784 49 1 36864 0 0 s6-svscan
Feb 6 14:28:09 nas kernel: [ 6197.077257] [ 4827] 0 4827 173984 1316 139264 0 -500 docker-proxy
Feb 6 14:28:09 nas kernel: [ 6197.077262] [ 4869] 0 4869 49 1 36864 0 0 s6-supervise
Feb 6 14:28:09 nas kernel: [ 6197.077267] [ 4870] 0 4870 49 1 36864 0 0 s6-supervise
Feb 6 14:28:09 nas kernel: [ 6197.077272] [ 4944] 0 4944 49 1 36864 0 0 s6-supervise
Feb 6 14:28:09 nas kernel: [ 6197.077277] [ 5143] 0 5143 28205 973 77824 0 -998 containerd-shim
Feb 6 14:28:09 nas kernel: [ 6197.077282] [ 5427] 0 5427 232585 46480 610304 0 0 AdGuardHome
Feb 6 14:28:09 nas kernel: [ 6197.077287] [ 5923] 0 5923 49 1 36864 0 0 s6-supervise
Feb 6 14:28:09 nas kernel: [ 6197.077292] [ 5924] 0 5924 49 1 36864 0 0 s6-supervise
Feb 6 14:28:09 nas kernel: [ 6197.077298] [ 5930] 0 5930 1332 63 49152 0 0 bash
Feb 6 14:28:09 nas kernel: [ 6197.077304] [ 5931] 0 5931 186023 2330 163840 0 0 coredns
Feb 6 14:28:09 nas kernel: [ 6197.077310] [ 6048] 0 6048 49 1 36864 0 0 s6-supervise
Feb 6 14:28:09 nas kernel: [ 6197.077315] [ 6050] 1000 6050 570471 35454 786432 0 0 mono
Feb 6 14:28:09 nas kernel: [ 6197.077320] [ 6072] 0 6072 49 1 36864 0 0 s6-supervise
Feb 6 14:28:09 nas kernel: [ 6197.077326] [ 6137] 0 6137 49 1 36864 0 0 s6-supervise
Feb 6 14:28:09 nas kernel: [ 6197.077331] [ 6140] 1000 6140 2439 1341 53248 0 0 python3
Feb 6 14:28:09 nas kernel: [ 6197.077336] [ 6220] 0 6220 49 1 36864 0 0 s6-supervise
Feb 6 14:28:09 nas kernel: [ 6197.077341] [ 6221] 1000 6221 1156843 26494 831488 0 0 jellyfin
Feb 6 14:28:09 nas kernel: [ 6197.077347] [ 6333] 0 6333 49 1 36864 0 0 s6-supervise
Feb 6 14:28:09 nas kernel: [ 6197.077353] [ 6335] 0 6335 601 11 40960 0 0 bash
Feb 6 14:28:09 nas kernel: [ 6197.077358] [ 6345] 0 6345 409 12 36864 0 0 crond
Feb 6 14:28:09 nas kernel: [ 6197.077363] [ 6467] 0 6467 49 1 36864 0 0 s6-supervise
Feb 6 14:28:09 nas kernel: [ 6197.077369] [ 6469] 1000 6469 1014153 24186 794624 0 0 Radarr
Feb 6 14:28:09 nas kernel: [ 6197.077374] [ 6526] 0 6526 868 16 49152 0 0 sleep
Feb 6 14:28:09 nas kernel: [ 6197.077380] [ 6565] 0 6565 49 1 36864 0 0 s6-supervise
Feb 6 14:28:09 nas kernel: [ 6197.077385] [ 6568] 1000 6568 479 16 45056 0 0 jackett_launche
Feb 6 14:28:09 nas kernel: [ 6197.077390] [ 6598] 1000 6598 1160187 50723 1003520 0 0 jackett
Feb 6 14:28:09 nas kernel: [ 6197.077395] [ 6628] 0 6628 49 1 36864 0 0 s6-supervise
Feb 6 14:28:09 nas kernel: [ 6197.077400] [ 6633] 1000 6633 252997 43124 544768 0 0 qbittorrent-nox
Feb 6 14:28:09 nas kernel: [ 6197.077405] [ 6680] 1000 6680 34491 19817 323584 0 0 python3
Feb 6 14:28:09 nas kernel: [ 6197.077411] [ 6745] 0 6745 502 361 40960 0 -1000 watchdog
Feb 6 14:28:09 nas kernel: [ 6197.077416] [ 11071] 0 11071 4261 672 77824 0 0 systemd
Feb 6 14:28:09 nas kernel: [ 6197.077421] [ 11072] 0 11072 41865 767 90112 0 0 (sd-pam)
Feb 6 14:28:09 nas kernel: [ 6197.077427] [ 11314] 0 11314 3420 291 69632 0 0 sshd
Feb 6 14:28:09 nas kernel: [ 6197.077432] [ 11446] 0 11446 3043 752 53248 0 0 bash
Feb 6 14:28:09 nas kernel: [ 6197.077439] [ 12632] 0 12632 13404 423 98304 0 0 nginx
Feb 6 14:28:09 nas kernel: [ 6197.077445] [ 12633] 33 12633 13495 505 98304 0 0 nginx
Feb 6 14:28:09 nas kernel: [ 6197.077451] [ 12634] 33 12634 13495 651 98304 0 0 nginx
Feb 6 14:28:09 nas kernel: [ 6197.077456] [ 12635] 33 12635 13495 651 98304 0 0 nginx
Feb 6 14:28:09 nas kernel: [ 6197.077461] [ 12637] 33 12637 13495 656 98304 0 0 nginx
Feb 6 14:28:09 nas kernel: [ 6197.077466] [ 12742] 108 12742 10749 521 81920 0 0 tlsmgr
Feb 6 14:28:09 nas kernel: [ 6197.077474] [ 13965] 0 13965 3420 562 69632 0 0 sshd
Feb 6 14:28:09 nas kernel: [ 6197.077479] [ 14098] 0 14098 3043 700 49152 0 0 bash
Feb 6 14:28:09 nas kernel: [ 6197.077485] [ 14242] 0 14242 2678 362 53248 0 0 integrity-check
Feb 6 14:28:09 nas kernel: [ 6197.077490] [ 14243] 0 14243 2678 240 49152 0 0 integrity-check
Feb 6 14:28:09 nas kernel: [ 6197.077495] [ 14244] 0 14244 2296 16 49152 0 0 tee
Feb 6 14:28:09 nas kernel: [ 6197.077500] [ 14258] 0 14258 299607 122594 1110016 0 0 restic
Feb 6 14:28:09 nas kernel: [ 6197.077507] [ 14323] 0 14323 2932 161 53248 0 0 sshd
Feb 6 14:28:09 nas kernel: [ 6197.077513] [ 14324] 998 14324 50860 1897 147456 0 0 php-fpm7.3
Feb 6 14:28:09 nas kernel: [ 6197.077518] [ 14349] 998 14349 50860 2010 147456 0 0 php-fpm7.3
Feb 6 14:28:09 nas kernel: [ 6197.077523] [ 14352] 0 14352 10710 71 73728 0 0 smtpd
Feb 6 14:28:09 nas kernel: [ 6197.077529] [ 14356] 0 14356 3062 333 49152 0 0 cron
Feb 6 14:28:09 nas kernel: [ 6197.077535] [ 14366] 0 14366 3062 354 49152 0 0 cron
Feb 6 14:28:09 nas kernel: [ 6197.077540] [ 14380] 0 14380 3062 325 49152 0 0 cron
Feb 6 14:28:09 nas kernel: [ 6197.077546] [ 14381] 1000 14381 592 23 40960 0 0 duck.sh
Feb 6 14:28:09 nas kernel: [ 6197.077551] [ 14384] 0 14384 2932 170 53248 0 0 sshd
Feb 6 14:28:09 nas kernel: [ 6197.077556] [ 14387] 0 14387 3062 270 49152 0 0 cron
Feb 6 14:28:09 nas kernel: [ 6197.077562] [ 14391] 0 14391 2932 80 49152 0 0 sshd
Feb 6 14:28:09 nas kernel: [ 6197.077568] [ 14395] 0 14395 2928 72 53248 0 0 sshd
Feb 6 14:28:09 nas kernel: [ 6197.077574] [ 14396] 0 14396 510 92 49152 0 0 sh
Feb 6 14:28:09 nas kernel: [ 6197.077584] [ 14399] 0 14399 1127 131 40960 0 0 pgrep
Feb 6 14:28:09 nas kernel: [ 6197.077589] [ 14400] 0 14400 2909 208 49152 0 0 cron
Feb 6 14:28:09 nas kernel: [ 6197.077594] [ 14402] 0 14402 2880 69 49152 0 0 sshd
Feb 6 14:28:09 nas kernel: [ 6197.077599] [ 14403] 0 14403 2789 153 61440 0 0 mountpoint
Feb 6 14:28:09 nas kernel: [ 6197.077604] [ 14404] 0 14404 2870 184 49152 0 0 cron
Feb 6 14:28:09 nas kernel: [ 6197.077609] [ 14405] 0 14405 2880 68 53248 0 0 sshd
Feb 6 14:28:09 nas kernel: [ 6197.077614] [ 14407] 0 14407 510 91 40960 0 0 sh
Feb 6 14:28:09 nas kernel: [ 6197.077624] [ 14410] 0 14410 1127 94 40960 0 0 pgrep
Feb 6 14:28:09 nas kernel: [ 6197.077630] [ 14413] 0 14413 2842 350 49152 0 0 cron
Feb 6 14:28:09 nas kernel: [ 6197.077636] [ 14416] 0 14416 2677 55 57344 0 0 sshd
Feb 6 14:28:09 nas kernel: [ 6197.077641] [ 14417] 0 14417 2842 213 49152 0 0 cron
Feb 6 14:28:09 nas kernel: [ 6197.077646] [ 14418] 0 14418 2842 297 49152 0 0 cron
Feb 6 14:28:09 nas kernel: [ 6197.077651] [ 14420] 0 14420 510 82 36864 0 0 sh
Feb 6 14:28:09 nas kernel: [ 6197.077662] [ 14422] 0 14422 1127 114 40960 0 0 pgrep
Feb 6 14:28:09 nas kernel: [ 6197.077667] [ 14423] 0 14423 2193 41 40960 0 0 sshd
Feb 6 14:28:09 nas kernel: [ 6197.077672] [ 14424] 0 14424 2842 239 49152 0 0 cron
Feb 6 14:28:09 nas kernel: [ 6197.077678] [ 14428] 0 14428 510 66 36864 0 0 sh
Feb 6 14:28:09 nas kernel: [ 6197.077683] [ 14429] 0 14429 2681 232 49152 0 0 cron
Feb 6 14:28:09 nas kernel: [ 6197.077693] [ 14431] 0 14431 703 62 36864 0 0 pgrep
Feb 6 14:28:09 nas kernel: [ 6197.077697] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-0.slice/session-53.scope,task=restic,pid=14258,uid=0
Feb 6 14:28:09 nas kernel: [ 6197.077855] Out of memory: Killed process 14258 (restic) total-vm:1198428kB, anon-rss:490376kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:1084kB oom_score_adj:0
Feb 6 14:28:09 nas kernel: [ 6197.224939] oom_reaper: reaped process 14258 (restic), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

A couple more things:

  • I’ve tried the latest beta build, with similar results
  • The task manager shows “kswapd0” using 100% of a CPU core when the system slows down
  • I’ve set GOGC=20 to try to reduce memory usage (I’d already done this on the older system, which had only 2GB of RAM versus the 4GB of the new one).

The kernel reported that it ran out of memory, and it killed restic at some point: Killed process 14258 (restic) total-vm:1198428kB, anon-rss:490376kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:1084kB oom_score_adj:0. At that point restic was using about half a GB of memory. Judging from the remainder of the log, you have lots of other processes running which already consume nearly all available memory.

As you’ve mentioned that restic fails while loading the repository indexes, I highly doubt that you’ve successfully run restic backup or any of the other commands before. Or at least those commands probably won’t work when you run them now. How much data and how many files does your backup contain?

That (kswapd0 eating a full CPU core) is expected when you run out of memory: the kernel tries to provide enough memory for the running programs by shuffling data between RAM and the SD card, and that’s the point at which everything slows down to a crawl.

Thanks. I’m almost 100% certain it’s backing up successfully. Judging from the logs below, the repository is about 800GB.

Just now, I mounted the backup and copied 2 files out of the latest snapshot which were modified after I started having problems, and they are fine.

I run daily incremental backups and the logs (--verbose=3) show new files successfully being added to the repository, and the task completes without errors.

Similarly, “forget --prune” works fine (although it takes longer than before).

CHECK, FORGET and PRUNE backup using restic on Tue Feb  2 16:49:27 2021

Applying Policy: keep 7 daily, 6 weekly, 18 monthly snapshots
snapshots for (host [nas], paths [/]):
keep 2 snapshots:
...

2 snapshots

removed <snapshot/bf04a20ae2>
removed <snapshot/904b92a98f>
removed <snapshot/c058a276cb>
removed <snapshot/3c6002fe87>
removed <snapshot/51d9663007>
removed <snapshot/0c2a0c7841>
removed <snapshot/8e05d5833d>
removed <snapshot/515c69a3fb>
[0:03] 100.00%  8 / 8 files deleted

counting files in repo
building new index for repo
[9:41:26] 100.00%  153917 / 153917 packs

repository contains 153917 packs (1650920 blobs) with 725.990 GiB
processed 1650920 blobs: 2849 duplicate blobs, 947.827 MiB duplicate
load all snapshots
find data that is still in use for 88 snapshots
[6:52] 100.00%  88 / 88 snapshots

found 1645444 of 1650920 data blobs still in use, removing 5476 blobs
will remove 0 invalid files
will delete 496 packs and rewrite 160 packs, this frees 3.509 GiB
[8:01] 100.00%  160 / 160 packs rewritten

To your point about other processes using up the rest of the memory - I turned off all the dockers (which accounted for all the “extras”), and the restic check ran without a hitch or slowdown.

So, it seems you’ve in some way solved the problem for me. Thanks!

When I ran restic check, it (quickly) peaked at 1.2GB of virtual memory, 500MB of physical memory, and 14MB of shared memory. Is this reasonable?

I also ran restic backup (again without dockers running), and it peaked at 1.05GB of virtual memory and 310MB of physical memory.
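(A repeatable way to capture such a peak, assuming GNU time is installed, would be something like:

/usr/bin/time -v restic check --read-data-subset=1/128 2>&1 | grep "Maximum resident set size"

which reports the maximum resident set size after the run.)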

Where do you think the problem may be? I haven’t changed the dockers in any way since upgrading the hardware - the new hardware is running the same apps / config as the old.

  • Could the HC4 version of Armbian or OMV5 be causing the problem? Memory management issue?
  • Could docker or one of the dockers for arm64 be causing the problem? Chewing up more memory than before?
  • Or the arm64 version of restic? Is its memory usage reasonable?
  • Or something else?

Is there a way to encourage restic to be more parsimonious with its memory use? I’ve already set export GOGC=20.
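For context, GOGC just needs to be exported in the environment the restic process runs in; a minimal sketch (the repository path is illustrative) looks like:

export GOGC=20                                     # run the Go garbage collector more aggressively
export RESTIC_REPOSITORY=/srv/backup/restic-repo   # illustrative repository path
restic check --read-data-subset=1/128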

Thanks!

Gabe

Is this a clue? Maybe I don’t have ANY swap?

free -h
              total        used        free      shared  buff/cache   available
Mem:          3.7Gi       1.4Gi       1.2Gi        29Mi       1.1Gi       2.2Gi
Swap:            0B          0B          0B

UPDATE: Adding a swapfile to my HDD solved the problem!

I followed these instructions (How To Add Swap Space on Debian 10 | DigitalOcean) and it worked a charm.
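In case it helps anyone else, the steps boil down to something like this (path and size are whatever suits your setup; I put mine on the HDD):

sudo fallocate -l 1G /srv/dev-disk-by-label-data/swapfile
sudo chmod 600 /srv/dev-disk-by-label-data/swapfile
sudo mkswap /srv/dev-disk-by-label-data/swapfile
sudo swapon /srv/dev-disk-by-label-data/swapfile
# make it permanent across reboots
echo '/srv/dev-disk-by-label-data/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab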

64-bit systems tend to use more memory, as pointers take 64 bits instead of 32 bits, but I wouldn’t expect an increase in memory usage of more than 10-20%. However, I have no numbers whatsoever to back up that gut feeling. I’m not aware of any 64-bit-specific memory usage issues in restic.

Besides the GOGC environment variable there’s probably not too much which can be done to reduce the memory usage.