Performance of prune

Hi,

I have a repository of about 11TB on which I recently ran forget, leaving about 500 snapshots. Obviously I’d like to run prune, which, as I noticed in the release notes, has received numerous performance improvements. That’s why I upgraded to 0.13.1 before running it.

restic 0.13.1 compiled with go1.18 on windows/amd64

Now I noticed that ‘collecting packs’ takes a really long time, although, if I understood correctly, it should be a relatively fast operation.

Here’s the output of prune (ran with --max-repack-size=450G because of low space):

repository ccd6eb2e opened successfully, password is correct
loading indexes...
loading all snapshots...
finding data that is still in use for 488 snapshots
[6:00] 100.00%  488 / 488 snapshots...
searching used packs...
collecting packs for deletion and repacking

[5:17:22] 100.00%  2200426 / 2200426 packs processed...

to repack:        835992 blobs / 450.000 GiB
this removes:     613657 blobs / 399.568 GiB
to delete:       3379567 blobs / 4.195 TiB
total prune:     3993224 blobs / 4.585 TiB
remaining:       7642197 blobs / 6.020 TiB
unused size after prune: 1.481 TiB (24.60% of remaining size)

deleting unreferenced packs
[42:24] 100.00%  85381 / 85381 files deleted...
repacking packs
[4:17:00] 100.00%  91460 / 91460 packs repacked...
rebuilding index
[0:59] 100.00%  1312485 / 1312485 packs processed...
deleting obsolete index files

As you can see, collecting the packs took longer than actually processing 450 GiB of packs.

The computer has mostly been sitting idle at 0% CPU and was using about 1 GB of RAM when I checked. Since this is a VM, I can’t see the disk usage. The repository is on a local NAS drive.

Any ideas what I could try to improve this? Can someone give me a pointer on how to write a pprof profile?

Thanks a lot :slight_smile:

PS: I just found global_debug.go, so it seems I need a debug build to be able to generate profiles.
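
For reference, global_debug.go appears to be guarded by the debug build tag, so the debug build would be something like go build -tags debug ./cmd/restic. Under the hood, Go CPU profiles come from the standard runtime/pprof package; here’s a minimal stand-alone sketch (file name and workload are placeholders, this is not restic’s actual code):

package main

import (
	"log"
	"os"
	"runtime/pprof"
)

func main() {
	// The resulting file can be inspected with: go tool pprof cpu.pprof
	f, err := os.Create("cpu.pprof")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// Sample CPU usage until StopCPUProfile flushes the profile.
	if err := pprof.StartCPUProfile(f); err != nil {
		log.Fatal(err)
	}
	defer pprof.StopCPUProfile()

	// ... run the slow workload here ...
}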

What backend are you using? Local? If so, try adding -o local.connections=8 or something like that. Please note that this is currently hard-limited to 8 in the code, so if you want more (and the machine is powerful enough) you’ll additionally need to increase numRepackWorkers in internal/repository/repack.go.
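
For example (the repository path is a placeholder):

restic -r /path/to/repo prune --max-repack-size=450G -o local.connections=8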

In any case, I think the root cause of such slow performance is the insanely small 4MB pack size, which leaves you with millions of 4MB files. Unfortunately there is currently no way to change it without building restic manually with the following pull request applied:

Another downside is that even with this pull request you’ll need to prune the whole repo again to merge the old 4MB packs into larger files. The good news is that afterwards the repo is still compatible with an unpatched restic binary, so there’s no need to ‘split’ packs again to revert.
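
Some back-of-the-envelope arithmetic to illustrate (assuming ~4 MiB per pack, which roughly matches the 2.2 million packs in your output): 11 TiB / 4 MiB ≈ 2.9 million pack files, whereas 11 TiB / 128 MiB ≈ 90,000 pack files — about 30x fewer directory entries to list and manage.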

Anyway, I suggest either postponing this conversion a bit, or using a build from master plus that pull request and upgrading your repo to the recently merged compression format. Compression also requires pruning everything, so you can save one prune cycle.
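
If I remember the merged code correctly, the conversion on master would look roughly like this (unreleased at the time of writing, so treat the command and flag names as tentative and double-check with restic help migrate and restic help prune first):

restic -r /path/to/repo migrate upgrade_repo_v2
restic -r /path/to/repo prune --repack-uncompressed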

PS: changing the pack size is officially unsupported, but I’ve been using it with restic for at least a year and am more than happy with a 128-256MB pack size.


The repository is accessed via SMB from a Windows PC. The file server is a QNAP NAS (which is also practically idle).

I’ll try the recommended -o local.connections=8 and report back as soon as the current prune run is finished :slight_smile:

I’m following that PR as well and hope it gets merged soon. I might try the compression + pack-size prune if the previous option doesn’t help enough.

Thanks for the help. :+1:

The ‘collecting packs for deletion and repacking’ step basically only consists of listing the pack files (this is equivalent to doing an ls on each of the 256 subdirs under your repo path’s data/ directory). Of course one of your problems is that you have so many pack files (due to the low size of each pack file), but listing shouldn’t take that long. In fact, you are listing your files at a rate of only ~115 files per second; conceptually, the step reduces to the sketch below.
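
Just to make that concrete, here’s a rough stand-alone sketch (not restic’s actual code; the repository path is a placeholder) of what the listing amounts to:

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func main() {
	// Placeholder: point this at your mounted repository.
	repo := `\\nas\backup\restic-repo`
	total := 0
	for i := 0; i < 256; i++ {
		// Pack files live in data/00 ... data/ff, grouped by hex prefix.
		sub := filepath.Join(repo, "data", fmt.Sprintf("%02x", i))
		entries, err := os.ReadDir(sub) // one directory listing per subdir
		if err != nil {
			continue // a subdir may be missing in small repos
		}
		total += len(entries)
	}
	fmt.Printf("%d pack files listed\n", total)
}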

That slow listing rate might be part of the problem. You can check this yourself by manually doing an ls on the mounted dir under /data/00; that time multiplied by 256 should give you roughly the total time to list all pack files. (But note that your first run already removed quite a few pack files, so listing and pruning should now already be faster.)
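
As a worked example with the numbers from your output: 2,200,426 packs spread over 256 subdirs is about 8,600 files per subdir. At the observed ~115 files/s, one ls takes roughly 75 seconds, and 256 × 75 s ≈ 5.3 hours — which matches the 5:17:22 you saw. So if a single ls of /data/00 finishes in a few seconds, SMB listing isn’t your bottleneck.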

If this is the cause of your long prune times, then changing connections won’t help. Instead you should find out why listing files over SMB is that slow…


I researched the directory listing performance a little. It seems to be a problem with Samba, as the listing is at least 10x as fast when I do it locally via SSH. Unfortunately I wasn’t able to fix the issue. For the moment I don’t need a prune, and I hope the min-packsize improvement gets released before I need one again.

For anyone interested, the only possible fix I found was installing Qsirch on the QNAP, which supposedly improves directory listings using Windows’ search index. The installation doesn’t seem to have changed anything, though…

Anyways, thanks for your help.