Confusion about prune --max-repack-size=0

I am confused about prune --max-repack-size=0 (restic v0.16.4)… (Apologies if similar questions have been asked, I was unable to find any!)

The documentation section Recovering from ‘no free space’ errors says:

“In most cases it is sufficient to instruct prune to use as little scratch space as possible by running it as prune --max-repack-size=0. […] Obviously, this can only work if several snapshots have been removed using forget before. This then allows the prune command to actually remove data from the repository. […]”

Eh? NO, it is NOT “obvious”, if I am understanding the intention correctly. I currently suspect (for reasons explained below) the above means (or perhaps would be better phrased as?):

“In most cases it is sufficient to instruct prune to use as little scratch space as possible by running it as prune --max-repack-size=0. […] This can only work if there is sufficient scratch space available. You may need to remove one or more snapshots, using forget, to create adequate scratch space. This then allows the prune command to actually remove data from the repository. […]”

(The redacted parts (“[…]”) may also need editing if my guess is broadly-correct. Also, whilst not said, I assume this “scratch space” must be within the REPO, and hence, within the containing partition/filesystem/cloud/whatever…)

The situation I ran into is that the REPOv1 in question has largely filled up the available space: the REPOv1 is about 1.4TB on a 2TB device/filesystem/partition, leaving about 0.5TB free. Using the current “restic 0.16.4 compiled with go1.21.6 on linux/amd64”, after migrating in-place to REPOv2 (migrate upgrade_repo_v2), the subsequent prune --compression=max --repack-uncompressed ultimately drove the filesystem/partition containing REPOv2 out of space.

Thank You, restic did handle the situation gracefully, and I was able to recover an intact and still-functional REPOv2 (largely by, if my memory is correct, a simple prune --compression=max on that “overfull” newly-created REPOv2).

But… There was clearly loads of “scratch space” available, albeit insufficient for the presumed-“duplicated” compression max version. At the present time I do not find the documentation clear on this point, but the impression that I currently have is --max-repack-size=0 “encourages” restic v0.16.4 to compress and then “commit” (index?) each snapshot individually, rather than “queuing them up” for “committal” at about the end of the prune --repack-uncompressed. (I am aware I am very probably not using restic terminology here! (If it helps, think git(1) terminology.))

That is, my current guess is that without --max-repack-size=0, prune --repack-uncompressed writes the (fully-)compressed packs, etc., into REPOv2, but does not remove the (now-obsolete) REPOv1 packs (and indexes, etc.) until the Very End. This is Very Understandable, but I am currently finding the documentation lacking, or at least confusing. (In addition, I have not yet run any further tests/experiments to try to confirm my guesses… in part due to my insistence on always having one proven-correct (and up-to-date) backup at all times.)

Perhaps the current documentation needs revisiting…?
Clarifications and corrections are most welcome!

Edit: Improved typography (as per @rawtaz's useful, correct and helpful suggestion).

It would be great if you didn’t write in CAPS so much. It comes across as shouting, and it also makes a reader pause on the CAPITALIZED word, causing a staggered reading flow (or rather lack of flow). Using emphasis is likely a better approach (the fact that you want to point something out with the emphasized word comes across clearly) and easier for the readers of your message to digest :slight_smile:

Yeah, actually I know that… Sorry!
Simplified explanation: Long posts are usually composed offline in a proper text editor, and I was uncertain what markup language this Forum uses, or whether it was possible to upload pre-marked-up text. So I resorted to an ancient markup-free ASCII convention from c. 40 years ago.

Prune works differently than you assume. The basic cycle is not influenced by --max-repack-size. Prune first scans all snapshots for data that must still be kept, then plans which pack files can be removed outright and which must be rewritten first because they still contain used data. --max-repack-size strictly limits the total size of the pack files that will be rewritten. Restic then processes the pack files to be rewritten and moves the data that should be kept into new pack files. Note that nothing has been deleted so far. Then the repository index is updated, and only afterwards are the selected pack files deleted. Pack files cannot be removed before the index has been updated, as doing so could corrupt the repository.

That is, when rewriting a large repository, prune may require a very large amount of scratch space. --max-repack-size=0 actually prevents restic from rewriting, and therefore from compressing, any pack files at all; it will just remove pack files that are completely unused.
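
To make the cycle described above concrete, here is a rough Python sketch of it. It is only a toy model, not restic's actual implementation: the Pack class, the simulate_prune function, and the greedy way packs are chosen under the limit are all assumptions made for illustration, and it ignores compression and --repack-uncompressed entirely. What it shows is that the rewritten data has to exist alongside the old pack files until the index update, and that a limit of 0 leaves only the fully unused packs to delete.

```python
from dataclasses import dataclass


@dataclass
class Pack:
    """Hypothetical stand-in for a pack file (not a restic type)."""
    used: int    # bytes still referenced by at least one snapshot
    unused: int  # bytes no longer referenced by any snapshot

    @property
    def size(self) -> int:
        return self.used + self.unused


def simulate_prune(packs, max_repack_size=None):
    """Return (peak_extra_bytes, net_bytes_freed) for one simplified prune run."""
    # Plan: fully unused packs can simply be deleted,
    # partly used packs must be rewritten before they can go.
    delete = [p for p in packs if p.used == 0]
    candidates = [p for p in packs if p.used > 0 and p.unused > 0]

    # Apply the repack limit (None means no limit). Assumption: greedy selection.
    repack = []
    budget = max_repack_size
    for p in candidates:
        if budget is not None and p.size > budget:
            continue  # over the limit: this pack is left untouched
        repack.append(p)
        if budget is not None:
            budget -= p.size

    # Rewrite: kept data is copied into new packs while the old packs still exist,
    # so this much extra space is needed before anything is deleted.
    peak_extra = sum(p.used for p in repack)

    # Only after the index update are the old packs removed.
    net_freed = sum(p.size for p in delete) + sum(p.unused for p in repack)
    return peak_extra, net_freed


packs = [Pack(used=0, unused=100), Pack(used=60, unused=40), Pack(used=100, unused=0)]
print(simulate_prune(packs))                     # (60, 140)
print(simulate_prune(packs, max_repack_size=0))  # (0, 100)
```

With no limit the model needs 60 bytes of scratch space to free 140; with a limit of 0 it needs none, but only the fully unused pack is removed.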

Ah… How then would one convert, in-place, a 1.5TB 100+ snapshot REPOv2 which is (mostly) uncompressed into a (max-)compressed repo?

I do have 4TB of storage available, which should allow max-compression using restic copy. That is my (untested) fallback plan, but is there an in-place mechanism?

Just use --max-repack-size 200G or similar to repack the repository in multiple chunks. This will require running prune a few times until everything is processed.
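
As a back-of-the-envelope illustration of the chunked approach, the Python sketch below estimates how many prune runs a given chunk size implies and roughly how much scratch space each run might need. The helper names prune_command and plan_passes are made up for this example, and the compression ratio of 0.5 is purely an assumed placeholder, not a measured value; real numbers depend on the data. The command is only built as a list here, never executed, and repository and password options are left out.

```python
import math


def prune_command(max_repack_size):
    """Build (but do not run) a prune invocation using the flags quoted in this thread."""
    return [
        "restic", "prune",
        "--compression=max",
        "--repack-uncompressed",
        "--max-repack-size", max_repack_size,
    ]


def plan_passes(uncompressed_gb, chunk_gb, ratio=0.5):
    """Estimate (number_of_prune_runs, peak_scratch_gb_per_run).

    ratio is an ASSUMED compressed/uncompressed size ratio, for illustration only.
    """
    passes = math.ceil(uncompressed_gb / chunk_gb)
    # Per run, at most one chunk is rewritten; its compressed copy must fit
    # alongside the old packs until the index is updated and they are deleted.
    peak_scratch_gb = min(chunk_gb, uncompressed_gb) * ratio
    return passes, peak_scratch_gb


print(" ".join(prune_command("200G")))
print(plan_passes(1500, 200))  # (8, 100.0) under the assumed ratio of 0.5
```

Under these assumptions, a 1.5TB uncompressed repository repacked 200G at a time would take about 8 prune runs, each needing far less free space than a single unlimited run.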
