I have had an idea about how to improve restic, specifically how to reduce the churn of unused files when repacking after a prune. (See also pull request #1994 - Prune repack threshold).
My thinking is that repacking packs to remove unused blobs is similar to how a garbage collector works in a language like Java, and it suffers from a similar problem: a lot of data dies young and has to be removed, which creates a lot of churn.
The fix in virtual machine garbage collectors is to treat young and old objects separately. Restic could do the same, both for the files that it backs up and for those that it moves into new packs when repacking as part of a prune and forget.
My suggestion is that when doing a backup, restic should sort the files by creation time and put the oldest files in separate packs from the youngest. (If that is too expensive, it could just separate them into 4 or so bins by age.) A file that is more than a year old will probably be around for a long time, while one that is only a few hours old could well get deleted soon, so it should share a pack with other young files.
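To make the binning idea concrete, here is a rough sketch in Go of how files might be split into four age bins before being packed. None of this is restic's actual code: `fileInfo`, `binLimits`, `ageBin` and `groupByAge` are names made up for illustration, and the bin boundaries (one day, 30 days, one year) are just guesses at sensible values.

```go
package main

import (
	"fmt"
	"time"
)

const day = 24 * time.Hour

// binLimits holds hypothetical upper age bounds, youngest bin first.
// Anything at or above the last limit lands in the final (oldest) bin,
// giving four bins in total.
var binLimits = []time.Duration{day, 30 * day, 365 * day}

// ageBin returns the bin index for a file of the given age:
// 0 is the youngest bin, len(binLimits) is the oldest.
func ageBin(age time.Duration) int {
	for i, limit := range binLimits {
		if age < limit {
			return i
		}
	}
	return len(binLimits)
}

// fileInfo is a made-up stand-in for whatever metadata the backup has.
type fileInfo struct {
	Path    string
	Created time.Time
}

// groupByAge partitions files into bins so that each bin could be
// written to its own set of packs.
func groupByAge(files []fileInfo, now time.Time) [][]fileInfo {
	bins := make([][]fileInfo, len(binLimits)+1)
	for _, f := range files {
		b := ageBin(now.Sub(f.Created))
		bins[b] = append(bins[b], f)
	}
	return bins
}

func main() {
	now := time.Now()
	files := []fileInfo{
		{"notes.txt", now.Add(-3 * time.Hour)},
		{"report.pdf", now.Add(-45 * day)},
		{"archive.tar", now.Add(-800 * day)},
	}
	for i, bin := range groupByAge(files, now) {
		fmt.Printf("bin %d: %d file(s)\n", i, len(bin))
	}
}
```

Binning like this avoids a full sort, since each file only needs one comparison per bin boundary. The same function could be reused during a repack to decide which new pack a surviving blob goes into.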
If we had this feature, a prune and forget would do much less I/O, because most packs would contain only old files that don’t need repacking. The threshold feature would also work better, because a low threshold (e.g. 20%) would cause most young packs to be rebuilt while not affecting old packs that had only one or two files removed.
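Here is a small Go sketch of how I imagine the threshold interacting with age-separated packs. I am assuming the threshold means "repack when at least this fraction of the pack's bytes is unused", which may not be exactly how the pull request defines it. `packStats` and `shouldRepack` are made-up names, not part of restic.

```go
package main

import "fmt"

// packStats is hypothetical bookkeeping for one pack file.
type packStats struct {
	ID          string
	UsedBytes   uint64
	UnusedBytes uint64
}

// unusedFraction returns the share of the pack's bytes that belong to
// blobs no snapshot references any more. An empty pack counts as 0.
func (p packStats) unusedFraction() float64 {
	total := p.UsedBytes + p.UnusedBytes
	if total == 0 {
		return 0
	}
	return float64(p.UnusedBytes) / float64(total)
}

// shouldRepack reports whether the unused share reaches the threshold
// (assumed semantics: a threshold of 0.2 means 20% unused).
func shouldRepack(p packStats, threshold float64) bool {
	return p.unusedFraction() >= threshold
}

func main() {
	packs := []packStats{
		{"old-pack", 980, 20},    // one or two small files removed
		{"young-pack", 400, 600}, // most of its contents already deleted
	}
	for _, p := range packs {
		fmt.Printf("%s: %.0f%% unused, repack=%v\n",
			p.ID, 100*p.unusedFraction(), shouldRepack(p, 0.2))
	}
}
```

With the made-up numbers above, the old pack sits at 2% unused and is left alone, while the young pack at 60% unused gets rebuilt.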