Pack missing from index error trying to delete data

I have a massively ballooned 4.7 TB restic repository that has not been pruned in 16 months. After forget and prune it should be around 1.4 TB. I ran a forget --prune with a specific policy. It took 80 minutes to load indexes, snapshots, and packs, and around 12 hours to repack everything. Then it started deleting the unneeded data, and it was on track to take 100 hours to delete everything.

Then it hit some errors at around 8 hours of runtime and stopped. If I try to prune again, I get the error “Fatal: packs from index missing in repo”. I found the repair command and ran it, but it looked like it was going to repack everything again and was running extremely slowly, with an estimate of 100 hours to finish.

I just want to delete the unneeded files, but prune won’t do it because it hits the missing packs error. Do I really need to run another extremely long job to repack all the deleted files, only to turn around and forget them again? Or am I missing something? I can’t lock the repository for 100 hours because we need to back up every 24 hours, so my plan was to run forget --prune and cancel it, running about 10 hours a day and chipping away at this all week long. But I guess that won’t work?

Do you still have the exact error message that occurred? Which restic version and which backend are you using? What is the exact command line you used to run forget --prune? Please provide the full log printed by prune (the snapshot list from forget --prune isn’t relevant).

Did you already run restic repair index?

As long as prune complains about an error, you have to consider the repository to be damaged. There’s a significant chance that you won’t even be able to restore new snapshots!

Until the old repository is repaired, the best way forward is to take it offline and back up to a new repository in the meantime, which gives you time to repair the old one. Create the new repo using restic --repo path/to/new init --copy-chunker-params --from-repo path/to/old; that will allow you to merge the repositories later using restic copy.
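As a rough sketch of that workflow, assuming a POSIX shell (for example Git Bash or WSL on Windows): the paths below are placeholders, BACKUP_SOURCE is a hypothetical name for whatever you back up, and the RESTIC variable is only a dry-run convenience I added so the commands are printed for review unless you set RESTIC=restic.

```shell
#!/bin/sh
# Placeholder paths: replace with your real repository locations.
OLD_REPO="path/to/old"
NEW_REPO="path/to/new"
BACKUP_SOURCE="path/to/data"   # hypothetical: the data you back up daily

# Dry-run by default: prints each command. Run with RESTIC=restic to execute.
# (Left unquoted on purpose so "echo restic" splits into two words in sh/bash.)
RESTIC="${RESTIC:-echo restic}"

# 1. Create the new repository with the same chunker parameters as the old one,
#    so that data copied over later can be deduplicated against it.
$RESTIC --repo "$NEW_REPO" init --copy-chunker-params --from-repo "$OLD_REPO"

# 2. Send the daily backups to the new repository while the old one is repaired.
$RESTIC --repo "$NEW_REPO" backup "$BACKUP_SOURCE"

# 3. Later, once the old repository is healthy again, copy its snapshots over.
$RESTIC --repo "$NEW_REPO" copy --from-repo "$OLD_REPO"
```

The init and copy steps will ask for the passwords of both repositories.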

You first have to repair the repository before you’ll be able to use prune. Running prune in daily chunks may or may not work: only the final deletion phase (and the pre-cleanup) work incrementally. All other phases likely have to start from the beginning each time.

Oh, I almost forgot the most important part. Before trying anything to repair the repository, please run restic check (and keep the full output). That will tell us in what way the repository is damaged.
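One way to keep the full output, assuming a POSIX shell with tee available: the repository path and the log file name below are placeholders, and the RESTIC variable is a dry-run convenience that only prints the command unless you set RESTIC=restic.

```shell
#!/bin/sh
REPO="path/to/old"        # placeholder: the damaged repository
LOG="check-output.log"    # hypothetical log file name

# Dry-run by default: prints the command. Run with RESTIC=restic to execute.
RESTIC="${RESTIC:-echo restic}"

# Merge stderr into stdout so warnings and errors land in the log as well,
# then show the output on screen and save a copy with tee.
$RESTIC -r "$REPO" check 2>&1 | tee "$LOG"
```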

I ran restic forget -r z:/ --keep-daily 14 --keep-monthly 18 --prune, which ran for about 19 hours total before it stopped. I didn’t save the original error, but I don’t think it was a restic error - I think it was a network or connection timeout. I wish I’d saved it.

This is the result of prune:

restic -r z:/ prune
enter password for repository:
repository 1a1c0dc6 opened (version 1)
loading indexes…
[0:02] 100.00% 107 / 107 index files loaded
loading all snapshots…
finding data that is still in use for 44 snapshots
[1:21] 100.00% 44 / 44 snapshots
searching used packs…
collecting packs for deletion and repacking
[7:02] 100.00% 328514 / 328515 packs processed
The index references 1 needed pack files which are missing from the repository:
Fatal: packs from index missing in repo

Here’s what restic check gives:

742021 additional files were found in the repo, which likely contain duplicate data.
This is non-critical, you can run restic prune to correct this.
check snapshots, trees and blobs
[5:19] 100.00% 45 / 45 snapshots
Fatal: repository contains errors

Hmm, the repository format is designed to not be damaged on network problems. So, I’d really have liked to know what happened. But what’s even stranger is that according to prune only a single pack file is missing. And according to your description prune was already in the final pack file removal phase. It shouldn’t matter at all when prune is interrupted at that point.

The actual error message is missing from the check log (it should warn about missing pack files). But you should be able to follow the repair steps from the Troubleshooting page of the restic 0.16.3 documentation, starting from the “repair index” step.
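For orientation, here is the sequence as I understand it from that troubleshooting page; please check the page itself before running anything, since this is a sketch and not a copy of it. It assumes a POSIX shell, and the RESTIC variable is a dry-run convenience that prints the commands unless you set RESTIC=restic.

```shell
#!/bin/sh
REPO="z:/"   # repository path from the prune command posted in this thread

# Dry-run by default: prints each command. Run with RESTIC=restic to execute.
RESTIC="${RESTIC:-echo restic}"

# 1. Rebuild the index from the pack files that actually exist in the repository.
$RESTIC -r "$REPO" repair index

# 2. Optionally run the regular backups again at this point, so that blobs
#    still present in the source data get re-added to the repository.

# 3. Rewrite snapshots that still reference missing data. With --forget the
#    original (damaged) snapshots are removed after being rewritten.
$RESTIC -r "$REPO" repair snapshots --forget

# 4. Verify that the repository is consistent again.
$RESTIC -r "$REPO" check
```

Only once check passes without errors is it worth retrying prune.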

Which restic version do you use?

I’m using restic 0.16.3. I’m running restic repair index and it is currently at this point:

reading pack files
[3:33:18] 3.18% 23604 / 742021 packs

It appears to me that it is reading all 4.7 TB of data to try to figure out what is missing? I can’t really afford to spend the amount of time it takes to do this. Every single restic command runs stunningly slowly.

  • Where is the repo located, e.g. is it on the local LAN, on an S3 instance, on an SFTP server, etc.?
  • What sort of connection speed do you have between your client and the repo?
  • What restic backend are you using?

That’s why I initially suggested temporarily using a new repository (see the init command above). Until the index is repaired, there is NO guarantee that newly created snapshots can be restored.

Depending on the backend, you can speed up the index repair significantly by increasing the number of backend connections using -o <backend>.connections=20 (or even higher).
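For example, assuming z:/ is a mapped network drive that restic accesses through its local backend (that is a guess on my part, since the backend has not been confirmed in this thread), the option would look like the sketch below. The RESTIC variable is a dry-run convenience that prints the command unless you set RESTIC=restic.

```shell
#!/bin/sh
REPO="z:/"          # repository path from this thread
BACKEND="local"     # assumption: use s3, sftp, etc. if that is your backend
CONNECTIONS=20      # starting point; raise it if the storage keeps up

# Dry-run by default: prints the command. Run with RESTIC=restic to execute.
RESTIC="${RESTIC:-echo restic}"

# -o sets an extended option; <backend>.connections controls how many
# backend operations restic runs in parallel.
$RESTIC -r "$REPO" -o "${BACKEND}.connections=${CONNECTIONS}" repair index
```

Whether this helps depends on where the bottleneck is: if the network link or the remote disks are already saturated, more connections will not speed things up.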