FAIL: restic stats --mode raw-data

So I ran (fish shell):

for i in (restic snapshots | awk '{print $1}' | sed '/^\//d' | sed '/^$/d' | sed '/^ID/d' | sed '/^-/d'); restic cat snapshot $i; end

Which produced no errors at all. Is there another way?
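A tidier equivalent of that loop, assuming jq is installed, would be to iterate over restic's JSON output instead of scraping the table:

# iterate full snapshot IDs from restic's JSON output (fish syntax)
for id in (restic snapshots --json | jq -r '.[].id'); restic cat snapshot $id; end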

@foolsgold Please allow people time to answer; we all have work, family, etc. to manage besides trying to help our fellow restic friends :slight_smile:

You don’t need to file an issue about this. In part because it’s most likely due to a hardware problem/malfunction/bitrot/whatever, and in part because it’s nothing new - we see these errors now and then, and do what we can to find the cause. Again, it’s likely a hardware issue or other type of cause outside of restic.

Please stand by for further responses and suggestions!

Obviously, if it’s an error that persists and can’t be fixed, then it’s not just outside restic. IMO, there should be an easy way to at least detect which snapshots are having an issue, so those can be forgotten and pruned. It seems check just told me there were errors, but nothing useful besides that.

I’m not sure what you’re trying to say. If you have a perfectly good repository on a hard drive, and I punch a hole in some bits on that hard drive, so your repository becomes corrupt, is that not a cause outside of restic? If not, please explain your rationale, because I’m not getting it.

There are ways. But having a panic attack over it isn’t going to help. Wait for people to get you some pointers on what to do.

Does check complain about pack <...> not referenced in any index or pack <...> does not exist, before all those blob not found in index errors?

Which backend do you use to store your backup repository? Did you run check some time before? Did anything unusual happen with your restic runs lately? Do you run prune, and if so, how often?

<edit>Just noticed your GitHub issue: Do you use AWS S3 or just an S3-compatible service? But I’d guess S3 proper has enough provisions to avoid data corruption.</edit>

You can probably find the damaged snapshots by running restic rebuild-index to repair the damaged index and then running restic stats --mode blobs-per-file, which should complain about the snapshots with missing data (the mode is important here). You can also try out whether restic backup --force ... recovers some of the missing blobs. But if it’s an option for you to wait a bit before repairing the repository, then I’d like to try to debug this a bit further.
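Roughly, that sequence would look like this (the backup path is a placeholder):

restic rebuild-index                  # repair the damaged index
restic stats --mode blobs-per-file    # should complain about snapshots with missing data
restic backup --force /path/to/data   # may recover missing blobs from the source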

I agree that the means to repair a damaged repository are a bit lacking, but that’s primarily because restic goes to great lengths to avoid such errors occurring in the first place. It is so far unclear whether the damaged repositories are the result of hardware/operating system errors or of a tiny bug in restic. It’s rather hard to analyze, as these damaged repositories only show up every now and then, and without any clear leads yet.


Thanks for responding.

There were only two types of errors:

  • tree 06dbf6c9: file “com.plexapp.plugins.library.db-2019-12-01” blob 88 size could not be found
  • tree 06dbf6c9, blob 33334fb0: not found in index

AWS (Wasabi)

I’ve not run check before. I only ran it because the stats command (to see how large my repo was) failed.

Some of the backups had failed because another one was already running. I deleted the cron job output, so I can’t check.

Once per week. You can see details of what I do in this post: Auto-snapshots / smart backup rentention - #8 by foolsgold

See the output from this command in this gist. I don’t want to flood the thread. It fails with a SIGSEGV and a stack trace.

FreeBSD luffy 12.1-RELEASE-p1 FreeBSD 12.1-RELEASE-p1 GENERIC amd64
restic 0.9.6 compiled with go1.13.7 on freebsd/amd64

Thanks for your help!

So the state of your backup repository is best described as mostly fine, except for the pack file that is missing from both the repository and the index. You could give restic rebuild-index a try to see whether some blobs miraculously show up again, but judging from the check output I doubt it.

Do you have the log output of the last prune run? Does it mention any invalid files?

Please run restic backup --force ... once for all backup sets, to ensure that these actually contain all blobs for all referenced files. That will at least ensure that future backups are not affected by the missing blobs (and if the lost data chunks still exist in the backup source, it will actually repair the repository).
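For example, in fish syntax with placeholder paths for the backup sets:

# run a forced backup for each backup set; replace with your real paths
for p in /home /var/db/plex; restic backup --force $p; end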

The crash looks interesting. Apparently restic found the missing blob and crashed while assembling the error message xD. The crash is fixed by https://github.com/restic/restic/pull/2668 .

For what it’s worth, I also use Wasabi. Every now and then I have encountered similar errors. An index rebuild usually fixed things; worst case, a rebuild followed by a forced backup. Last week I got into a situation with the “blobs larger than available” error and I wasn’t able to recover. Most, if not all, big problems with restic and Wasabi have been the result of a prune.

I have changed my configuration so that I now rclone sync my NAS restic backup (which never has issues) to Wasabi, and so far that has worked without issue.
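Roughly like this (the repository path and remote/bucket names are illustrative):

# one-way copy of the local restic repository to the Wasabi remote
rclone sync /volume1/restic-repo wasabi:my-bucket/restic-repo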

But a full monthly backup would be independent of issues, or can those persist? I’ve actually already disabled prune and forget for my backups based on what I read here… Mostly I don’t delete much stuff anyway.

@foolsgold I’m not sure what you mean by “full […] backup” in reference to restic. Just a normal backup ... run, or do you pass some special flags? By default restic uses the latest snapshot with the same backup paths as a starting point (the parent), to avoid rescanning every single file.
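For illustration (the path is a placeholder):

restic backup /home           # parent auto-selected; unchanged files are skipped
restic backup --force /home   # reread every file, ignoring the parent snapshot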

In case the index of your repository refers to missing blobs or some snapshots are missing some data blobs, then future backups might be affected unless you’ve repaired the damage with rebuild-index and backup --force ... (for every backup set!).

Do you remember whether the prune command printed something unusual such as incomplete pack file?

I don’t remember seeing that. I think it was complaining that another process already had a lock on the repository (the backup didn’t finish before the cron job for the prune/forget kicked in).

Hmm, when prune complains about an already existing lock, it won’t modify the repository. So maybe the repository corruption already happened earlier? Do you still have the log output of the last few prune runs and if yes, could you check whether you notice any unusual log output in them?

I don’t :frowning: I deleted them

I run a backup / check daily, forget / prune once a week. On Mar 14 backup / check reported no errors. My Mar 15 backup / check report is missing, so that may have been the beginning of the trouble.

My Mar 15 forget / prune log produced:

7 snapshots have been removed, running prune
counting files in repo
building new index for repo
[2:00:11] 100.00%  637887 / 637887 packs

incomplete pack file (will be removed): abc0577ecc347f1780fa3371b27799498d0c17e71eb77e9a037194c8d7d6bdaa
incomplete pack file (will be removed): abe612705c6d51b4540100adf77b0673dad355abc0147a3f0006f854c1b2d5a5
 ....
repository contains 637821 packs (3063321 blobs) with 3.036 TiB
processed 3063321 blobs: 0 duplicate blobs, 0 B duplicate
load all snapshots
find data that is still in use for 41 snapshots
[0:53] 100.00%  41 / 41 snapshots

found 2956149 of 3063321 data blobs still in use, removing 107172 blobs
will remove 66 invalid files
will delete 7685 packs and rewrite 3648 packs, this frees 44.556 GiB
[1:52:03] 100.00%  3648 / 3648 packs rewritten
 ....

On Mar 16 a backup / check produced:

Files:          63 new,   119 changed, 176872 unmodified
Dirs:            0 new,     3 changed,     0 unmodified
Added to the repo: 13.661 GiB

processed 177054 files, 2.621 TiB in 27:39
snapshot ff4983a1 saved
using temporary cache in /tmp/restic-check-cache-723633347
created new cache in /tmp/restic-check-cache-723633347
create exclusive lock for repository
load indexes
check all packs
pack 07f9e875: not referenced in any index
pack 04430c7e: not referenced in any index
pack 047a0f0c: not referenced in any index
....
59 additional files were found in the repo, which likely contain duplicate data.
You can run `restic prune` to correct this.
check snapshots, trees and blobs
error for tree caf20d03:
  tree caf20d03: file "The Fugitive.m4v" blob 140 size could not be found
  tree caf20d03, blob 52d680a7: not found in index
....
Fatal: repository contains errors

I followed this with a manual index rebuild, then a check which failed, then a forced backup, then a check which failed. I’m not sure at what point I got the message about the “blobs larger than available” but it was in these manual steps.

My conclusion is that Wasabi isn’t suitable for direct backups from restic, but it works reliably with rclone, so I now rclone sync my local backup to Wasabi instead.

This is why I didn’t respond to rawtaz’s comment about causes outside restic. A backup program should be able to recover from and handle such issues as well as any other program out there.

Did you ever try doing a restore to see if it truly was a problem? I’d rather provide what help I can to restic to get this fixed than migrate to rclone. But that’s a good alternative solution. Thanks for mentioning it.

I wish check actually told me which snapshots had a problem, so I could just delete those and know whether the whole backup was shot or it’s just some bad snapshots.


Look, we’re not having a fight here. But please answer the following:

I personally wouldn’t use Wasabi either, it seems rather unstable to me. But I could be wrong.

Question though: have you tried using your Wasabi backend from restic via the rclone backend in restic? That is, restic using rclone using Wasabi.
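Something along these lines, assuming an already-configured rclone remote (remote and bucket names are illustrative):

restic -r rclone:wasabi:my-bucket/restic-repo check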

Have you personally encountered any issues with Wasabi? I’ve been using Wasabi for multiple repositories for about a year and haven’t encountered a single issue. Furthermore, I don’t think I’ve seen a single issue report related to Wasabi. I might be wrong though. :wink: