Unavailable index files


#1

I’ve managed to get myself into a situation where a large number of my index files are unavailable, however the data is all correctly present.

What is the best way to recover this scenario, ideally without re-uploading all the data files?


#2

A simple restic rebuild-index should resolve this problem.


#3

Awesome - I’ll give that a try. Thanks for your swift reply!!!


#4

Okay, so in this particular case the data has gone to Glacier - so it’s there, but not available. So my rebuild-index is failing because it’s trying to read the data in.


#5

AWS provides a procedure:

https://docs.aws.amazon.com/AmazonS3/latest/user-guide/restore-archived-objects.html

The general nuisance and possible costs probably outweigh the savings of archiving to Glacier.


#6

Yes. So my usage of restic is really for Disaster Recovery. I plan to, hopefully, never need to restore from Glacier because I know it will cost me.

In theory I can move data files to Glacier using a storage policy as I shouldn’t need to read them in, unless something bad has happened.

So I now cannot rebuild the index, and I’m missing the index files on S3. Actually, the index files exist in the cache, but they have a slightly different folder structure.

So ideally I’d like to recreate these index files from the cache.

I was also wondering what would happen if I delete the snapshots that those indexes refer to and then recreate those backups - will restic detect that the data files are already stored (albeit not readable) and therefore not re-upload them?


#7

This post of mine addresses some of the concerns of using Glacier:

Basically, Glacier is not cost-effective for restic except in the case that you never delete any data, and your repository never suffers any kind of corruption. (If corruption happens, restic needs to scan every pack’s header to recover, which means every file under the data prefix will be partially read.)

For restic to work with Glacier, you must apply the lifecycle policy only to objects under the data/ prefix. If you transition anything else (such as indexes, as you’ve found) then restic will break. The other directories must not be transitioned because:

  • config: We need this to know the repository configuration.
  • index/: We need this to know what objects have already been uploaded, so we can avoid uploading duplicate data.
  • keys/: We need this to decrypt all other files, and to encrypt new data.
  • locks/: We need this to be able to lock the repository and read other locks; locks are short-lived and small, anyway.
  • snapshots/: We could possibly get by without this, but some commands might break. In particular, restic snapshots obviously will not work, and it would probably be a good thing to be able to inventory your repository.
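As a hedged illustration of scoping the lifecycle rule to data/ only (the bucket name, repository prefix, rule ID, and transition age below are all placeholders, not details from this thread):

```shell
# Hypothetical example: transition ONLY objects under the repository's
# data/ prefix to Glacier, leaving config, index/, keys/, locks/ and
# snapshots/ in standard storage. Adjust bucket/prefix to your layout.
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-restic-bucket \
  --lifecycle-configuration '{
    "Rules": [
      {
        "ID": "glacier-data-only",
        "Status": "Enabled",
        "Filter": { "Prefix": "repo/data/" },
        "Transitions": [
          { "Days": 30, "StorageClass": "GLACIER" }
        ]
      }
    ]
  }'
```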

If you do this, I don’t believe you can use parent snapshots (which restic uses automatically – I’m not sure if you can disable this – ping @fd0) because restic would need to read tree objects to see what files have changed. Disabling parent snapshots would avoid this, but backups will take longer.

So, here’s the summary:

  • You can only transition objects under data/.
  • Your repository must never become corrupt or you have to initiate an expensive retrieval on every transitioned object.
  • If you transition data/ then pruning is prohibitively expensive because every data/ object must be retrieved, plus you will have to pay pro-rated early deletion fees on any repacked/deleted pack that was uploaded less than 90 days ago.
  • Backblaze B2 costs only $0.001 more per GB-month of storage, at least $0.08 less per GB retrieved, and has none of these limitations… just use that instead.

#8

This matches my experience 100%. Backblaze isn’t available in EU last time I looked. Obviously I’d like to go back in time and not lose those index files - that was my mistake.

Do I have a way forward?


#9

How far did restic rebuild-index get? Did it fail before or after deleting the old index files?


#10

counting files in repo
Load(<data/000192eacd>, 591, 4509991) returned error, retrying after 654.24352ms: The operation is not valid for the object’s storage class
Load(<data/00031c25b3>, 591, 4414070) returned error, retrying after 359.640614ms: The operation is not valid for the object’s storage class
Load(<data/0004295549>, 591, 4893681) returned error, retrying after 365.994097ms: The operation is not valid for the object’s storage class
Load(<data/0001a288a2>, 591, 4557831) returned error, retrying after 399.862166ms: The operation is not valid for the object’s storage class
Load(<data/0005491ff9>, 591, 5262880) returned error, retrying after 436.920009ms: The operation is not valid for the object’s storage class
Load(<data/0006d6117b>, 591, 4651550) returned error, retrying after 395.41873ms: The operation is not valid for the object’s storage class


#11

Duh, of course it wouldn’t have deleted the old indexes since it couldn’t fetch the data packs to rebuild them. Silly me.

  1. Modify your lifecycle transition policy so that it only applies to the data/ prefix.
  2. Transition all objects under the other prefixes back to S3. This isn’t as simple as it sounds – you have to initiate an object restore, then once the object is available, perform an S3 “copy” operation to replace each object with itself. This will effectively return the object to S3 standard storage permanently. (This step might be easier performed by requesting the restore, then using rclone to copy these directories to a local machine, delete them from S3, then rclone them back the other way.)
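The restore-then-copy step could look roughly like this with the AWS CLI (the bucket and key below are placeholders; a real repository would loop over every affected object, e.g. from `aws s3api list-objects-v2` output):

```shell
# Hypothetical sketch: bring one Glacier-transitioned object back to
# standard S3 storage permanently. Bucket and key are made-up examples.
BUCKET=my-restic-bucket
KEY=repo/index/example-object

# 1. Request a temporary restore from Glacier.
aws s3api restore-object \
  --bucket "$BUCKET" --key "$KEY" \
  --restore-request '{"Days": 2, "GlacierJobParameters": {"Tier": "Standard"}}'

# 2. Wait until the restore completes; the "Restore" field in the
#    head-object output shows ongoing-request="false" when it's done.
aws s3api head-object --bucket "$BUCKET" --key "$KEY"

# 3. Copy the object onto itself with a changed storage class, which
#    returns it to standard storage for good.
aws s3 cp "s3://$BUCKET/$KEY" "s3://$BUCKET/$KEY" --storage-class STANDARD
```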

Side note: if you’re never deleting anything, you might consider a more traditional approach, such as tar incremental backups encrypted with GPG. If you ever want to restore anything from your repository, you won’t know exactly what packs must be restored before you can recover the file you want, for example. You’ll have to restore everything anyway. Basically, nearly all of the advantages of restic are lost when using Glacier.
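That tar-plus-GPG approach could be sketched like this (all paths and the passphrase are made-up examples; a real setup would use proper key management rather than an inline passphrase):

```shell
# Hypothetical sketch: GNU tar incremental backup, symmetrically
# encrypted with GPG. Paths and passphrase are illustration only.
set -eu

mkdir -p /tmp/demo-src /tmp/demo-out
echo "hello" > /tmp/demo-src/file1.txt

# Level-0 (full) backup. tar records file state in the snapshot file
# (state.snar); a later run reusing the same snapshot file archives
# only files that changed since this run.
tar --create \
    --listed-incremental=/tmp/demo-out/state.snar \
    --file=- /tmp/demo-src 2>/dev/null \
  | gpg --symmetric --batch --yes --pinentry-mode loopback \
        --passphrase "example-passphrase" \
        --output /tmp/demo-out/backup-level0.tar.gpg
```

The trade-off matches the side note above: each encrypted archive must be fetched and decrypted in full to restore anything, so it fits Glacier’s “write once, hopefully read never” model better than restic’s pack files do.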


#12

Step 1 - done.
Step 2 - also done, aside from these 20 or so index files, which are lost forever (but are in the cache)


#13

None of them should be lost forever unless you deleted them from S3. Didn’t you say they were unavailable because they had transitioned to Glacier?


#14

Funny that. The data was transitioned back to S3. The index files got deleted. Human error, that one :frowning:


#15

This is what restic will fall back to (not using a parent snapshot if none could be found or its data could not be loaded). Much of the metadata from the repository will also be present in the local cache, so as long as you preserve it, restic may work just fine.

In summary, restic is not built for and not meant to be used with Glacier, although it may still work (at least somewhat). :slight_smile:


#16

In that case you should probably restore all of the data to S3 for one day and run rebuild-index after the transition completes. Hopefully the restoration fees are not too bad…


#17

Thanks for not saying “I told you so”…

I think it’s likely the restore fees would be higher than re-uploading - which isn’t the end of the world. It only took a week or so.

Question: If I delete the snapshots that contain those indexes, can I just re-backup those snapshots, or would this cause further corruption?


#18

Well… I didn’t tell you not to delete them. :slight_smile:

Snapshots don’t contain or reference indexes. If you mean deleting all snapshots that refer to data that was indexed in a deleted index file, that’s going to be tricky because of deduplication. It’s likely that the deleted index files together indexed at least one object in every snapshot, meaning you’d probably just have to toss the whole repo and start over.

Possibly. Don’t forget the early deletion fees if you start over, though…

Side note: If you start over, consider transitioning data/ objects to something like S3-IA instead of Glacier. Restic will work perfectly in that case but shouldn’t access any of the data objects unless you run a command like prune, check, rebuild-index, etc. (IA still has early deletion fees, but if you’re never intending to run prune then that’s not a problem.)


#19

Any option to upload them from the cache?

Different question: Are indexes deterministically created - i.e. if I run the same backup again will the same indexes get created?


#20

That’s possible – the cache contains byte-identical copies of the files in the repo. It’s the simplest thing I could think of :slight_smile:, so you can just sync the files from the cache to S3.

This assumes that the files in the cache are valid, though.

Ah, and please be aware that restic automatically removes files from the cache that have been removed from the repository…
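If the cached copies are intact, a sync like the following might work (the cache location, repository ID, bucket, and prefix are all placeholder assumptions – check your actual cache directory name first):

```shell
# Hypothetical sketch: upload cached index files back to the repository.
# The local cache keeps index files under <cache-dir>/<repo-id>/index/,
# which maps onto the repository's index/ prefix.
REPO_ID=your-repo-id   # directory name under ~/.cache/restic/

aws s3 cp "$HOME/.cache/restic/$REPO_ID/index/" \
          "s3://my-restic-bucket/repo/index/" \
          --recursive
```

Running `restic check` afterwards would confirm whether the uploaded index files match the repository.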