Unavailable index files

This post of mine addresses some of the concerns of using Glacier:

Basically, Glacier is not cost-effective for restic unless you never delete any data and your repository never suffers any kind of corruption. (If corruption happens, restic needs to scan every pack’s header to recover, which means every file under the data/ prefix will be partially read.)

For restic to work with Glacier, you must apply the lifecycle policy only to objects under the data/ prefix. If you transition anything else (such as indexes, as you’ve found), restic will break. The other files and directories must not be transitioned because:

  • config: We need this to know the repository configuration.
  • index/: We need this to know what objects have already been uploaded, so we can avoid uploading duplicate data.
  • keys/: We need this to decrypt all other files, and to encrypt new data.
  • locks/: We need this to be able to lock the repository and read other locks; locks are short-lived and small, anyway.
  • snapshots/: We could possibly get by without this, but some commands might break. In particular, restic snapshots obviously will not work, and it would probably be a good thing to be able to inventory your repository.
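
To make the "only data/" rule concrete, here is a minimal sketch of a lifecycle configuration that transitions nothing but the data/ prefix. The rule ID and the 30-day delay are placeholders I made up, and the prefix assumes the repository sits at the root of the bucket (if it lives under a subpath, the prefix has to include that subpath). The dictionary follows the shape S3 expects for a bucket lifecycle configuration; the sketch only builds and prints it, it does not talk to AWS.

```python
import json


def data_only_lifecycle(days_before_transition: int = 30) -> dict:
    """Build an S3 lifecycle configuration that transitions only data/ objects.

    config, index/, keys/, locks/ and snapshots/ are not matched by the
    filter, so they stay in their original storage class.
    """
    return {
        "Rules": [
            {
                "ID": "restic-data-to-glacier",  # hypothetical rule name
                "Status": "Enabled",
                # Assumes the repository is at the bucket root.
                "Filter": {"Prefix": "data/"},
                "Transitions": [
                    {
                        "Days": days_before_transition,  # placeholder delay
                        "StorageClass": "GLACIER",
                    }
                ],
            }
        ]
    }


if __name__ == "__main__":
    # Print the configuration as JSON for review before applying it.
    print(json.dumps(data_only_lifecycle(), indent=2))
```

A dictionary like this could then be handed to whatever tool you use to set the bucket's lifecycle configuration. Check the rule against your own bucket layout before applying it.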

If you transition data/, I don’t believe you can use parent snapshots (which restic does automatically; I’m not sure if you can disable this, ping @fd0), because restic would need to read tree objects to see which files have changed. Without a parent snapshot, restic does not need to read those tree objects, but backups will take longer.

So, here’s the summary:

  • You can only transition objects under data/.
  • Your repository must never become corrupt or you have to initiate an expensive retrieval on every transitioned object.
  • If you transition data/ then pruning is prohibitively expensive because every data/ object must be retrieved, plus you will have to pay pro-rated early deletion fees on any repacked/deleted pack that was uploaded less than 90 days ago.
  • Backblaze B2 costs only $0.001 more per GB-month of storage, at least $0.08 less per GB retrieved, and has none of these limitations… just use that instead.
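
For a rough feel of the early deletion fee mentioned above, here is a back-of-the-envelope sketch. It assumes the fee equals the storage charge for the days remaining in the 90-day minimum, and it approximates a month as 30 days. The function name and the rate used in the example are hypothetical; plug in the current price for your region.

```python
def early_deletion_fee(size_gb: float, age_days: int,
                       price_per_gb_month: float, min_days: int = 90) -> float:
    """Estimate the pro-rated fee for deleting a pack before min_days.

    Assumption: the fee is the storage cost for the remaining days of the
    minimum storage period, with a month approximated as 30 days.
    """
    remaining_days = max(0, min_days - age_days)
    return size_gb * price_per_gb_month * remaining_days / 30


if __name__ == "__main__":
    # 100 GB of packs repacked 30 days after upload, at a made-up rate
    # of $0.004 per GB-month: 60 days (two months) are still owed.
    print(early_deletion_fee(100, 30, 0.004))
```

This covers only the deletion fee. The retrieval cost of reading every data/ object during a prune comes on top of it.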