Restic and S3 Glacier Deep Archive

We had the same conversation when the original Glacier storage tier (as opposed to Glacier Deep Archive) was announced, and honestly, Glacier isn’t a good fit for restic, for several reasons:

  • The 180-day minimum storage duration makes pruning very difficult to do without incurring early-deletion fees. Simply pruning every 180 days doesn’t work, because a pack created one day ago could still need to be rewritten.
  • Unless parent snapshots are simply not used, restic needs to be able to fetch tree objects on demand, which requires the data packs containing them to be readily available.
  • All of the complexity restic adds provides no benefit on coldline storage; simple differential or incremental backups are far more effective, easier to manage, and don’t come with a long list of caveats. Restic’s advantages basically disappear when using coldline storage.
  • Restic is, by design, built to work across multiple storage backends, as long as each provides a certain minimum set of features. Supporting one storage vendor’s specific features adds code complexity that benefits only the people who use that service, and in the Glacier/GDA case the complexity required to prevent things like early deletions would be incredibly high, for fairly minimal gain.
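To make the first point concrete, here is a minimal sketch (the pack ages and the prune selection are hypothetical, not output from restic) of checking which packs a prune would rewrite before the 180-day minimum storage duration is up:

```python
MIN_STORAGE_DAYS = 180  # Glacier Deep Archive minimum storage duration

def packs_with_early_deletion_fee(pack_ages_days, packs_to_rewrite):
    """Hypothetical helper: given each pack's age in days and the set of
    pack indices a prune run wants to rewrite (i.e. delete and re-upload),
    return the ones that would be billed an early-deletion fee."""
    return [i for i in packs_to_rewrite if pack_ages_days[i] < MIN_STORAGE_DAYS]

# A pack written yesterday can already be selected for rewriting:
ages = [365, 200, 1]   # days since each pack was uploaded
to_rewrite = [1, 2]    # packs a hypothetical prune wants to rewrite
print(packs_with_early_deletion_fee(ages, to_rewrite))  # → [2]
```

Because prune cannot know in advance which packs it will need to rewrite, there is no schedule that reliably keeps this list empty.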

Edit: To give perhaps the best illustration of how unwieldy this would be, consider the workflow when you want to restore a single file.

  1. You ask restic to restore the file /a/b/c/d from a snapshot.
  2. Restic doesn’t have the requested snapshot’s root tree object, so it goes to fetch it. The fetch fails because the containing pack hasn’t been restored to S3, and restic complains about the missing pack.
  3. You go to the S3 console and restore that pack, wait 12 hours for the data to become available, then re-run the restore.
  4. Restic is now able to look at the snapshot root directory and looks up the tree object ID for /a. This is in a different pack. You jump back to step 2, and repeat this process four more times until restic has the blob for the file.
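The loop in steps 2–4 amounts to one restore-and-wait round per pack. A minimal simulation (the one-tree-per-pack layout is the worst-case assumption from this example, not restic’s actual API):

```python
RESTORE_WAIT_HOURS = 12  # the ~12-hour wait from step 3

def simulate_single_file_restore(path):
    """Sketch of the manual loop in steps 2-4: every tree object on the
    path, plus the snapshot's root tree and the file's data blobs, sits in
    a different pack, and each pack needs its own restore-and-wait round
    before restic can read it."""
    components = path.strip("/").split("/")
    # one pack per round: root tree, a tree per directory, then the data blobs
    packs = ["root tree"] + [f"tree for {c}" for c in components[:-1]] + ["data blobs"]
    rounds = len(packs)  # each round: restore the pack in the S3 console, wait, re-run
    return rounds, rounds * RESTORE_WAIT_HOURS

print(simulate_single_file_restore("/a/b/c/d"))  # → (5, 60)
```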

It would take ~60 hours in the average case (five restore rounds at ~12 hours each), and that assumes all of the file’s data blobs are in the same pack!

If you have to restore a whole snapshot, you might as well issue a restore on all of the data objects in the repository: you’re probably going to need ~50% of them anyway, and manually trudging through this process would take far too long.

Contrast this with a differential backup, where you restore and download at most two files and you’re done.
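A rough sketch of that comparison (the scheme names and counts are illustrative, not any real tool’s API): a differential chain needs at most two archives no matter how long it grows, while an incremental chain needs every link:

```python
def archives_needed(scheme, backups_since_full):
    """How many archives a restore must fetch, under the usual definitions:
    a differential captures everything since the last full backup, while an
    incremental captures only changes since the previous backup."""
    if scheme == "differential":
        return 1 + min(backups_since_full, 1)  # the full + the latest differential
    if scheme == "incremental":
        return 1 + backups_since_full          # the full + every incremental
    raise ValueError(f"unknown scheme: {scheme}")

print(archives_needed("differential", 30))  # → 2
print(archives_needed("incremental", 30))   # → 31
```

Either way, a restore is a fixed, predictable number of Deep Archive retrievals, which is exactly what restic’s pack-based layout can’t offer.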