Restic and S3 Glacier Deep Archive

Hi all,

This is less a question than an attempt to share some thoughts.

A few days ago, AWS announced a new S3 tier called “Glacier Deep Archive”. As far as I can tell, it takes the existing Glacier product properties even further, most notably in terms of time-to-restore and pricing. At $0.001 per GB-month, storage is several times cheaper than what competitors such as B2 offer (restore prices are a different story, though). The downside is that, with restore times of up to several hours, data isn’t available interactively.

Even if only the /data subdir is moved to Glacier, restic wouldn’t be usable for much more than adding further backups. For everything else (such as prune), data would need to be moved back to a standard S3 storage class first, which only makes sense after Glacier’s minimum storage duration of 180 days and costs the same as 2.5 months’ worth of storage. I see two possible scenarios based on restic’s current feature set:

  1. Append-only backups. Given low storage prices you’d just keep old data, even more so if data gets rarely deleted.
  2. A restore every > 180 days, prune, then move back. Ideally, you’d only restore those packs which were written > 180 days ago and then prune snapshots older than 180 days. Restic shouldn’t need access to any files younger than that, though I haven’t tested this.

Possible scenarios based on extensions of restic:

  3. Restic could implement some sort of “lightweight” prune where it never validates or re-packs any existing data and only performs delete operations on packs where 100% of the contained blobs aren’t needed any more. Although this would result in some overhead (partially required packs), it would still free up some space compared to 1).
  4. An optimization based on the combination of 3) and 2): First, restic would “lightweight-prune” superfluous packs, then only restore the packs which are left and still needed, perform a “proper” prune (incl. re-packing) and then move data back to the Glacier storage tier.
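
To make the “lightweight” prune idea a bit more concrete, here is a minimal Python sketch. It is not restic code: `lightweight_prune`, the pack names and the blob IDs are all made up for illustration, and it assumes the pack-to-blob mapping can be read from the repository index without touching the packs themselves.

```python
def lightweight_prune(packs, needed_blobs):
    """Split packs into those that can be deleted outright and those to keep.

    packs:        dict mapping pack ID -> set of blob IDs stored in that pack
    needed_blobs: set of blob IDs still referenced by any remaining snapshot
    """
    delete, keep = [], []
    for pack_id, blobs in sorted(packs.items()):
        if blobs.isdisjoint(needed_blobs):
            # 100% of the blobs are unreferenced: delete without reading the pack
            delete.append(pack_id)
        else:
            # at least one blob is still needed: leave the pack untouched,
            # even if part of it is dead weight
            keep.append(pack_id)
    return delete, keep


# Hypothetical repository contents
packs = {
    "pack-a": {"b1", "b2"},
    "pack-b": {"b3", "b4"},
    "pack-c": {"b5"},
}
needed = {"b2", "b5"}

delete, keep = lightweight_prune(packs, needed)
print(delete)  # ['pack-b']
print(keep)    # ['pack-a', 'pack-c']
```

Here pack-a stays in cold storage although b1 is no longer needed, which is the “partially required packs” overhead mentioned in 3).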

I think 1) or 2) might be valid approaches to consider for secondary backups where you expect to never need them. 3) and 4) probably wouldn’t warrant the effort – most likely there are better alternatives for such use cases. I’d be interested to know your thoughts on how this new offer could be leveraged in combination with restic!

We had the same conversation when the Glacier storage tier (compare to Glacier Deep Archive) was announced and it honestly isn’t a good fit for restic for several reasons:

  • The 180-day commitment makes pruning very difficult to do without causing early deletion fees. Simply pruning every 180 days doesn’t work because packs created one day ago could still need to be rewritten.
  • Unless parent snapshots are simply not used, restic needs to be able to fetch tree objects on-demand, which necessitates data packs being readily-available.
  • All of this added complexity restic brings does not provide a benefit in a coldline storage mechanism; simple differential or incremental backups are much more effective and easier to manage, and don’t come with a long list of caveats. The advantages of restic basically disappear when using coldline storage.
  • Restic, by design, is built to work on multiple storage backends as long as they all provide a certain minimum set of features. Supporting one storage vendor’s specific features adds code complexity for the benefit of only people who use that service, and in the Glacier/GDA case the amount of complexity would be incredibly high to prevent things like early deletions, for fairly minimal gain.
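
To put a rough number on the early-deletion point in the first bullet, here is a small Python sketch. It assumes the charge is simply the storage price for the remaining days of the 180-day minimum, a 30-day month, and the $0.001 per GB-month figure quoted earlier in the thread; `early_deletion_charge` is a made-up helper, not an AWS API.

```python
def early_deletion_charge(size_gb, age_days, price_per_gb_month=0.001,
                          min_days=180, days_per_month=30):
    """Pro-rated charge for deleting data before the minimum storage duration."""
    remaining_days = max(0, min_days - age_days)
    return size_gb * price_per_gb_month * remaining_days / days_per_month


# 100 GB of packs rewritten by a prune at different ages
for age in (1, 90, 180):
    print(f"{age:>3} days old: ${early_deletion_charge(100, age):.2f}")
```

Under these assumptions, a pack that is one day old when prune rewrites it is billed for almost the full 180 days anyway, so pruning on a fixed 180-day schedule does not avoid the fee.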

Edit: To give perhaps the best illustration of how unwieldy this would be, consider the workflow when you want to restore a single file.

  1. You ask restic to restore the file /a/b/c/d from a snapshot.
  2. Restic doesn’t have the requested snapshot’s root tree object so it goes to fetch it. This fetch fails because the containing pack hasn’t been restored to S3. Restic complains about the missing pack.
  3. You go to the S3 console and restore that pack, wait 12 hours for the data to become available, then re-run the restore.
  4. Restic is now able to look at the snapshot root directory and looks up the tree object ID for /a. This is in a different pack. You jump back to step 2, and repeat this process four more times until restic has the blob for the file.

It would take ~60 hours in the average case, assuming that all of the file’s data blobs are in the same pack!
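
The arithmetic behind that figure, as a small Python sketch. It assumes one restore round for the snapshot’s root tree, one per intermediate directory, and one for the file’s data pack, at the 12 hours per round used in the steps above; `sequential_restore_hours` is just an illustrative name.

```python
from pathlib import PurePosixPath

HOURS_PER_ROUND = 12  # wait per S3 restore, as in the steps above


def sequential_restore_hours(path, hours_per_round=HOURS_PER_ROUND):
    """Estimate the wait when every tree level needs its own restore round."""
    parts = PurePosixPath(path).parts   # e.g. ('/', 'a', 'b', 'c', 'd')
    directories = len(parts) - 2        # intermediate directories: a, b, c
    rounds = 1 + directories + 1        # root tree + directory trees + data pack
    return rounds * hours_per_round


print(sequential_restore_hours("/a/b/c/d"))  # 60
```

Every additional directory level adds another 12 hours, and so would any data blob that sits in a different pack.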

If you have to restore a whole snapshot, you might as well issue a restore on all of the data objects in the repository, because you’re probably going to need ~50% of them anyway and it would take too long to manually trudge through this process.

Contrast this to a differential backup where you restore and download a maximum of two files and you’re done.

Thanks for your detailed response, @cdhowie!

I fully agree with you in terms of restoring. In that case, a full restore is probably the easiest (and only) option.

Based on your comments it looks like we have to rule out pruning entirely, so this basically leaves us with one-off backups or append-only backups in “blind flight” at best as the only options with Glacier/GDA. Although the same could be achieved using traditional tools such as tar + gpg I guess there’s still a use case for restic if you already use it with other backends and you don’t want to set up an additional workflow, keys, etc.

Right, so one has to weigh the cost of implementing a different backup system against the cost of having to initiate an S3 restore out of GDA for an entire repository any time anything needs to be restored from backup.

I’m not sure Glacier Deep Archive (GDA) should be used with Restic at all. I use both, for different purposes - Restic for versioned backups, GDA for archives. I outline how I use the Infrequent Access class of storage with Restic in this issue. This is still the way I plan to use Restic with S3.

I used to have my archives on Glacier, as opposed to in S3 in the Glacier tier. I’ve now moved those archives to S3 and the GDA tier. I use the “aws s3 sync” command line for this, keeping a copy on my PC. GDA is great for archives, and may work with some other forms of backup software, but without significant changes I wouldn’t think it would suit dynamic block-based archives like Restic.

AWS has announced a new storage tier called “Glacier Instant Retrieval”. It’s almost as cheap as the other Glacier tiers, but retrieval time is milliseconds - much better for backups like Restic. In the CLI you use the class GLACIER_IR.

Restic doesn’t seem to support storage tiers, but adding support for this could be good. For now using S3 lifecycle rules to transition objects in the data folder to this storage class could save some money over S3 standard or IA class.
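
As a rough sketch of such a lifecycle rule, the Python below only builds the configuration document and prints it as JSON. The rule ID, the repository prefix `backups/webserver` and the 30-day delay are assumptions picked for illustration, and `glacier_ir_lifecycle` is a made-up helper.

```python
import json


def glacier_ir_lifecycle(repo_prefix="", days=30):
    """Build an S3 lifecycle configuration that moves restic's data/ objects
    to the GLACIER_IR storage class after `days` days."""
    base = repo_prefix.strip("/")
    prefix = f"{base}/data/" if base else "data/"
    return {
        "Rules": [
            {
                "ID": "restic-data-to-glacier-ir",  # arbitrary label
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},        # only pack files, not index/snapshots
                "Transitions": [{"Days": days, "StorageClass": "GLACIER_IR"}],
            }
        ]
    }


config = glacier_ir_lifecycle(repo_prefix="backups/webserver")
print(json.dumps(config, indent=2))
```

The printed JSON should be usable with `aws s3api put-bucket-lifecycle-configuration`, or the dict can be passed to boto3’s `put_bucket_lifecycle_configuration`. I’d try it on a throwaway bucket first, and keep in mind that prune rewriting or deleting transitioned packs may run into that class’s minimum storage duration.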

If not, then rclone will :) See their docs.

In some cases I back up with Restic locally, then use the AWS CLI to upload using the storage tier I want, but on my web server I have Restic send data directly to S3 to reduce disk space. It’s in this second scenario that being able to specify storage tiers would be useful.

If the egress fees for AWS Glacier Instant Retrieval are the same as for other tiers, then this would be too expensive.

Testing a 1TB repository would cost around $100 if I am not wrong, compared to near-free egress with competitors.

Can someone clarify if egress fee could be avoided in AWS?

AWS does charge a lot for data egress, $0.09 per GB, which is $90 for 1TB. You also pay fees for the API requests, which typically don’t add up to much. My restore tests tend to be targeted to a small number of files, I don’t expect to ever have to retrieve a lot of data from S3 as I also have the data on local disks. For me it’s a last line backup. If you plan to test / retrieve large amounts of data then S3 is probably not the best option.

As a comparison, Backblaze B2 costs $0.01 per GB to retrieve, so $10 for 1TB. They charge $0.005 per GB for storage, compared with $0.004 per GB for S3 Glacier Instant Retrieval. B2 is probably a better option for most.
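
To show how those numbers combine, here is a quick Python sketch using only the prices quoted above. It treats the storage prices as per GB-month and ignores request fees and any separate per-GB retrieval charges, so take it as a rough comparison rather than a bill; `yearly_cost` and the 1TB scenario are made up for illustration.

```python
# Prices in USD per GB, as quoted in this thread
PRICES = {
    "S3 Glacier Instant Retrieval": {"storage": 0.004, "egress": 0.09},
    "Backblaze B2": {"storage": 0.005, "egress": 0.01},
}


def yearly_cost(provider, stored_gb, restored_gb_per_year):
    """Twelve months of storage plus the egress for whatever is downloaded."""
    p = PRICES[provider]
    return 12 * stored_gb * p["storage"] + restored_gb_per_year * p["egress"]


# 1TB repository, once without any restore and once with a full restore test
for restored_gb in (0, 1000):
    for provider in PRICES:
        cost = yearly_cost(provider, 1000, restored_gb)
        print(f"{provider}, {restored_gb} GB restored: ${cost:.2f} per year")
```

With these figures S3 comes out $12 per year cheaper if nothing is ever downloaded, but a single full restore test more than wipes that out.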

I use AWS because I’m very experienced with it, I understand it well, and I don’t store all that much data in S3 that I don’t have somewhere else like a backup disk.
