Split/hybrid/isolated setup with two disks: backup even though the contents of the data folder have been moved?

tldr: If I move the contents of the subfolder “data” of the restic repo elsewhere, will running “restic backup” still work? I don’t want to run other commands before I move the original contents back. I know that this is not officially supported. But given the current architecture of restic, is there a good chance that this approach works?

A similar topic, Fragmented storage?, was discussed recently. But my case is different because I don’t need the full power of restic - I just want to run “restic backup”.

I have about 5 TB of data to back up. Nowadays the data only grows slowly, maybe 150 GB a year. Storing all of it in the cloud is too expensive for me and the initial upload would take forever. So I rely on 3.5’’ external drives. I have one in my house and store the other off-site (at work).

About my off-site storage: I want to store my data off-site pretty frequently, but at the same time I want to only rarely transport the off-site disk: I mostly commute by train, and 3.5’’ drives are pretty fragile and big. This off-site, secondary HDD is not needed for regular restores - it’s only needed in extreme circumstances like {fire|theft|cryptolocker|power surge|…} that kill my computer and the primary local backup on my first external HDD. The local backup to the first external HDD runs frequently, but it’s enough to run the secondary off-site backup e.g. once a week or month, depending on the amount of new photos I have. In the unlikely case that it’s the only data I have left, I can live with such a data loss.

The primary local backup will be made with different software (so that my backup software is not a single point of failure).

I am thinking about using restic for my off-site backup.

I have about 200 GB of cloud storage with rsync and sftp support. I also have an unused external 250 GB SSD that’s so small that I can even carry it in my pants pocket.

My idea is to use this SSD (or cloud storage) as a temporary helper for the off-site restic-backup:

  • I make an initial backup on the secondary, external 3.5’’ HDD. Then I mirror this repo to the small SSD while excluding the contents of the data folder. Then I store the HDD off-site.
  • Then I regularly run restic backup to the repo on the SSD. When the SSD is nearly full, or every couple of months, I’d fetch the HDD and sync the SSD back to the HDD with rsync -av SSD/ HDD/ (see the sketch after this list). This rsync command wouldn’t delete any file from the HDD (destination) because I don’t include --delete.
  • Then I’d delete everything in the data dir of the SSD repo so that I could start the cycle again.
  • If I ever need to restore something from the off-site HDD: I’d first sync the SSD to the HDD, then I’d restore from the HDD.
  • I would only run “restic backup” against the SSD; all other commands such as “check”, “forget”, or “prune” would only be run against the HDD. After each of these commands I’d sync everything (except for data) back to the SSD.
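
To make this concrete, here is a rough sketch of the commands I have in mind; the mount points (/mnt/hdd, /mnt/ssd) and the source path are just placeholders, and as said, this is not an officially supported way to use a repository:

```
# Initial backup onto the off-site 3.5'' HDD:
restic -r /mnt/hdd/repo init
restic -r /mnt/hdd/repo backup /home/me/data

# Mirror the repo to the SSD, leaving out the pack files in data/:
rsync -av --exclude='data/*' /mnt/hdd/repo/ /mnt/ssd/repo/

# While the HDD is off-site, back up only to the SSD copy:
restic -r /mnt/ssd/repo backup /home/me/data

# When I fetch the HDD: push everything new back (no --delete, so the HDD
# never loses files), then run check/forget/prune against the HDD only:
rsync -av /mnt/ssd/repo/ /mnt/hdd/repo/
restic -r /mnt/hdd/repo check

# Empty data/ on the SSD and re-mirror the metadata to start the next cycle
# (--delete so that files removed by forget/prune disappear from the SSD too,
#  while the exclude keeps rsync away from the emptied data/ dir):
rm -rf /mnt/ssd/repo/data/*
rsync -av --delete --exclude='data/*' /mnt/hdd/repo/ /mnt/ssd/repo/
```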

I guess that this approach is not and will never be officially supported. But given the current architecture of restic, is there a good chance that this approach works (and will keep working for the foreseeable future)? Would it also work with a temporary sftp cloud repo?

I just did a few small test backups and so far it seems to work.

Thanks for your help.

Considering that one takes backups in the knowledge that they can be relied upon, sometimes many years later, I would say that any fragility in using a repository in unsupported ways can only lead to disaster. When you multiply this fragility with a manual process then arguably you don’t have a backup - you have a house of cards which is less reliable than the source data.

It would be my opinion that rather than invest time in making this process appear to work, invest in large enough storage devices to use restic as it’s meant to be used, or find a backup tool that natively supports your use case.

I have to agree with @ProactiveServices. Instead of a quite fragile manual solution that involves regularly carrying hard disks around, you should try to find a fully automatic solution. For the one-time initial backup, carrying or shipping a hard drive is fine, but for the regular backups you should find a friend or someone else who is able to put your hard drive “online” in some way - or use cheap cloud storage.

Besides that, from a technical point of view the backup command only ever adds data and only reads tree blobs from the data/ dir of your storage backend (you could use --force to prevent even this, but then backup is slow), and those tree blobs are also contained in the cache dir that restic creates.
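
For illustration (the repository path and backup source are just placeholders):

```
# With --force restic re-reads all source files instead of diffing against the
# parent snapshot, so it does not need tree blobs from the repository - at the
# cost of a much slower backup run.
restic -r /mnt/ssd/repo backup --force /home/me/data
```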

Technically your issue is quite similar to the “cold storage” issues (which might also solve your cloud problem, as cold storage is much cheaper than usual cloud storage). Here is the answer: restic does not (yet) fully support this.

Thanks for your answers. My computer skills are limited so I’m here to learn.

My motivation was this: I know my idea is far from perfect. I was looking for the least bad alternative.

A cloud backup would be nice, but about 5 TB of cloud storage seems to cost about $50/month, which is $2,400 over four years. If I use cold storage such as AWS Glacier or Backblaze B2, I’d be at roughly $25 per month / $1,200 over four years just for storage. I don’t want to pay that much given my other financial obligations.

I could look for cloud backup services for private users that offer “unlimited” storage for little money, e.g. Backblaze. But usually these are for Mac and Windows only (whereas I use Linux), and often there’s unexpected fine print, e.g. Backblaze deletes backups from external drives if they are not attached for over a month (see here), and their file history is limited to 30 days by default. So I gave up on this idea.

I could try to run a NAS elsewhere to which I back up. This would be my preferred solution. But at the moment I know no one who a) would have my NAS running in their house or office and b) also has a decent bandwidth on their internet connection. In the short run I neither know how to earn more money (given some constraints in my life) nor how to quickly find acquaintances that fulfill conditions a) and b).

I looked for other backup tools. There are tools that support multiple targets, e.g. backup tools from about 15 years ago, when you still had to split a backup across different DVDs. But usually these don’t offer detection of moved or renamed files, which is essential for me. Then there are tools that also support tape backups, but usually this is complicated enterprise software where a server backs up multiple clients. I fear that setting them up would take very long, and when there’s much to configure there’s a good chance I make some error.

Most other backup tools also support just one target, so there’s little reason to prefer them over restic, which I already know. Actually there is one hard-link based single-user backup solution named “storeBackup” that apparently supports a temporary backup target. Unfortunately it has been unchanged since 2014, I’ve never seen any reports about its stability, the official support forum has had just two threads in the last three years, and most recently its official site was actually cryptocurrency advertising. Relying on this doesn’t seem like a good solution, either.

Trying to create a custom solution will definitely lead to disaster in my case.

I could try a different approach:

  • Reorganizing my data into archive folders that I never change and a small active folder whose cloud storage even I can afford. In theory this would be a good solution, but in the past 10 years I haven’t managed to keep a stable archive - instead I sometimes reorganize, etc. So I’d like to avoid this.
  • I think ZFS and btrfs allow sending snapshots of my file system off-site, but as far as I can see I can’t just send the snapshot alone. It might be a quick and small transfer, but I’d still need a lot of storage at the other end. And I’d have to invest a lot of time to learn this, and btrfs has no built-in encryption (afaik), etc.

I don’t mind a manual solution. It’s not too time-consuming for me, at least compared to the time I’d have to invest to learn about cron jobs, bash scripting, etc. Each Friday I take some work home to the desk in my home office, where I also have my computer. Each Sunday night I pack my backpack again.

Every three or six months I could exchange one of my external drives at work. So if I’m unlucky, I’ll lose about three to six months of data. That’s probably less than what most people in my social group would lose - they often have no backup strategy at all.

Maybe I can improve on this: I think I’ll use btrfs on my external drives, and shortly before I take each drive to its off-site storage period at work I’ll make a btrfs snapshot. Then I’ll occasionally try my approach with an empty data folder on my SSD. I won’t write this back to the external drive and will only rely on it in case I lose both my main drive and the external disk I have at home at that moment. Then this can only improve my situation …

Thanks for reading another long post. I’m open to any new ideas. Please point out my mistakes.

Just a side note here, the streams produced by btrfs send can be transformed as long as the transformation is reversible later. For example, you could pipe the output through gpg to encrypt the stream. You would just need to pipe through gpg -d to decrypt the stream when sending it to btrfs receive.
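
A rough sketch of such a pipeline (the snapshot and output paths are placeholders, gpg in symmetric mode is used just for illustration, and the snapshot has to be read-only for btrfs send):

```
# Encrypt the send stream before it leaves the machine:
btrfs send /mnt/pool/.snapshots/2021-01-01 | gpg -c -o /mnt/offsite/2021-01-01.btrfs.gpg

# Later, decrypt the stream and replay it into a btrfs filesystem:
gpg -d /mnt/offsite/2021-01-01.btrfs.gpg | btrfs receive /mnt/restore/
```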

I don’t know where you are from, but you can get a 4-bay QNAP NAS for $400 CDN, fill it with four 4 TB Seagate IronWolf NAS drives (or other NAS-rated drives), and configure it for RAID 5, giving you around 12 TB of storage at a total cost of around $1,200 CDN. Do your initial backup at home, then get a friend to host it for the ongoing backups, which will require much less bandwidth. In exchange you can let them use a portion for their own backups. Should you ever have to do a massive restore, bring the NAS back home and return it afterwards.

@MichSchnei If you are low on budget: I think I even read in this forum about users that use a Raspberry Pi with an attached USB drive. If you find someone with a good-enough internet connection to “host” this Pi plus your existing hard drive, you get the possibility to automate things for the cost of a Pi!

And I think that most internet connections should be fine with around 500 MB of extra inbound traffic per day.
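
Restic could then talk to the Pi over its sftp backend, for example (user, hostname and paths below are made up):

```
# One-time repository setup on the drive attached to the Pi:
restic -r sftp:pi@friends-house.example.org:/mnt/usb/restic-repo init

# Regular, automatable backups over the internet:
restic -r sftp:pi@friends-house.example.org:/mnt/usb/restic-repo backup /home/me/photos
```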

An alternative would be AWS Glacier Deep Archive. This should be around $5 per month for 5 TB (storage costs). But this is not really supported by restic at the moment, so I would suggest just syncing your repo via rclone (this has also been discussed in the forum already).
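
Such a sync could look roughly like this, assuming an rclone remote for the bucket is already configured (the remote and bucket names are placeholders):

```
# Copy the local repository to the remote bucket; only new or changed files are
# transferred, which fits restic's mostly append-only repository layout.
rclone sync /mnt/hdd/repo s3-archive:my-restic-bucket/repo
```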

Anyone using GDA should be aware that there are substantial costs for retrieval. S3 pricing is not simple.

You’d probably set up a rule to transition S3 Standard objects to Glacier Deep Archive (GDA) after 1 day.

  • You upload to S3 Standard, which costs:
    • $0.023 per GB-month (prorated).
    • $0.005 per 1,000 files uploaded.
  • After one day, the uploaded files are transitioned to the GDA tier.
    • Now they cost $0.00099 per GB-month (prorated).
    • There is a $0.05 fee per 1,000 files transitioned.
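
For illustration, such a rule could be created with the AWS CLI roughly like this (the bucket name is a placeholder; double-check the current S3 documentation before relying on it):

```
# Transition every object in the bucket to Glacier Deep Archive one day after upload.
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-restic-bucket \
  --lifecycle-configuration '{
    "Rules": [{
      "ID": "to-deep-archive",
      "Status": "Enabled",
      "Filter": {"Prefix": ""},
      "Transitions": [{"Days": 1, "StorageClass": "DEEP_ARCHIVE"}]
    }]
  }'
```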

So uploading 5,000 8MB files is going to cost you:

  • Approx. $0.03 in S3 Standard storage fees.
  • $0.025 in upload request fees.
  • $0.25 in fees for the transition to GDA.
  • Approx. $0.04 per month stored in GDA.

About $0.35 just for the upload. Not bad… but wait, there’s more.

Transitioning a file to the GDA tier is a 180-day commitment. If you delete a file or transition it to a different storage tier before it has been stored for 180 days, you are immediately charged the full prorated “early deletion” fee for the remainder of the commitment.

Standard retrieval of an object takes 3-5 hours. For this retrieval tier, you are charged the following:

  • $0.10 per 1,000 files retrieved.
  • $0.02 per GB retrieved.

Bulk retrieval of an object takes 5-12 hours. Charges are:

  • $0.025 per 1,000 files retrieved.
  • $0.0025 per GB retrieved.

For both retrieval tiers, this will create a special temporary “retrieved” object in the S3 Standard tier, meaning you will additionally be billed the S3 Standard storage rate of $0.023 per GB-month while that retrieved object exists.

In all cases, you are charged an additional $0.09 per GB of data downloaded from AWS (when you actually access this data with restic).

Let’s say your repository is 1TB. Storage in GDA is costing you $1 a month. Cool, right?

Now let’s say you need to retrieve all of that data. Restic packs are about 8 MB in size, so 1 TB / 8 MB = 125,000 files, give or take. To be generous, let’s assume you can wait up to 12 hours, so you use the bulk retrieval tier and you make the restored objects available for 1 day. Your charges for this retrieval are:

  • $3.125 retrieval fees (by file count).
  • $2.50 retrieval fees (by size).
  • $0.77 S3 Standard storage for the restored objects (1TB for 1 day).
  • $90 in egress traffic fees.
  • $0.05 for the GET requests to download the objects.

So $96.45 for the whole restore operation. This all scales linearly, so if your repository is 5TB you can expect to pay about $500 if you should ever need to restore the whole thing.
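
If you want to check or adapt these numbers, here is a quick back-of-the-envelope calculation using the same assumptions (1 TB treated as 1,000 GB, ~125,000 pack files, the list prices quoted above; not an official cost calculator):

```
awk 'BEGIN {
  files = 125000; gb = 1000
  bulk_requests = files / 1000 * 0.025   # $3.125 bulk retrieval, by file count
  bulk_per_gb   = gb * 0.0025            # $2.50  bulk retrieval, by size
  std_storage   = gb * 0.023 / 30        # ~$0.77 S3 Standard for the restored copy, 1 day
  egress        = gb * 0.09              # $90.00 egress traffic
  gets          = files / 1000 * 0.0004  # $0.05  GET requests
  printf "total: $%.2f\n", bulk_requests + bulk_per_gb + std_storage + egress + gets
}'
# Prints roughly $96.44, in line with the ~$96.45 estimate above.
```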

Now that’s a “simple” example. Real-world cases are a lot more complex with more unpredictable storage costs. If you need the data sooner, you have to pay even more for the GDA standard retrieval tier.

Let’s run Backblaze B2 now.

  • Storage for the 1TB repository costs $5/mo, which is 5x as much as GDA.
  • Retrieval costs $10 in egress fees plus about $0.05 in request fees, so $10.05 for the operation.

In this scenario, the cost to restore from B2 is 10% of the cost to restore from S3 GDA.

So, ultimately, it depends on how often you expect you need to restore. You can restore 10 times with B2 for the same costs as restoring once from GDA.

And then you pile on all of the caveats of using GDA (which are already documented on this forum), one of which is that you can never prune without transitioning everything from S3 GDA back to S3 Standard, which alone is expensive (retrieval fees, plus S3 Standard fees, plus any GDA early deletion fees).

In my opinion, GDA with restic is just not worth messing with. GDA has its uses for other things; restic does not play well with it and there are all sorts of pricing gotchas.

@cdhowie Thanks for clarifying that when using one of the cold storage offerings out there, you really have to calculate all the costs!

I also agree that restic should not be used to access such cold storage directly. An rclone repo-copy solution, where you can see what you are doing, is also recommended and makes the cost calculation easier.

(Just a side remark: we had this discussion already, and I agree that restic is not yet able to handle cold storage smoothly, but I am quite sure that the repository format would allow it. So this is a question of implementing it as a future feature. Your statement about pruning, for instance, is no longer correct for the current beta releases - there are now the options --max-unused=unlimited and --repack-cacheable-only.)
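
For illustration, on a current beta that could look like this (the repository path is a placeholder; check restic prune --help for your version):

```
# Never repack pack files just to reclaim unused space, and only repack packs
# that contain cacheable (tree) data, so the large data packs don't have to be
# read back from (cold) storage.
restic -r /mnt/hdd/repo prune --max-unused unlimited --repack-cacheable-only
```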

Thanks a lot @cdhowie for that summary, that’s very useful as a starting point to anyone who wants to consider the AWS/S3 stuff for their repositories.

Thanks. I need to consolidate all of this information in one post. I also calculated the break-even point in terms of “restores per year” for S3 GDA vs B2:

The summary is that if you perform a full restore approximately every two years, the cost is the same. If you restore less frequently, S3 GDA is cheaper. If you restore more frequently, B2 is cheaper.

I still prefer B2, even though I’ve never had to restore. There are no early deletion fees; all of the storage is “hot”, so you don’t have to wait for a 5+ hour restore operation before you can access your data; you can prune regularly; and you can run check operations regularly if you want to. GDA is not so much cheaper that I am willing to trade all of that.