AWS S3 Intelligent-Tiering

mvance · May 18, 2019, 1:00am

Anyone been brave enough to try the relatively new Intelligent-Tiering feature of S3 with your Restic repo?

tomwaldnz · May 18, 2019, 6:38am

Haven’t tried it, but Intelligent Tiering should work fine. I still just use policy to move my data to IA storage class, as it does exactly what I want.

fichtennadel · May 18, 2019, 11:47am

Yes, me. Works fine.

mvance · May 18, 2019, 12:13pm

Thanks for the feedback. I typically use policy for my other S3 buckets as well. However, tiering based solely on age seems less suitable for this use case because of how the read/write/delete pattern works within a Restic repo.

I thought Intelligent-Tiering might help ensure that the higher request costs (GET, PUT, COPY, POST, etc.) of the IA tier do not outweigh the savings from the lower per-GB storage cost.

cdhowie · May 18, 2019, 12:28pm

Beware that IA has a 30-day minimum storage duration for objects.

S3 Intelligent-Tiering, S3 Standard-IA, and S3 One Zone-IA storage are charged for a minimum storage duration of 30 days, and objects deleted before 30 days incur a pro-rated charge equal to the storage charge for the remaining days.

IA is only a good solution if you rarely/never prune.
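
To make the quoted rule concrete, here is a rough sketch of how the pro-rated early-deletion charge could be estimated. The price and the 30-day month are assumptions for illustration only, not quoted AWS rates, and `early_delete_charge` is a hypothetical helper name.

```python
# Sketch of the early-deletion charge described in the quoted AWS text.
# The price used in the example is an assumed value, not a quoted rate.

MIN_DAYS = 30  # minimum storage duration from the quote above


def early_delete_charge(size_gb, days_stored, price_per_gb_month, min_days=MIN_DAYS):
    """Pro-rated charge for the days remaining before the minimum is met.

    Assumes a 30-day billing month for simplicity.
    """
    remaining_days = max(0, min_days - days_stored)
    return size_gb * price_per_gb_month * remaining_days / 30


# Example: 100 GB of packs pruned after 10 days at an assumed $0.0125/GB-month.
charge = early_delete_charge(100, 10, 0.0125)
print(f"early-deletion charge: ${charge:.4f}")
```

Anything stored for 30 days or longer before being pruned yields a charge of zero in this model, which is why prune frequency matters so much here.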

mvance · May 18, 2019, 8:12pm

Agreed. That’s part of why I’m curious if intelligent-tiering might make sense.

For a small monthly monitoring and automation fee per object, S3 Intelligent-Tiering monitors access patterns and moves objects that have not been accessed for 30 consecutive days to the infrequent access tier. There are no retrieval fees in S3 Intelligent-Tiering. If an object in the infrequent access tier is accessed later, it is automatically moved back to the frequent access tier. No additional tiering fees apply when objects are moved between access tiers within the S3 Intelligent-Tiering storage class.

cdhowie · May 18, 2019, 8:20pm

Sorry, I misspoke. Intelligent-Tiering also has the 30-day minimum (see the quoted part of my message).

I don’t believe it makes sense to use this storage tier given how restic deletes files during prune, and does not care how long the file has been stored.

mvance · May 18, 2019, 8:32pm

You could be right about it not making sense. I understand the 30-day minimum would still apply. If I understand this right, it’s more than how long the files have been stored at play here.

I think it would ultimately depend on how frequently you prune, what files change during that process, etc. If the auto-tiering is truly intelligent, frequently changing files shouldn’t be moved to IA automatically, since they are not good candidates for tiering down.

The question is whether or not enough files would remain untouched long enough to make paying the Intelligent-Tiering fee worthwhile (i.e., would the savings for the files that qualify outweigh the per-object monthly monitoring and automation fee).

I imagine the only way to really know is to run two S3 repos in parallel for a couple of months, keeping everything else equal (same files, same backup frequency, same prune schedule, etc.) and compare charges.
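
Short of running that experiment, the question above can be roughed out on paper. The sketch below is only a back-of-the-envelope model: all three prices are assumed example values (check the current AWS price list for your region), `idle_fraction` is a hypothetical input, and it ignores request costs, the 30-day minimum, and the effect of prune touching packs.

```python
# Assumed example prices, not authoritative rates.
STANDARD_PER_GB = 0.023           # frequent-access tier, $/GB-month (assumption)
INFREQUENT_PER_GB = 0.0125        # infrequent-access tier, $/GB-month (assumption)
MONITORING_PER_1000_OBJ = 0.0025  # monitoring fee, $ per 1000 objects per month (assumption)


def monthly_net_saving(total_objects, avg_object_gb, idle_fraction):
    """Saving from objects in the infrequent tier minus the monitoring fee.

    idle_fraction is the share of objects untouched for 30+ days (hypothetical input).
    """
    idle_gb = total_objects * avg_object_gb * idle_fraction
    saving = idle_gb * (STANDARD_PER_GB - INFREQUENT_PER_GB)
    fee = total_objects / 1000 * MONITORING_PER_1000_OBJ
    return saving - fee


def break_even_idle_fraction(avg_object_gb):
    """Idle fraction at which the saving exactly covers the monitoring fee."""
    fee_per_object = MONITORING_PER_1000_OBJ / 1000
    saving_per_object = avg_object_gb * (STANDARD_PER_GB - INFREQUENT_PER_GB)
    return fee_per_object / saving_per_object


# Example: an assumed average pack size of 5 MB (0.005 GB).
print(f"break-even idle fraction: {break_even_idle_fraction(0.005):.3f}")
```

Under these assumed numbers the monitoring fee is covered once a fairly small share of packs stays idle, so the bigger unknowns are probably the minimum storage duration and how often prune reads everything.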

cdhowie · May 18, 2019, 10:36pm

Yeah, it does depend on how often you prune (and restore, though restores should be rare anyway).

One trick is that prune does access all packs, which IIRC will kick them back into the frequent-access tier. I think any prune or rebuild-index will bring all packs back into the frequent-access tier for some amount of time.

TBH, I’d consider storing on B2 before I considered a non-standard S3 tier. It’s cheaper than both standard and IA for both storage and egress, and has no minimum storage durations. (The only case where S3 is cheaper is egress when restoring into AWS, but B2’s egress is 1/9th the standard AWS egress rate, which is insignificant considering how rare restores should be.)

mvance · May 18, 2019, 11:08pm

Thanks for the extra info and tip. I had never seriously considered B2 before. I knew it was popular here but wasn’t sure why. It makes more sense now.

From a quick search, it looks like the B2 durability SLA is similar to S3 even though it doesn’t spread data across multiple data centers (availability zones) like S3 does. It appears to only shard the data across multiple storage arrays (pods) in a single location.

One potential B2 downside for me is latency. I’m much closer to an AWS region than I am to a Backblaze data center.

I’ve had pretty good luck with Google Cloud Storage in the past (different from Google Drive Storage). B2 is still probably cheaper, so I understand why many people may prefer it.

Another one I’ve heard has good pricing is Wasabi, but I’ve never looked into it.

cdhowie · May 19, 2019, 1:15am

Indeed. I have on-site copies as well and use rclone to sync them to B2 for disaster recovery. This way, a B2 failure does not affect me unless the on-site copy fails at the same time. A B2 failure will be repaired by the next sync.

This can definitely justify S3’s higher rates! It’s a trade-off.

Wasabi works out to be $0.0009 per GB-month more than B2 but has no egress fees, which means it depends on how much you intend to restore. If you restore the entire repository once a year or less frequently, then B2 works out to be cheaper. If you only do partial restores (restore a specific snapshot), then the break-even point moves out further and B2 becomes even cheaper.

Wasabi also has a minimum storage duration of 90 days per object, which runs into the same problems as S3’s IA/intelligent tiers, only it’s three times as bad. Additionally, there is a 1TB per month minimum charge, so if your repository is less than 1TB you will be overcharged.
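
The Wasabi vs. B2 break-even mentioned above can be sketched roughly as follows. The $0.0009 premium comes from the post; the B2 egress rate is an assumption (1/9 of an assumed $0.09/GB AWS rate), and the function names are hypothetical. The model ignores Wasabi's 90-day minimum and 1TB minimum charge.

```python
WASABI_PREMIUM_PER_GB_MONTH = 0.0009  # from the post above
B2_EGRESS_PER_GB = 0.01               # assumption: 1/9 of an assumed $0.09/GB AWS rate


def b2_yearly_advantage(repo_gb, full_restores_per_year):
    """Positive result: B2 is cheaper over a year; negative: Wasabi is cheaper."""
    wasabi_extra_storage = repo_gb * WASABI_PREMIUM_PER_GB_MONTH * 12
    b2_egress = repo_gb * B2_EGRESS_PER_GB * full_restores_per_year
    return wasabi_extra_storage - b2_egress


def break_even_restores_per_year():
    """Number of full restores per year at which both cost the same."""
    return WASABI_PREMIUM_PER_GB_MONTH * 12 / B2_EGRESS_PER_GB


print(f"break-even: {break_even_restores_per_year():.2f} full restores per year")
```

With these assumed rates the break-even sits just above one full restore per year, which lines up with the rule of thumb in the post.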

jeffhallam · October 21, 2019, 9:59pm

Would someone mind confirming my syntax for use of the s3.storage-class option to the restic backup command:

I used /usr/local/bin/restic backup -o s3.storage-class=INTELLIGENT_TIERING "$BACKUP_DIR"