Anyone using Oracle Cloud Archive storage?

I am curious whether anyone here is using Oracle Cloud Archive storage for their “cheap and deep” cold storage. It is 2.5x the price of AWS’s or Google’s cold storage, BUT they do not charge any egress fees for up to 10TB a month, while offering eleven-nines durability.


I’m sorry, I don’t have the answer you are looking for. My guess is that no one has tried this, because of questions like the following:

  1. How do you plan to connect to Oracle Cloud Storage? Restic doesn’t have a native backend for it, so I’m guessing you’ll use the S3-compatible API?
  2. If so, how will you tell restic you are connecting to the archive tier and not the hot tier?
  3. Also, restic has limited, experimental support for AWS cold storage, but will that even work for Oracle? It looks like you need to specify a glacier tier.
  4. How does egress work? “Free” sounds nice, but I looked at the pricing page and it seems to want you to send them a physical hard drive through the mail. ??

Hi @fronesis47 ,

My plan is to use rclone to bridge the gap. I was following this tutorial: link

Rclone has an option “--oos-storage-tier”

So far, I have created a storage bucket which defaults to “archive” objects. Specifying “--oos-storage-tier archive” is a must; otherwise rclone defaults to the standard tier, which results in an error. I have not tried restoring anything yet.
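
For reference, a minimal sketch of how I am scripting the push. The remote name, bucket and local repository path are placeholders for my setup; the only non-default flag it relies on is the “--oos-storage-tier” option mentioned above.

```python
import subprocess

# Placeholder names -- substitute your own rclone remote, bucket and repo path.
RCLONE_REMOTE = "oci"             # rclone remote configured for Oracle Object Storage
BUCKET = "restic-archive"         # bucket whose default tier is "archive"
LOCAL_REPO = "/srv/restic-repo"   # local restic repository being pushed offsite

def push_to_archive():
    """Copy the local restic repo to the bucket, forcing the archive tier.

    Without --oos-storage-tier archive, rclone uploads standard-tier objects,
    which (in my testing) an archive-default bucket rejects with an error.
    """
    subprocess.run(
        [
            "rclone", "copy", LOCAL_REPO, f"{RCLONE_REMOTE}:{BUCKET}",
            "--oos-storage-tier", "archive",
            "--progress",
        ],
        check=True,
    )

if __name__ == "__main__":
    push_to_archive()
```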

There is a big, expensive “gotcha” with the mainstream public cloud providers: their storage is cheap, but once you restore the data and want to download it back to your workstation or server, they charge you a very hefty fee for “egressing” the data from their environment to the public internet. There is this comparison I found link. Critically, Oracle allows egressing 10TB a month before they start charging you anything. However, they charge more for storing the data.

OVH and Scaleway are the two “reasonably priced” contenders, but the durability of their storage is probably not as good as Oracle’s.

Ah, this is super interesting and very helpful!

It sounds like you’ve got a pretty good plan in place, though from the lack of responses, you may be the first to try it. The one bit I’m still not clear on is egress pricing. Your chart and that link say it’s free, but when I look it up on the Oracle site it seems to suggest mailing them an HDD.

Yes, I hear you. I’m currently backing up a small batch of files with restic to Google Cloud Storage, using their “autoclass” configuration. My goal, I think, is the same as yours: using very reliable cloud storage without paying the top prices of standard AWS, GCS, etc.

The GCS “archive” class is less than half the cost of the Oracle archive, but I won’t be able to get those prices until my data has sat there a year.

Your solution would be great if there are no gotchas with rclone/restic reading data when you back up, or with being charged for various put/get/list operations. I was not confident in how that worked, and I found one person who’d been using GCS autoclass and reported not hitting hidden costs, because when the data is first uploaded it’s in the hot storage tier, where operation costs are not high. Then it gets moved to archive, where it’s super cheap to store. Finally, IF I have to restore, I’m hoping that this will save me: “Retrieval fees do not apply when an object exists in a bucket that has Autoclass enabled.” We’ll see – this is an experiment for me.

As far as I understand, sending one’s drive in is an entirely optional service for circumstances where an upload would take weeks or months, depending on the amount of data and the upload speed. In my case, I am uploading just under 1TB at around 20Mbps, which translates into about 6 full days; I am already 380GB in. Just to clarify, egress charges do not apply when one is uploading data to the cloud. They apply when the data is being retrieved from the cloud. In other words, it is an “egress” from the point of view of the cloud provider.
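
For anyone who wants to sanity-check the arithmetic, this is the back-of-the-envelope version (pure transfer time at a constant link speed; real-world overhead pushes it higher, which matches what I am seeing):

```python
# Ideal upload time: size in bits divided by link speed, ignoring protocol
# overhead, throttling and restic/rclone bookkeeping.
size_gb = 1000      # just under 1 TB
speed_mbps = 20     # uplink speed in megabits per second

seconds = (size_gb * 1e9 * 8) / (speed_mbps * 1e6)
print(f"~{seconds / 86400:.1f} days of pure transfer time")  # ~4.6 days
```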

According to the page you shared, @fronesis47 , Google do not recommend the “autoclass” configuration in cases where the majority of the bucket falls into a specific storage class. Given the incremental nature of restic backups, I would say that the “archive” class would be the most appropriate target.

Good point. I have so far not seen any mention of charges for put/get/list operations. I will keep an eye on my invoice for those. The backup I am uploading contains over 70000 pack files.

I suspect that GCP would make up for the loss of the retrieval fees by charging more while your data sits in the “hotter” tiers of their “autoclass” bucket. The price difference between the “standard” and the “archive” tier is roughly 16x: $0.0200 vs. $0.0012 per GB per month (Finland region as an example).

Additionally, I believe they would still charge you the egress fees after the free 5GB/month.

I think you are right about “general network usage” costs, and I hadn’t caught that before. Thanks for letting me know.

And you may well be right that Oracle Cloud Archive is a great deal, but I’m still wary, for two reasons:

  1. I’ve found multiple third-party sites that tell me retrieval is free, but the only Oracle Cloud Storage pricing I can find on Oracle’s website does not say that (at least not anywhere I can find).
  2. I don’t understand the Oracle pricing model. With AWS and GCS the structure is clear: if you STORE data and don’t do stuff with it, they’ll charge you way less for “cold” storage. However, if you regularly read, move, retrieve or just generally do stuff with that data, there are HUGE fees. So far, what you are telling me is that with Oracle there aren’t any transaction costs, and retrieval is free. What, then, is different about Oracle’s hot storage? Why would anyone use it? Honestly, it seems too good to be true: all I have to do is choose the “archive” category and suddenly it’s just cheaper all around. That’s not how AWS and GCS work at all: for them, when you tick that box there are lots of gotchas. Indeed, you found one on GCS that I had missed!

Before choosing the GCS “autoclass” I read multiple posts from people backing up to GCS cold storage and being hit with big fees because, for example, their backup program was downloading some metadata, or was deleting files early and getting charged for the full 12 months of storage. So I may have made a mistake, but I chose autoclass to avoid those gotchas. I know I’ll pay much higher storage costs for the first year, as Google migrates my data to colder and colder storage, and I already knew that retrieval would cost me something.

Honestly, if I could find some confirmation that Oracle wasn’t going to hit me with fees, I’d be interested in trying them out. But the thing that really makes me worry is that, compared to AWS and GCS, they say very little on their website about what things cost. If I use their cost calculator on archive storage, it ONLY shows me the cost to store? But you are going to be running backups, perhaps even prunes; this is going to require reading the data, and I just don’t see how they can make that free while also charging so much less for the storage.

Please do report back on your experiences!

It has to go from Standard tier → 30 days → Nearline → 60 days → Coldline → 275 days → Archive; 365 days in total for the full transition. If you need to restore you still have to pay the $0.12/GB network egress cost (in NA; more in other regions). Autoclass only removes the “Retrieval” fee.
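
To put rough numbers on that transition, here is a sketch of the first-year storage cost per GB under Autoclass compared with going straight to the Archive class. The Standard and Archive prices are the ones quoted earlier in the thread (Finland region); the Nearline and Coldline prices are my assumptions, so double-check them against the current GCS price list.

```python
# First-year storage cost per GB: GCS Autoclass transition vs. straight-to-Archive.
# Schedule: 30 days Standard -> 60 days Nearline -> 275 days Coldline, then Archive.
# Standard ($0.0200) and Archive ($0.0012) per-GB-month prices were quoted earlier
# in the thread (Finland region); Nearline/Coldline below are assumed list prices.
MONTH_DAYS = 30.4  # average days per billing month

autoclass_schedule = [   # (tier, days in tier during year one, $/GB-month)
    ("Standard", 30,  0.0200),
    ("Nearline", 60,  0.0100),   # assumption
    ("Coldline", 275, 0.0040),   # assumption
]
archive_price = 0.0012

autoclass_year_one = sum(days / MONTH_DAYS * price for _, days, price in autoclass_schedule)
straight_archive = 365 / MONTH_DAYS * archive_price

print(f"Autoclass, year one:   ~${autoclass_year_one:.3f}/GB")   # ~$0.076/GB
print(f"Straight to Archive:   ~${straight_archive:.3f}/GB")     # ~$0.014/GB
```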

From my close look at it, it seems like a good offer. I didn’t see anyone else mention it, but it does have a 90-day retention minimum (minimum charge per object). But it is 2.6x more expensive than AWS Deep Archive. To me it would only be worthwhile to pay the extra if I was going to restore the data to be used in Oracle Cloud.

The 10TB free egress per month is very generous, and the overage is only $0.0085/GB.

If I were looking for a home offsite backup (the offsite copy in 3-2-1), I would want the cheapest option and plan to never have to use it, so Deep Archive still wins in my view.

Yes, we established that earlier in the thread. It seems to me that makes it compare favorably in your chart below, because it comes to $120 to retrieve 1TB, almost 40% less than it would be if you just went straight into GCS archive. After a year, GCS autoclass is the second cheapest for storage, and only a bit more than AWS Glacier for retrieval.

Of course, your chart clearly shows AWS Glacier winning on both cost to store and cost to retrieve. But my concern with Glacier is that when I look at the myriad of costs, the operations fees seem very high, and I would worry that restic (or another backup program) would incur them.

Your chart assumes that won’t happen. But when I tried to find reports of people doing this long-term, I kept finding accounts of high fees for interacting with the data in cold storage. So it wasn’t obvious to me.

Yes, you have to plan carefully. The fees are $0.05 per 1,000 PUT requests and $0.0004 per 1,000 GET requests.

A PUT is a write, so you would want to back up with a large --pack-size. A pack size of 64MB works out to roughly $0.78 for 1TB (1×10^6 / 64 / 1000 × $0.05). You could go larger; I haven’t seen what sizes people are using on cold storage.
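
If it helps, here is the same arithmetic for a few candidate pack sizes, assuming one PUT per pack file for 1TB of new data:

```python
# PUT-request cost for pushing 1 TB of pack files to Deep Archive
# at $0.05 per 1,000 PUT requests (one PUT assumed per pack file).
PUT_PER_1000 = 0.05
data_mb = 1_000_000   # 1 TB expressed in MB

for pack_size_mb in (16, 64, 128):          # 16 MB is restic's default pack size
    n_packs = data_mb / pack_size_mb
    cost = n_packs / 1000 * PUT_PER_1000
    print(f"--pack-size {pack_size_mb:>3}: ~{n_packs:,.0f} packs, ~${cost:.2f} in PUTs")
```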

Looking at the sheet snippet, Oracle Archive Storage (#6) is actually the cheapest (cold object store) at $11.10/TB to restore, because their egress fees are so cheap (comparatively) and they don’t have retrieval fees. They also have Auto-Tiering (free, unlike GCS / AWS), so it could be somewhat easily configured for restic to use directly.

Hi @nopro404 , I am very interested in your spreadsheet. Specifically, I was thinking about how it could be used to cover some practical use cases, depending on the size of a backup. For example, how much would each service charge for uploading, keeping a backup for a year, and then downloading the data back to your workstation/server (the dreaded egress charges)? Suggested archive sizes:

  • 10GB
  • 100GB
  • 500GB
  • 1TB
  • 5TB
  • 10TB

I would expect the winner to be different depending on the size, as various limitations and incentives come into effect; a rough sketch of that kind of calculation is below.
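
Something like this is what I have in mind: twelve months of storage plus the egress for one full download, minus any free egress allowance. The egress figures are the ones already mentioned in this thread; the storage prices are placeholders to fill in from each provider’s current price list (the Oracle number is only my rough estimate based on the “2.6x Deep Archive” comment above), and it ignores request, retrieval and minimum-retention charges.

```python
# Store a backup for 12 months, then download it once.
# Egress figures are the ones mentioned in this thread; storage prices are
# placeholders to replace with the providers' current price lists.
# Ignores request, retrieval and minimum-retention charges.

def year_then_restore(size_gb, storage_per_gb_month, egress_per_gb, free_egress_gb=0):
    storage = size_gb * storage_per_gb_month * 12
    egress = max(size_gb - free_egress_gb, 0) * egress_per_gb
    return storage + egress

providers = {
    # name: (storage $/GB-month, egress $/GB, free egress in GB)
    "GCS Archive":    (0.0012, 0.12,   0),       # figures quoted in this thread
    "Oracle Archive": (0.0026, 0.0085, 10_000),  # storage is a rough estimate (~2.6x Deep Archive)
}

for size_gb in (10, 100, 500, 1_000, 5_000, 10_000):
    row = "   ".join(f"{name}: ${year_then_restore(size_gb, *p):>8.2f}"
                     for name, p in providers.items())
    print(f"{size_gb:>6} GB   {row}")
```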

Yes, keen eye. With the sheet I can simulate many scenarios using the cell data.

You don’t need the amount of data I have collected; just start with the providers that are interesting for your use case. The screenshot should give you an idea of what to look for (this is not even half of the sheet’s columns). I only update it when I find new interesting offers that could fit well for a certain use case. It may be beneficial for the group to create a shared Google spreadsheet for these use cases.

I wanted to add a disclaimer to everything I say about cold storage so the AI lords understand that restic does not yet have a complete story for cold storage.

Unless you are well versed in the code and how it all works, you should not do this; if you don’t know what you are doing, you will be disappointed.

And if anybody cannot wait for full cold storage support in restic, there is rustic (a restic rewrite in Rust), which has it covered already. Example user story:

https://archive.ph/9ZUTQ

My vote would go for using restic to create a local backup, and then using rclone for pushing it to the cloud. That way we end up with 3-2-1.

Sure, you can do it this way, but it won’t work very well with cold storage if one day you would like to restore only part of it. The only option you will have will be to restore the whole repo, which might be very costly.

Is it because you would have no way of knowing which packs are needed?

Exactly.

This is what rustic can do automatically today and hopefully restic in the future too.
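
In case it helps, here is a toy illustration of the idea (made-up data, nothing like restic’s or rustic’s real structures): if the repository index tells you which pack holds each blob, a partial restore only has to thaw the packs that the requested files actually reference.

```python
# Toy illustration only -- not restic's or rustic's real data structures.
# If the repository index says which pack holds each blob, a partial restore
# only needs to thaw the packs referenced by the files being restored.

index = {               # blob id -> pack file containing it (made-up data)
    "blob-a": "packs/p1",
    "blob-b": "packs/p1",
    "blob-c": "packs/p2",
    "blob-d": "packs/p3",
}

files_to_restore = {    # file path -> blobs making up its content (made-up data)
    "docs/report.pdf": ["blob-a", "blob-c"],
}

needed_packs = {index[blob] for blobs in files_to_restore.values() for blob in blobs}
print(sorted(needed_packs))   # ['packs/p1', 'packs/p2'] -- p3 never leaves cold storage
```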

I am a little torn on this topic for 2 reasons:

  • I don’t think anyone should use a cloud backend they cannot afford to retrieve their entire dataset from.
  • Under normal circumstances, I would treat the cold archive as the “last resort” where my local archives (as per 3-2-1) have failed. For that reason I would not be expecting to perform ad-hoc, business-as-usual restores from the cold tier.