We are working on migrating from Borg to Restic.
One thing that is nice about Borg is the RSA-key-level access management, which allows limiting keys to append-only. Another key, used only in an isolated setup, then has full access in order to purge old backups and run consistency checks.
Is there any way to achieve something similar with Restic and an S3 backend?
It appears that the required write permissions always come with delete as well.
While some providers, like Wasabi, have time-based file protection, this feature does not provide full protection given that backups are incremental.
Using S3 object versioning would prevent any purging of old backups.
You need delete permission only on the locks/ folder. This thread about S3 policies may be useful.
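A sketch of such a policy, expressed here as a Python dict so it can be serialized to JSON. The bucket name is a placeholder for your own, and the exact set of actions is an assumption based on the approach described above: read/write everywhere, but s3:DeleteObject only under locks/ so restic can remove its own lock files and nothing else.

```python
import json

# Hypothetical bucket name; substitute your own repository bucket.
append_only_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowRepoReadWrite",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::my-restic-bucket",
                "arn:aws:s3:::my-restic-bucket/*",
            ],
        },
        {
            # Delete is granted ONLY on the locks/ prefix.
            "Sid": "AllowLockDeletion",
            "Effect": "Allow",
            "Action": ["s3:DeleteObject"],
            "Resource": ["arn:aws:s3:::my-restic-bucket/locks/*"],
        },
    ],
}

print(json.dumps(append_only_policy, indent=2))
```

Note the caveat discussed below: PutObject still allows overwriting existing objects, so this alone is not full append-only protection.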
Edit: I wasn’t aware that PutObject allows objects to be overwritten.
Everything I have found points to the fact that the required write permissions automatically come with modify and delete permissions for S3.
Are you saying this is not the case?
Exactly. The link shows how to properly set the S3 permissions to create an append only repository.
As far as I remember, “PutObject” also allows you to overwrite objects. Even without being able to delete data, it would be enough to overwrite essential repository files.
From what I can tell, the only real solution would be to use object locking, but that isn’t implemented on the restic side so far. (Maybe some combination of Object Versioning + Not being able to delete files might work too)
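For reference, S3 Object Lock is configured per bucket with a default retention rule. Below is a sketch of that configuration, with hypothetical values (30-day COMPLIANCE retention), in the shape accepted by boto3's `put_object_lock_configuration`. As noted above, restic has no support for this, so pruning would have to wait out the retention window.

```python
# Hypothetical values: 30-day retention in COMPLIANCE mode.
# In COMPLIANCE mode nobody, including the root account, can delete
# or overwrite a locked object version before retention expires;
# GOVERNANCE mode allows bypass with a special permission.
object_lock_config = {
    "ObjectLockEnabled": "Enabled",
    "Rule": {
        "DefaultRetention": {
            "Mode": "COMPLIANCE",  # or "GOVERNANCE" for bypassable locks
            "Days": 30,
        }
    },
}
```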
I know a fair bit about AWS and S3, but I’m not sure I fully understand what you’re trying to achieve. Can you restate what you’re trying to achieve at a high level? Leave out any implementation methods or ideas, just the what for now.
We would like to secure backups to the point where the node creating them cannot delete, modify, or overwrite any files.
With borg this is achieved by granting append-only permissions. So it can only read and write new files and nothing else.
With S3 you would achieve that by:
- turning on bucket versioning - when bucket versioning is enabled, all versions of objects are kept; deleting an object simply marks that version as deleted. It can still be retrieved.
- turning on MFA delete
- (optional, instead of MFA delete) do not grant the user / role any delete / remove actions in their permissions policy. This means they can’t even mark a version of an object as deleted.
Let me know if you need a bucket policy, I’m a little busy today.
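A minimal sketch of the third option above, with a hypothetical bucket name: a backup-user policy that simply omits every delete-related action, alongside the versioning configuration you would apply to the bucket (e.g. via boto3's `put_bucket_versioning`). With versioning on, even an overwrite just stacks a new version on top of the old one.

```python
# Hypothetical bucket name; note there is no s3:DeleteObject or
# s3:DeleteObjectVersion anywhere in this policy.
backup_user_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::my-backup-bucket",
                "arn:aws:s3:::my-backup-bucket/*",
            ],
        }
    ],
}

# Versioning configuration for the bucket itself.
versioning_config = {"Status": "Enabled"}
```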
So when versioning is enabled, deleting will still delete files?
I updated my post above. When bucket versioning is enabled, deleting a file simply marks it as deleted; it’s not actually removed. You can still restore or work with that object version.
Is there a way to have a user with permissions that can still delete it?
This is how we have it with borg. The backup user can only append but there is a master user that can delete old backups based on a schedule.
Yes, other users can have different permissions that include delete permissions. If you’re doing that you might like to use versioning and user permissions rather than MFA delete.
There are also lifecycle management rules that can delete things on a schedule, but I’d be careful with that for a bucket of backups. I would only delete old versions a couple of years after they become non-current, and I would ensure you run thorough restore tests every six months that restore every file. It’d be safer not to use this feature.
Given that we have incremental backups, some files over a year old might still be needed, while some files only a month old might no longer be needed.
Given that data is changing a lot in our case (TBs/day) we need to be able to delete old files and not just mark them as deleted - otherwise, we run out of space or incur big costs.
Do you see any solution to that? As I understand it, with versioning enabled, restic with an admin user won’t be able to delete those obsolete files either, no?
That’s correct, with versioning enabled anything deleted is simply marked deleted. You won’t run out of space in S3.
I have two ideas that might work here. I’m not seeing Restic as the best tool for this use case.
Option One - Restic / S3
- Two AWS users: one with full S3 read / write / delete access, one with read / write only
- S3 set up with versioning
- S3 lifecycle rule set up to remove deleted objects after a period you think suitable. Personally I would make this longer than the interval between restore tests
- Limited user runs backups. The other user runs prune / delete type functions
- Lifecycle rule removes deleted files to reduce costs
I would keep everything in the S3 IA storage class, but you may transition objects to the Glacier Instant Retrieval class after an interval where you’re pretty sure they won’t be deleted for six months. Once something goes into the Glacier classes you’re charged for 3-6 months of storage regardless of whether it’s still there or deleted.
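The lifecycle rules for Option One could be sketched as follows, with hypothetical intervals (400 days before purging non-current versions, 180 days before moving to Glacier Instant Retrieval), in the shape accepted by boto3's `put_bucket_lifecycle_configuration`:

```python
# Hypothetical intervals; tune both to your restore-test schedule.
lifecycle_config = {
    "Rules": [
        {
            # Permanently remove versions some time after they were
            # deleted or overwritten (i.e. became non-current).
            "ID": "expire-deleted-versions",
            "Status": "Enabled",
            "Filter": {},
            "NoncurrentVersionExpiration": {"NoncurrentDays": 400},
        },
        {
            # Move current objects to a cheaper class once they are
            # unlikely to be pruned within Glacier's minimum-charge window.
            "ID": "to-glacier-ir",
            "Status": "Enabled",
            "Filter": {},
            "Transitions": [{"Days": 180, "StorageClass": "GLACIER_IR"}],
        },
    ],
}
```

Because the first rule only touches non-current versions, objects that restic still references are never removed by it; only data that the admin user has already deleted (or that was overwritten) eventually expires.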
Option Two - S3 native (restic not used)
- S3 bucket has versioning turned on (optional: protection against malware / attackers). If you do this, ensure an SCP is used to prevent bucket versioning being turned off.
- Limited user uploads files to S3 directly using the AWS CLI v2 or one of the many S3 clients available. Store in the IA class.
- Admin user marks files as deleted - or, if versioning is off, deletes them directly.
- Lifecycle rule removes deleted files to reduce costs
- Second lifecycle rule moves objects from the IA class to one of the cheaper Glacier classes after an interval you define
I would probably recommend option two in your case. If you’re comfortable with bucket versioning turned off this is the cheaper option, but you don’t have that safety net.
Thank you. So it looks like there is no proper solution for this with S3.
I’m afraid that Option Two won’t work given the amount of data we have. We do need incremental and chunk de-duplication which restic brings to the table.
I fear that Option One won’t work either, given that files over a year old may still be valid and required since they are part of recent incremental backups. Only restic knows which parts/files are no longer required. Also, given that write access allows overwriting files, this is not secure.
If “it” is securing your repo from attempts to delete or manipulate repo content, then there is a solution: restic plus rclone in append-only mode (`rclone serve restic --append-only`). But the computer running rclone must be very well protected, of course.
Option One could work. You would set up lifecycle rules to only remove deleted files, rather than files still in use.
Versioning should mean overwriting files isn’t a big issue. Say your machine doing the backup is hit by cryptoware and everything is encrypted. You restore the files in S3 to a given point in time and you’re OK, so long as you haven’t deleted the old versions.
S3 is as secure a file store as you will find. If it’s not secure enough, I’m not sure what you will find that’s better.
In this instance, we would need another server that can hold all the data, no?
Is restic still able to delete old data (which is no longer required) with append-only?
Does restic touch all the files (updating the timestamp) on every backup run which are still used/required?
How does S3 know which ones are safe to delete (outside of the backup retention cycle, given them being incremental)?
I don’t understand the internals of Restic as well as others here, but my understanding is that no, it doesn’t touch the timestamps on each access. It definitely doesn’t in S3; it would have to rewrite the objects to do that. I think it hashes the blocks and puts the hash in the filename.
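The content-addressed naming mentioned above can be sketched like this (a simplification of what restic actually does, but the key property holds: files in the repository are named after the SHA-256 hash of their contents, so nothing is ever rewritten in place and timestamps never need updating):

```python
import hashlib

def object_name(content: bytes) -> str:
    """Name a repository object after the SHA-256 hash of its content."""
    return hashlib.sha256(content).hexdigest()

name = object_name(b"some packed backup data")
print(name)  # 64 hex characters

# Identical content always maps to the same name, so a re-upload of
# unchanged data is a no-op rather than a timestamp-touching rewrite.
assert object_name(b"some packed backup data") == name
```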
Why would you need another server to hold all the data?
If you really want “append only”, buy a tape drive. Otherwise, I would suggest you think carefully about your requirements, document them, have them agreed, then bring them back to us for some ideas if you like. Once you can clearly express your requirements succinctly, it’s easier to find a solution.