S3 Immutable Backups for "ransomware" recovery

Hello,

I know that Restic can only go so far in terms of “ransomware-proof” backups (it cannot, for example, defend against a compromised client uploading corrupted objects). However, I would still like to protect my historic/existing AWS S3 backups with the methods that are available (without a dedicated backup server).

This is mainly:

  • grant only limited permissions to delete objects. My first tests showed that, for now, I only need to allow deletion of lock files (see the policy below).
  • turn on mandatory, MFA-enforced versioning (and maybe verify that certain objects are never modified, for example the data blobs?); a sketch of enabling this follows this list
  • turn on a background copy to Glacier / Deep Archive
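
For the MFA-enforced versioning, a minimal sketch of what I have in mind (as far as I know MFA Delete has to be enabled by the bucket owner’s root account with its MFA device; the account ID and device name below are placeholders):

# enable versioning plus MFA Delete on the bucket (run as the root account)
aws s3api put-bucket-versioning \
    --bucket restic-demo \
    --versioning-configuration Status=Enabled,MFADelete=Enabled \
    --mfa "arn:aws:iam::111122223333:mfa/root-account-mfa-device 123456"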

There is also a setting you can apply to a bucket to protect all objects for at least x days. I would use something like 14 days, however with the lock files that’s not a good fit.
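
The setting I mean is, I believe, S3 Object Lock with a default retention period. A rough sketch of configuring it (this assumes the bucket was created with Object Lock enabled; the mode and the 14 days are just placeholder choices):

# set a default retention of 14 days for all new object versions
aws s3api put-object-lock-configuration \
    --bucket restic-demo \
    --object-lock-configuration '{"ObjectLockEnabled": "Enabled", "Rule": {"DefaultRetention": {"Mode": "GOVERNANCE", "Days": 14}}}'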

So I wonder: has this been considered? Could you, for example, specify a second bucket for the immutable parts of an upload?

Here is an S3 policy which provides at least some basic protection, but I don’t know which cleanup or other operations break with it.

My idea is to run the cleanup from a trusted machine which has broader permissions to delete objects once they are old enough (a sketch of its extra permissions follows the policy below). It could also do some random probing for corrupted data to detect attacks early.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowObjects",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject"
            ],
            "Resource": "arn:aws:s3:::restic-demo/repo1/*"
        },
        {
            "Sid": "AllowDeleteLocks",
            "Effect": "Allow",
            "Action": [
                "s3:DeleteObject"
            ],
            "Resource": "arn:aws:s3:::restic-demo/repo1/locks/*"
        },
        {
            "Sid": "AllowBucket",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetBucketLocation"
            ],
            "Resource": "arn:aws:s3:::restic-demo"
        }
    ]
}
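
For the trusted cleanup machine I would then add a broader statement on top of this policy, roughly along these lines (untested sketch, just to illustrate the idea):

{
    "Sid": "AllowCleanupDeletes",
    "Effect": "Allow",
    "Action": [
        "s3:DeleteObject",
        "s3:DeleteObjectVersion"
    ],
    "Resource": "arn:aws:s3:::restic-demo/repo1/*"
}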

BTW1: I wonder whether using locks on S3 is very reliable anyway. Since S3 has no atomic operations, especially not if versioning is turned on, would --no-lock, or contacting a small locking web service as a “witness”, be an alternative?

BTW2: can I control when and how blobs are “packed” by such limited clients? Maybe enforce collecting a larger amount of locally prepared blobs (packs) to reduce rewrite activity? (Assuming that re-packing blobs would also require deleting the old packs?)

The backup command only adds new files to a repository (and deletes its own lock files), so the policy should work. I think there have also been several other posts with policy examples; you might want to take a look at those.

I’m not sure I understand what you mean here. AFAICT file uploads are atomic, so either the upload succeeds or it doesn’t. The more relevant question for locking is the ordering guarantees provided by the backend. By now, S3 offers strong consistency (Amazon S3 | Strong Consistency | Amazon Web Services), which should prevent locking problems.

Which problem are you trying to solve here? It sounds like it’s related to the prune command. Since restic 0.12.0 you can tell the prune command to only delete pack files which are no longer used at all, that is, which contain no used blobs anymore.
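
If I recall correctly that is the --max-unused option, e.g.:

# keep partially used pack files as they are; only delete packs that are completely unused
restic prune --max-unused unlimited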

Thanks for the confirmation, that is what I was wondering (i.e. no re-packing during backup). The functional tests I did confirm it.

Yes they are, S3 guarantees this part. Your link also claims this for “list after write”, however I was under the impression there are exceptions; for example, the PutObject API documentation states:

Amazon S3 is a distributed system. If it receives multiple write requests for the same object simultaneously, it overwrites all but the last object written.

and

When you enable versioning for a bucket, if Amazon S3 receives multiple write requests for the same object simultaneously, it stores all of the objects.

so I am not sure, especially given the fact that there is no policy (that I could find) to deny overwriting of existing objects (which I thought is because, when accepting the write request, S3 does not know whether it is an overwrite or not).

Anyway, the main problem with the locking in my case would be a policy which requires objects to have a certain age before they can be touched.

I noticed there is a flag to turn locking off. If I take care of coordinating pruning and backups myself, would turning locking off have other disadvantages? The documentation is not specific about which features depend on it.

What you’ve done looks reasonable to me. I don’t even go that far, I just turn bucket versioning on, because then I can recover any version of any object: a plain delete only adds a delete marker, it doesn’t actually remove anything (only an explicit s3:DeleteObjectVersion on a specific version does). You can use lifecycle rules to delete non-current / deleted versions of objects over a given age as well; I tend to set that to one year because I do restore tests every six months.
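
As a sketch, the lifecycle configuration I mean looks roughly like this (applied with aws s3api put-bucket-lifecycle-configuration; the one-year value is just my choice):

{
    "Rules": [
        {
            "ID": "expire-old-versions",
            "Status": "Enabled",
            "Filter": { "Prefix": "" },
            "NoncurrentVersionExpiration": { "NoncurrentDays": 365 },
            "Expiration": { "ExpiredObjectDeleteMarker": true }
        }
    ]
}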

I don’t move objects to Glacier / Deep Archive, as a restore would then be tricky. If you have a really large repository it might be worthwhile, but if you want to do a restore test you may have to restore the whole bucket, which could be expensive. Because of that I only transition to the IA storage class.

I have mentioned restore tests twice. My opinion is that if you don’t test your restores, you can’t trust them. I restore different files in each restore test, usually something easy to validate, like an image to look at or a document to view, but something with a known SHA hash would be even better.

I actually back up to a repo on my local PC, then I “aws s3 sync” that up to S3. That gives me a local copy and more flexibility, but it may not be practical for large backups.
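
In case it helps, the whole workflow is basically just the following two commands (paths and bucket name are placeholders):

# back up into a local repository, then mirror it to S3
restic -r /backups/repo1 backup /home/me
aws s3 sync /backups/repo1 s3://restic-demo/repo1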

Another note that probably doesn’t apply here: transitioning objects has a per-object cost that can sometimes outweigh the storage savings. For the ~5 MB pack files Restic produces it probably makes sense, but for CloudTrail-sized objects it’s cheaper to keep them in the S3 Standard tier. Obviously you only want to transition objects in the “data” folder, not the indexes, snapshots, etc.
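
A sketch of such a transition rule, limited to the data folder (the prefix depends on where the repository lives in the bucket; 30 days and the IA class are just example values):

{
    "Rules": [
        {
            "ID": "transition-data-to-ia",
            "Status": "Enabled",
            "Filter": { "Prefix": "repo1/data/" },
            "Transitions": [
                { "Days": 30, "StorageClass": "STANDARD_IA" }
            ]
        }
    ]
}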

Would using an immutable backend like Wasabi be a workaround?

I haven’t tried it, but if it is really immutable and works, it would be a good solution. Not for my specific case, since I use AWS for the European data centre and because I already have the contracts in place, but if somebody is looking for a good solution I think it would be fine (I haven’t heard anything negative).

Having said that, with the info from the other answers I would consider the backup operation to S3 also “reasonably immutable” (with no locking) :slight_smile: Can you describe how a prune on Wasabi would differ?

Sadly, I can’t; I have just heard from friends that Wasabi and its immutability “works” for them, but in other, non-restic contexts. My suspicion is that it’s a copy-then-write-over-the-copy, so it could be that you have a configurable number of previous copies of things to go back to if you need them, or something along those lines.

I also don’t know what this does to the amount of data that is actually stored or if that explodes on you.

I have used Wasabi and Amazon S3, and prefer Wasabi for pricing. The link below is a Wasabi article comparing them to S3, which claims that setting up immutability is easier there (but not unique to them). I don’t/haven’t used that feature, but when I considered going with versioning, it looked to me like S3 was simpler and had more features.
https://wasabi-support.zendesk.com/hc/en-us/articles/115001684271-What-is-the-difference-between-Amazon-S3-and-Wasabi-

Wasabi now has a European data center in Amsterdam.


The --no-lock option of restic is only honored by commands for which it is safe to do so. The backup and prune commands do not belong to that category. In other words, it is not possible to disable lock file creation for the backup and prune commands (and a few others).

To add, I’m using versioning and MFA delete on my restic bucket. That way, restic prune (or any rogue client/ransomware) can ‘delete’ any objects it no longer needs (add a delete marker), but nothing will be truly deleted unless I do so manually, secured with MFA.

How does this work with the lock files? Do you just delete those many tombstones regularly with an MFA-authorized cleanup?

Yes, I used to clean those using MFA.

I have now moved to a slightly different setup, since the MFA delete was quite cumbersome as it is tied to the bucket owner. I have replaced it with a bucket policy that requires MFA for s3:DeleteObjectVersion, regardless of the user.
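
From memory it looks roughly like this, so treat it as a sketch rather than the exact policy (the bucket name is a placeholder):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyVersionDeleteWithoutMFA",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:DeleteObjectVersion",
            "Resource": "arn:aws:s3:::restic-demo/*",
            "Condition": {
                "BoolIfExists": { "aws:MultiFactorAuthPresent": "false" }
            }
        }
    ]
}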


The AWS S3 Glacier Instant Retrieval storage class is somewhat relevant to this thread. I do my backups locally, then use “aws s3 sync” to push the files up using that storage class. You can also run a lifecycle rule to transition data to that storage class, but you pay a per-object transition fee.
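
For example, roughly like this (paths are placeholders; GLACIER_IR is the storage class name the CLI expects, as far as I know):

# push the data folder with Glacier Instant Retrieval, everything else with the default class
aws s3 sync /backups/repo1/data s3://restic-demo/repo1/data --storage-class GLACIER_IR
aws s3 sync /backups/repo1 s3://restic-demo/repo1 --exclude "data/*"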

You would generally want your data in that class, but not your index / etc.

It would be nice to have Restic aware of AWS S3 storage classes, but I’m not sure whether Restic is OK with having provider-specific features. Being able to choose different storage classes for data and metadata would be super useful and save storage costs.