Ransomware protection

As suggested by @kapitainsky, I'll open a new topic here to discuss some best practices to keep a repository protected from ransomware or bad users trying to destroy everything.

Currently I'm testing a Google Cloud Storage bucket with a bucket lock set to 6 months.
A locked bucket is essentially WORM: I can create new files, but after that no one can delete or alter them until the lock expires.

Neither restic nor Kopia works in this mode, because their lock files need to be updated.
Rustic works as expected.

I did some tests creating full backups, and rustic is unable to forget/prune snapshots as they are locked on the Google side. From the Google console, I'm unable to delete the repository without manually removing the lock first.

But, as per @kapitainsky's question in the other thread: what would happen after the lock expires? An attacker could easily delete expired files.

But:

  1. not exactly, because the service account used is write-only and deletions are denied (but yes, someone could overwrite files with garbage)
  2. if the 6-month-old backup is destroyed because it is unlocked, newer backups are still valid (and locked), so honestly I don't see the issue… losing very, very old backups is not an issue for anyone.

This is not how it works. Backups are not stored in separate files for every snapshot. After your lock period it is even possible that all your backups will be destroyed if files are deleted, as they probably share some files that are 180 days old (not protected anymore). To protect backups this way you would have to keep all files locked forever. Your approach will only work if you create a new repository every time :)

I wasn't aware of this, and you are right.
Is it the same even with Kopia?

No. With Kopia all files are protected, because every time full maintenance runs, Kopia extends the lock on all active objects.

I will explain in the next post how to do this for restic.

How to create a restic-based backup with ransomware protection. I will use iDrive S3 as an example (I use it myself for my backups), but it can work with any remote supporting object lock, versioning and storage lifecycle rules.

  1. Create your backup destination bucket. I enable object lock for 7 days (this is how long we will have full protection).

I use compliance mode, but be careful with it: nothing, not even our account root, can delete such objects or the bucket until all objects expire. The only way out is to terminate the whole account :) So experiment with short locking periods and a limited amount of data first, before you decide to back up all your TBs.

In addition I create a lifecycle rule to delete old object versions older than 7 days, otherwise our repo will grow forever.
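
For reference, here is roughly how this setup looks with the aws cli. This is only a sketch: the endpoint URL and bucket name are placeholders (like in the example further down), and most providers let you do the same through their web console.

# create the bucket with object lock enabled (this also enables versioning)
aws --endpoint-url=https://url/ s3api create-bucket \
  --bucket <bucket_name> \
  --object-lock-enabled-for-bucket

# default retention of 7 days in compliance mode for every new object
aws --endpoint-url=https://url/ s3api put-object-lock-configuration \
  --bucket <bucket_name> \
  --object-lock-configuration '{"ObjectLockEnabled":"Enabled","Rule":{"DefaultRetention":{"Mode":"COMPLIANCE","Days":7}}}'

# lifecycle rule: remove versions 7 days after they become non-current
aws --endpoint-url=https://url/ s3api put-bucket-lifecycle-configuration \
  --bucket <bucket_name> \
  --lifecycle-configuration '{"Rules":[{"ID":"expire-old-versions","Status":"Enabled","Filter":{},"NoncurrentVersionExpiration":{"NoncurrentDays":7}}]}'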

  2. Create access keys with full read/write access for this bucket.

  3. Now I can use this bucket for my restic repo. Everything should work: restic backup/forget/prune/restore. Effectively there is no difference from a normal S3 backend. What is different is that, thanks to lock/versioning, any file deletion or change creates a file version protected for 7 days (it cannot be removed until the lock is released after 7 days). Even if a rogue user or virus deletes all files from our bucket, everything is still remembered in the objects' versions.

  4. Here a bit of DIY is needed, unless it is supported by the backup program. From time to time (at an interval shorter than the locking period, ideally every time we run backup or prune) we have to run a script to extend the lock time of all active objects in our bucket. Its functionality in pseudocode:

  • list (using the aws cli) all active objects in our bucket (leave any versions alone)
  • use the aws cli to extend the locking period of every object so it is always 7 days from now

For S3 it would be something like:

aws --endpoint-url=https://url/ \
  s3api put-object-retention \
  --bucket <bucket_name> \
  --key <object_key> \
  --version-id <version_ID> \
  --retention Mode=<retention_type>,RetainUntilDate="<retain_until_date_and_time>" \
  --bypass-governance-retention

If an object version is already locked in compliance mode, we can only extend the lock by setting a new retain-until date and time that is later than the current one. This behaviour of compliance-mode objects prevents anything or anybody from shortening the last set locking period, but we can always renew/extend it.
Which is exactly what we need. This way we keep all active objects locked for 7 days at all times, and allow past versions to expire and be cleaned up by the lifecycle rule after 7 days.
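
A minimal sketch of such a script, assuming the aws cli and GNU date are available and that keys contain no spaces (endpoint and bucket name are placeholders as above):

#!/bin/sh
# extend the compliance lock of every active (current) object to now + 7 days
RETAIN_UNTIL=$(date -u -d "+7 days" +%Y-%m-%dT%H:%M:%SZ)

# list only current objects - past versions are left alone
aws --endpoint-url=https://url/ s3api list-objects-v2 \
  --bucket <bucket_name> --query 'Contents[].Key' --output text |
tr '\t' '\n' |
while read -r key; do
  # setting a later RetainUntilDate is always allowed, even in compliance mode
  aws --endpoint-url=https://url/ s3api put-object-retention \
    --bucket <bucket_name> --key "$key" \
    --retention "Mode=COMPLIANCE,RetainUntilDate=$RETAIN_UNTIL"
done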

I have not investigated it further, but I think it could be achieved in one step for all active objects with S3 Batch Operations.

For other storage providers like Azure or GCS the S3 API may work; if not, they all have their own CLI tools able to do exactly the same.

  5. Backup restore after a virus/ransomware/rogue user attack.
    Let's say something or somebody deleted/corrupted files in our backup.

This is where things become a little bit tricky for restic, as it does not have any functionality to use the bucket's historical data directly. But we can call rclone to the rescue:

  • create an rclone remote pointing to our S3 bucket
  • mount this remote, specifying the date/time in the past we want to see:
    rclone mount remote:bucket /mount/point --s3-version-at "date/time in the past"

This gives us a mounted (read-only) bucket with the content as it was at "date/time in the past". This works thanks to our bucket keeping all versions of all files for 7 days. Now we can treat it as a normal local repo and restore all our precious data. We can use rustic to restore files directly, or, for restic (as it requires writing lock files and this mount is read-only), we can transfer everything to local storage first.
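
Put together, the restore could look something like this. A sketch only: the mount point, target paths and the date are placeholders, the rclone remote is assumed to be already configured, and repository password options for rustic/restic are omitted.

# read-only mount of the bucket as it was before the attack
rclone mount remote:bucket /mnt/repo \
  --read-only --s3-version-at "date/time in the past" &

# restore directly with rustic (no lock files need to be written)...
rustic -r /mnt/repo restore latest /restore/target

# ...or copy the repo to writeable local storage first and use restic
rclone copy /mnt/repo /tmp/repo-copy
restic -r /tmp/repo-copy restore latest --target /restore/target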


Points 4 and 5 are implemented in Kopia. Unfortunately restic does not support them yet. I think it would be possible to include them, but of course it means some development.

Otherwise the above method should work with any backup program, not only restic.


This comes up every now and then. I personally believe object lock will only be feasible when extending and managing the lock is implemented directly in restic (or rustic).

However, there is an alternative:

  1. Create the following policy and attach it to the backup user:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowEimer",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": "arn:aws:s3:::BUCKETNAME"
    },
    {
      "Sid": "AllowObjekte",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject"
      ],
      "Resource": "arn:aws:s3:::BUCKETNAME/*"
    },
    {
      "Sid": "AllowDeleteLocks",
      "Effect": "Allow",
      "Action": "s3:DeleteObject",
      "Resource": "arn:aws:s3:::BUCKETNAME/locks/*"
    }
  ]
}
  2. Enable versioning in the bucket
  3. Create a lifecycle rule which deletes all non-current versions after 30 days
  4. Create a lifecycle rule which deletes expired delete markers
  5. Create another user with full access. DON'T store those credentials on your normal machines. Store them on a special trusted PC.

Step 3 is optional. See caveat below!
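
For reference, steps 2 to 4 could be set up with the aws cli roughly like this (a sketch; the rule IDs are arbitrary and BUCKETNAME matches the policy above):

# step 2: enable versioning on the bucket
aws s3api put-bucket-versioning \
  --bucket BUCKETNAME \
  --versioning-configuration Status=Enabled

# steps 3 and 4: expire old versions and clean up expired delete markers
aws s3api put-bucket-lifecycle-configuration \
  --bucket BUCKETNAME \
  --lifecycle-configuration '{
    "Rules": [
      {
        "ID": "expire-noncurrent-versions",
        "Status": "Enabled",
        "Filter": {},
        "NoncurrentVersionExpiration": { "NoncurrentDays": 30 }
      },
      {
        "ID": "remove-expired-delete-markers",
        "Status": "Enabled",
        "Filter": {},
        "Expiration": { "ExpiredObjectDeleteMarker": true }
      }
    ]
  }'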

Now you can only append to the backups using the backup user.

However, there is one caveat: An attacker can create new versions with the same name as an existing one containing junk data. This might be annoying but not fatal, because the old version with the restic data will still be available for 30 days (if step 3 has been implemented) or forever (step 3 not implemented).

You should do regular checks of the repository to detect such modifications. (The rights of the user from step 1 are sufficient for restic check. For additional checks you might want to create another user that also has ListBucketVersions enabled and write a script that checks whether there are several versions with the same name. If there are any, something is wrong.)
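
Such a check could be scripted roughly like this (a sketch using the aws cli and jq, run with the user that has ListBucketVersions; BUCKETNAME and the output file name are placeholders):

# list all object versions and print every key that has more than one version
aws s3api list-object-versions --bucket BUCKETNAME \
  --query 'Versions[].Key' --output json |
jq -r 'group_by(.) | map(select(length > 1) | .[0]) | .[]' > suspicious-keys.txt

# restic never rewrites existing files, so any key listed here deserves a closer look
if [ -s suspicious-keys.txt ]; then
  echo "WARNING: keys with multiple versions found:"
  cat suspicious-keys.txt
fi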

If an attacker did create additional versions, you can still remove all of them manually except the very first version which contains the restic data. (However, I don’t know of a tool which does that for you. Either do it manually or create a script.)
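
A possible sketch of such a cleanup for one affected key (aws cli plus jq, run with the full-access user from step 5; BUCKETNAME and the key are placeholders, and it assumes the junk versions are not object-locked):

BUCKET=BUCKETNAME
KEY=<affected_object_key>

# keep the oldest version (the original restic data) and delete every newer one
aws s3api list-object-versions --bucket "$BUCKET" --prefix "$KEY" \
  --query "Versions[?Key=='$KEY'].{Id:VersionId,Date:LastModified}" --output json |
jq -r 'sort_by(.Date) | .[1:] | .[].Id' |
while read -r vid; do
  aws s3api delete-object --bucket "$BUCKET" --key "$KEY" --version-id "$vid"
done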

If you implement step 3, you will have the benefit that the versions of the deleted lock files will disappear after 30 days. Otherwise, you will have to do that manually in a periodic fashion.

Also, you might want to run forget/prune every now and then. I do it about every 3 months from the trusted computer which has full access to the repo. Then too: if you did not implement step 3, you will have to remove the deleted versions yourself. Otherwise, they will be deleted after 30 days.

(Note that when I am saying “version” I mean a versioned object in an S3 bucket)

Personally, I use restic copy for each of my restic s3 repositories to copy their contents back to a linux box in another building every night. That way I will immediately find out if a bucket has been tampered with. I also do restic check on that occasion, check that exactly one backup has been added, etc. Having this local copy has the additional benefit that I do have physical access to the data - and it is not only somewhere in the cloud.
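
The nightly job is essentially just a couple of restic calls. Roughly, as a sketch using the restic 0.14+ copy syntax (repo paths, password files, credentials and the bucket are placeholders):

#!/bin/sh
# credentials for the source S3 repo
export AWS_ACCESS_KEY_ID=<key>
export AWS_SECRET_ACCESS_KEY=<secret>

# copy any new snapshots from the cloud repo into the local one
restic -r /backup/local-copy --password-file /root/.pw-local \
  copy --from-repo s3:https://url/<bucket_name> \
       --from-password-file /root/.pw-s3

# verify the local copy (add --read-data for a full but slow verification)
restic -r /backup/local-copy --password-file /root/.pw-local check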


@flomp that would be overkill… a native implementation would be better and less prone to errors.
Kopia does this automatically with S3; that's why I'm testing Kopia, but Google's S3 gateway doesn't handle all API calls properly (there are some open bugs on Google's side).

Currently I'm trying to add native versioning to Kopia's GCS backend. I've almost finished everything, but there is an issue extending an existing lock, and I'm starting to think there are some issues in the official Go library from Google: the same service account works as expected with the gcloud command, but used via the Kopia/Go libraries it always triggers a 403 Access Denied error. I'm not a Go programmer though, so I can't be 100% sure (only 98% :smile: )

Well, I have been doing it for some time now and it works.

However, quite a lot of scripting is involved (I'm using Python and boto3).
I would really like to see a built-in solution in restic. However, this is not an easy task and might need some client-side state (e.g. so that the expiration timestamp of the object lock is known and does not have to be read back on every run).

After thinking some more about it, it might even be possible to implement that with a separate tool:

  • Configure the bucket so that it adds an initial object lock to all objects (e.g. 7 days)
  • Create a program which scans the bucket regularly and determines which object locks shall be extended and which shall expire.
  • This program should also know which snapshots are to be deleted, so that it can let the object locks on those files expire.

Would be quite some effort, though

Can you help me understand why the object locks need to be extended and the backup user shouldn’t have rw access except for the lock files?

If the bucket has Object Lock, versioning and retention for x days enabled, then it seems the backup user can have rw permissions. If a bad actor (running as the backup user) overwrites or deletes files, can't we use --s3-version-at to access the good backups (as long as it is caught within the x days)?

(I’m more familiar with Minio, so maybe it is a difference in the way it and AWS S3 treat Object Locks.)

Not really. This is my step (4). This program does not have to be aware of any repo structure/snapshots etc. Its only task is to extend existing locks on all active objects.

Some files' locks can expire while they are still part of active backups, and without a lock they are a possible target for an evil actor. With proper locks even the root user can't damage the whole repo.

When you delete an object without a lock, then its version is without a lock too…

How do you do forget/prune?

I run restic forget/prune. My S3 user has permission to delete objects; it does not change anything with regard to repo protection, thanks to lock/versioning. If objects are "deleted" they will live on as versions for the locking period anyway.

If a file that is older than the retention period (say 45 days) is deleted, won’t a version be created and retained for another 45 days? Can the bad actor immediately delete that version?

Well… how are your old versions deleted? Normally a version without a lock will be deleted by the object lifecycle rule, or by a user with permission to delete versions, as it does not have a lock anymore.

A proper solution protects you from pretty much anything but closing your account.

I understand. So it just adds a delete marker for those files.

Interesting idea. So you just don't extend the object lock for files that have delete markers?
Also, you are watching out for unexpected delete markers, I guess…

Nope, I am not :) But I think I understand why you do it: early warning that something is trying to do something bad… However, it covers only one scenario, I think. If, let's say, my data gets encrypted by some ransomware, watching for delete markers is not going to warn me…


So I did some testing and confirmed the issue. The retention date is set when the file is created, so a file that is still in use by a snapshot can be modified or deleted without its version remaining protected, as long as it is old enough (i.e., from a much earlier snapshot).

Ideally, it seems all the active files should have their retention dates updated as part of the backup process. It would be nice if restic handled this during the backup. Any thoughts on opening a GH issue requesting this?

In the meantime, I think all active files can be updated after each backup/prune/forget. There is a Minio command that will do this (mc retention set compliance -r 45d host/buck/files/), and I think AWS S3 "Batch Operations" could also be used for this purpose.

If the retention dates are correctly set, I don’t see any issue with the backup user having rw permissions on all the files in the bucket. Agree?

Correct. These are exactly the same results as my experiments with iDrive S3.

I do not think it would be the right approach to run it with every backup. Some remotes, like AWS Glacier, incur transaction costs, and running it with every backup is IMO too often. If backup runs, for example, every hour, there is no practical need for it at all.
Maybe during a periodic prune? Or a completely new command, restic lock?
What I do: for a bucket with a 100-day lock I run the lock time extension script every week, so in the worst-case scenario I am still protected for 100-7 = 93 days.
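
As a crontab example (the script path, log path and schedule are of course placeholders):

# extend locks on all active objects every Sunday at 03:00
0 3 * * 0  /usr/local/bin/extend-bucket-locks.sh >> /var/log/extend-locks.log 2>&1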

I think using your specific provider's command and a scheduled script is a good enough solution.

Agreed. If you use a compliance lock then there is no need at all to restrict the S3 user's permissions. Even your account root user cannot delete data before such a bucket lock expires.

I have tested deleting all bucket content and then restoring the data using past versions. It works perfectly. For this I use rclone mount --s3-version-at and rustic (as it does not have to write any lock files). It can be done with restic too, but that either requires transferring all repo data to another writeable location, or creating a customised rclone mount where directories are stitched together using the combine and union remotes, with the locks directory residing e.g. on the local filesystem.