Append-only mode with only sftp server (no rclone)

First of all, a big thank you to all the devs and contributors for this pretty amazing piece of software!

For various reasons, I would like to use restic in append-only mode (in case a client gets compromised), with sftp only (no rclone or rest-server).

I have tried to run the sftp server with only the following requests whitelisted:

open,close,read,write,lstat,fstat,opendir,readdir,mkdir,realpath,stat,fsync,fsetstat,rename
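(For reference, this kind of request whitelist can be expressed with OpenSSH's internal-sftp via its `-p` option; the user name and chroot path below are placeholders:)

```
# sshd_config fragment (sketch; user and chroot path are placeholders)
Match User restic-client
    ChrootDirectory /srv/backup
    ForceCommand internal-sftp -p open,close,read,write,lstat,fstat,opendir,readdir,mkdir,realpath,stat,fsync,fsetstat,rename
```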

And then I run the restic commands with --no-lock (I understand that I have not whitelisted the remove request needed to delete the lock file).

I noticed the following:

restic snapshots --no-lock works with no error message

restic backup --no-lock works, but unexpectedly complains about being unable to delete the lock file, with the following message:

    processed 1 files, 0 B in 0:04
    snapshot eaeff1c3 saved
    Remove(<lock/53b3c4dea4>) failed: Remove /home/repo/locks/ea4b7162b7a2a139a2eca867c166f53b701: remove /home/repo/locks/ea4b7162b7a2a139a2eca867c166f53b701: permission denied

restic restore --no-lock works with no error message

On the sftp server side, for these 3 commands, I see a bunch of error messages, all indicating that restic is trying to use non-whitelisted requests.

So if I ignore the error messages, it all seems to work fine.

Can someone explain to me why backup --no-lock still complains about not being able to delete the lock file?

Also, and most importantly, is this setup ok to use, or am I shooting myself in the foot in a subtle way?

Thanks!

Hi :waving_hand:

Could there be a leftover lock file in the repo (under locks folder)?

Still, --no-lock should’ve ignored it :thinking:

I am not used to sftp endpoints, but my two cents is a reminder of the flaw in this kind of setup: write privileges can mean overwriting current data, which means invalidating existing backups. Just something to keep in mind.


Ah yes! Good catch, there were indeed some stale locks due to my testing. After cleaning them with unlock I don’t see this error message anymore.

I agree though that it might still be a bug; backup --no-lock should probably have ignored it.

Yeah, I am a little uncomfortable with this setup. It all works well on the surface, but until someone with deep knowledge of the design can reassure me that it’s a valid setup, it feels too risky to run in production :confused:

gurkan is correct: the ability to write files over sftp includes overwriting existing files. Your packs could be clobbered with nonsense data (or empty files). This means the request whitelisting you’ve done doesn’t really accomplish anything, because an attacker with access to the server could destroy your backups using this mechanism (or even encrypt them with ransomware).

I don’t think you will be able to accomplish your goal with request whitelisting alone. To be truly append-only, the storage backend needs to understand several things:

  • Files in most directories are named after the hash of their contents, so an uploaded file with contents that hash to something else should not be accepted.
  • Files may not be deleted, except for lock files.

I highly doubt you can make an sftp service understand this without patching it yourself.

An alternative is to do a daily “restic-aware sync” of the repository to another system behind the scenes, on the sftp server itself (or a third one). This sync process would make a temporary copy of each file, hash it, and if the hash is correct, sync it to another location for safe-keeping. This way you have two copies of the repo (which you should already have in the 3-2-1 scheme anyway), and – crucially – the extra copy cannot be destroyed by an attacker in this way as the restic-aware sync will not copy files whose contents do not match the hash.
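A minimal sketch of such a restic-aware sync, assuming (per restic's repo layout) that pack files live under data/<2-char prefix>/ and are named after the SHA-256 of their contents; the `sync_packs` helper and all paths are hypothetical:

```shell
#!/usr/bin/env bash
# Sketch: copy only NEW pack files whose SHA-256 matches their file name.
# sync_packs, SRC and DST are hypothetical; error handling is minimal.
sync_packs() {
    src="$1"; dst="$2"
    for f in "$src"/data/*/*; do
        [ -e "$f" ] || continue
        name=$(basename "$f")
        prefix=$(printf '%.2s' "$name")
        [ -e "$dst/data/$prefix/$name" ] && continue   # only new files
        tmp=$(mktemp)
        cp "$f" "$tmp"                                 # copy first, verify the copy
        sum=$(sha256sum "$tmp" | cut -d' ' -f1)
        if [ "$sum" = "$name" ]; then
            mkdir -p "$dst/data/$prefix"
            mv "$tmp" "$dst/data/$prefix/$name"
        else
            echo "hash mismatch, skipping: $f" >&2     # tampered or corrupt
            rm -f "$tmp"
        fi
    done
    return 0
}
```

Hashing the temporary copy, not the original, closes the window where a file could change between the check and the copy.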


I interpret that option as “do not create a lock”; but if there is already one there, I feel it’s okay to at least warn about it.

@MaxBackuper may I ask why you don’t want to or can’t use rest-server? Maybe we can find a way to make it work anyway? That would solve your append-only problem and from my experience, rest-server is very easy to use and reliable.


AFAIU it tries to remove stale(?) locks by default, which is a bit confusing if the user passed --no-lock explicitly. Now that I think about it, it might be good for sanitary purposes :slight_smile: Never mind.


It is indeed a bit confusing for a user who is not familiar with the inner workings of restic. If it’s desirable to keep the cleaning of other locks, maybe clarifying the error message would remove the confusion while keeping the functionality. That’s a nitpick though.

For security purposes, on the backup server, there is a policy of not running any third-party tool that has network access. Maybe I can make the case that the codebase is very small and I can audit it myself. Maybe if I patch it a bit to remove some of the less important dependencies, the auditing surface shrinks enough that I can review it all, take responsibility for it, and move it into the “in-house” category.

But to be honest, I am not familiar with golang, and would prefer to avoid having to maintain a patch across version releases.

At first glance, maybe I could patch out prometheus, cobra and miolini/datacounter. That would probably reduce the direct and indirect dependencies significantly. I will have to think about it… :upside_down_face:


Thanks a lot @cdhowie for all the great points.

Ah right, I didn’t realize that sftp write requests included overwriting existing files. I guess I could have a server-side script that changes ownership and permissions once the files have been added via restic backup from the clients. But I would need to use sticky bits to allow file creation in the restic repo without allowing file deletion. I guess it could get messy fast.

This is a great idea. Let me try to restate it, to make sure I’m not missing anything:

  1. Clients use restic binary over sftp to backup to serverA

  2. This transport method is NOT append-only, so if clients get compromised at some point, the attacker can modify and delete repo files

  3. Every day, serverB copies serverA’s repo over sftp to its own copy. To ensure integrity:

    3a) it checks the hash of each file; if a hash differs from its filename, stop and alert

    3b) serverB copies any new files on serverA’s repo to its own copy

    3c) serverB does not delete any files in that copy process (we can’t know whether deletions are due to the happy path, pruning on serverA, or the unhappy path, an attack on the clients or serverA)

  4. serverB performs its own pruning, completely independent of any prune performed on serverA. That ensures the serverA → serverB copy is append-only. So this protects the copy on serverB from any attacker on clients and serverA

Am I thinking about this correctly?

Mostly, though I would alter some of the steps.

In step 3, I would copy each file (only those you don’t already have!) to a temporary location on serverB and perform the checks on that copy. This way you prevent the case where the checksum is correct, but the file gets modified on serverA between the check and the copy.

I would log a warning, ignore that file, and continue to other files. You still want to sync as much data as possible even if one file is damaged somehow (which may not even be due to an attack).

Correct, though note this means that serverB will likely copy duplicate data from serverA on every sync, since the set of pack files will ultimately start to diverge from each other.


In hindsight, we already have a tool that basically does this work for you: restic copy. It operates at a different conceptual level, so it will read data out of the packs on serverA and construct new packs on serverB probably with different hashes, but not duplicating any data like the pack-file copy approach above would. My suggestion would be to run restic copy on serverB using the remote repository on serverA.
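A sketched invocation (untested; repo paths, user and password files are placeholders; `--from-repo` and `--from-password-file` are the flag names in recent restic versions):

```shell
# Run on serverB: pull snapshots from serverA's repo into serverB's own repo.
restic -r /srv/repoB --password-file /etc/restic/passB \
    copy \
    --from-repo sftp:restic@serverA:/srv/repoA \
    --from-password-file /etc/restic/passA
```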


You will indeed have to prune each repository separately in either approach. This does give you an opportunity to use different retention policies. Maybe you need to keep less history on serverA, but serverB can retain more.

Ah right. So maybe pruning locally is not such a good idea. If pruning divergence means that eventually, when I rsync serverA to serverB, I essentially resend the whole repo, that seems pretty wasteful.

Hmm, ideally I need serverB to keep only basic tooling. It feels like I should be able to achieve the replication goal with rsync only.

Thinking about your comments, what about this workflow:

  1. Clients use restic binary over sftp to backup to serverA

  2. ServerB connects to serverA via sftp and runs rsync --delete --dry-run

  3. It checks the list of files to be sent over.

    3a) If any file would be modified => we know there has been corruption or a compromise. Stop, alert and manually investigate (this is the equivalent of your earlier suggestion to check each hash against its filename)

    3b) If any file would be deleted:

  • 3bi) If we are in the special period when we know serverB has just been pruned, then we just accept the deletion
  • 3bii) If we are not during this special period, we also know there has been corruption or compromise. Stop, alert and manually investigate.

Is that reasonable, or am I missing something?

If steps 3a and 3b are successful, an attacker could still have modified the repository between the time you ran rsync --dry-run and the time you actually perform the sync.

Ideally, you need to operate on a snapshot of the repository that the attacker cannot touch, which probably means syncing the entire repository to a temporary location and then performing your checks on that copy. This means the entire repository has to be copied every time, unless you write your own tool that can copy only new files and then examine those.

A less stringent approach could be to use --ignore-existing in the final copy so that if the attacker does manage to modify anything between the check and the copy, any modified files don’t get copied.

You could still copy broken packs if an attacker does destroy some new packs, though this won’t modify previously-copied data. Because of this, you will probably want to run restic check after the sync anyway, so trying to avoid installing restic on serverB is ill-advised.

That all makes sense.

I can certainly run it with --ignore-existing to avoid the race condition you describe.

I can also run the restic binary on serverB, in a VM that blocks network calls entirely (the policy is no third-party binary with network access). So local-only operations like restic check are doable.

Thanks for your help, that seems like a reasonable setup.