How to back up a server without running restic directly on it?

I have searched and read a lot of threads, but I still cannot figure out how to use restic from a central backup server that pulls in from multiple target servers. Previously, I had a backup server that pulled in files via rsync and used hard-link copies for versioning. This only required rsync on the source server and used minimal CPU time there. Restic provides so much more, but running it on the source machine requires additional disk space (for the cache), CPU time, and sometimes more memory than the VM has to offer.

I’m now trying to back up VMs running in the cloud to B2 (or any cloud file storage). I don’t have an rsync target anymore, and I really don’t want restic credentials stored on those VMs anyway. Basically, I’m trying to offer backups for customer VMs without needing a large intermediate file store for restic to run against, without forcing customers to upgrade their disk or RAM, and so on.

Is there any good way to run restic like this?

  1. A central backup server runs restic.
  2. The backup server accesses the customer VM; the customer VM cannot access the backup server.
  3. The backup server holds the restic credentials and runs the restic commands. Data flows between the file cloud service and the backup server only. The restic cache lives on the backup server (no penalty to the VM).
  4. The customer VM doesn’t know the restic repo credentials and never runs restic at all.

I feel like I’m hunting for a feature where restic can speak the rsync protocol, but all examples I’m finding have restic running on the source machine itself. Is there no way around that?

One potential solution is to use something like sshfs or rclone mount to mount a view of the remote machine on the local system, then back that view up with restic.
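
For example, a minimal sketch of the sshfs variant, with hostnames, bucket, and mount point as placeholders, and with the B2 credentials and repo password assumed to be set in the backup server’s environment:

# on the backup server: mount the customer VM read-only via SSH
sshfs -o ro,reconnect backup@customer-vm:/ /mnt/customer-vm

# back up that view; repo credentials and the restic cache stay on the backup server
restic -r b2:my-bucket:customer-vm backup /mnt/customer-vm

# unmount when done
fusermount -u /mnt/customer-vm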

Another possible solution would be “reverse ssh” as described here. I don’t know how effective that would be. I would go with @cdhowie’s answer, because there is no direct way to do this right now.

I haven’t tried that, but I have considered it. I ran across this bug report and thought there might be a more supported way?

Very slow backup of SSHFS filesystems – need ability to skip inode-based comparison

With reverse ssh, it looks like they are still running restic on the client side, though?

It looks like a fuse mount with the noforget option might work. But it also sounds like that option uses memory that grows linearly with the number of files restic accesses over the mount.
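
If I understand the option correctly, that would just mean adding it to the mount, e.g. (hostname and mount point are placeholders):

sshfs -o ro,noforget,reconnect backup@customer-vm:/ /mnt/customer-vm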

I’ll give it a try later and report back. I expect the backup server will always have enough memory to complete the backup.

Has anyone ever tried using restic on an NFS mount? Those behave more like regular file systems. Maybe that would work better?

I’m running restic backing up content over an NFS mount (to another NFS mount on the same NAS). It isn’t as fast as local disk, but it works well enough.

I also use restic to back up files from a NAS over an NFS mount, and it works well. In my case, I’m backing up to an S3 repo and my network connection is the bottleneck, so I don’t even notice a decrease in speed.
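
In the pull scenario from this thread, that would look roughly like this, assuming the source host exports the relevant directories via NFS (export path, hostnames, and repo are placeholders):

# on the backup server: mount the export read-only
mount -t nfs -o ro customer-vm:/srv/data /mnt/customer-vm
restic -r b2:my-bucket:customer-vm backup /mnt/customer-vm
umount /mnt/customer-vm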

I haven’t tried it myself. I just read about it a while ago and thought it could be useful in your case.

As other users said, restic works with an NFS mount too. Personally, I use Gigolo and mount via sftp. It is pretty fast: a 5 GiB backup takes 17 min. The server has 8 GiB of RAM, and restic uses 103 MiB. Of course, it will consume more time and RAM with more data, but you won’t need any extra steps on the VMs; just set up the SSH keys and that’s it. No restic or additional processes on the VMs.

In this particular test, the backup directory has a lot of small files and a couple of big files. Those big files are .iso files, and they are what “slowed down” restic a little bit. This is the output:

open repository
repository b9349ed1 opened successfully, password is correct

Files:        2043 new,     0 changed,     0 unmodified
Dirs:            0 new,     0 changed,     0 unmodified
Added to the repo: 5.004 GiB

processed 2043 files, 5.307 GiB in 16:59
snapshot e4bada97 saved

I’m surprised that this is not the default use case, especially when talking about backing up servers.
The reasons are mainly:

  1. centralized backup of multiple remote targets
  2. security: if a public-facing host gets compromised, the existing backups are safely stored on the backup server; if, however, a host pushes its own backups, it basically has full access to the backup server and could read and/or destroy the backups
  3. separation of concerns: just as it’s a monitoring server’s job to monitor many hosts, it’s a backup server’s job to create backups; whether backups are created should not depend on a single host

There are a number of ways to set up your backup infrastructure such that this isn’t a problem. You can run rest-server in append-only mode for your backup repositories, or you can store your repositories on a server whose filesystem takes regular (preferably zero-cost) snapshots, regardless of which backend you use to back up to this repository server.
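
A rough sketch of the rest-server variant (paths, port, and repo name are placeholders; authentication setup is omitted):

# on the repository server
rest-server --path /srv/restic-repos --append-only --listen :8000

# on a host pushing backups: it can add new snapshots,
# but cannot delete or overwrite existing repository data
restic -r rest:http://repo-server:8000/web1 backup /etc /var/www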

Sorry it’s not the default use case that you want. I think restic was primarily built with another use case in mind when it was started, and even though many people are backing up servers with it, there are a lot of folks backing up just a single computer or so, and for those people it wouldn’t make sense to have the suggested use case be the default.

Regardless, I suppose it doesn’t matter which use case is the default; what matters is what you can do to meet yours. In your case, you can export the data/directories you want to back up so that your restic backup server can access them and do the backups the way you want.

Regardless of which solution you use for centralized backups, you’ll have to prepare the source hosts one way or the other, so this is nothing new. However, once you get into backing up servers that run more complex software, such as databases that need filesystem quiescing or other application-aware handling, things become more complex, and you’ll probably need some sort of agent deployed on the source host. So it’s not straightforward with any backup software to meet the use case you talk about.

I want to add to “1.” that it usually makes sense to queue multiple servers so jobs don’t overlap. That is the main reason I usually have a central script on the backup server that executes ssh user@host "restic..." on each server, one after the other, and, when done, runs forget and prune on weekends. For restic repo access I now always use rest-server in append-only mode, and separate repos where that makes sense with regard to deduplication.
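
A rough sketch of such a wrapper, with hostnames, repo URLs, paths, and retention policy as placeholders (repo credentials and passwords are omitted):

#!/bin/sh
# back up the hosts one after the other so jobs never overlap
for host in web1 db1 mail1; do
    ssh root@"$host" "restic -r rest:https://repo-server:8000/$host backup /etc /srv"
done

# on weekends, apply the retention policy from the backup server
# (this needs repo access that is not append-only)
if [ "$(date +%u)" -ge 6 ]; then
    for host in web1 db1 mail1; do
        restic -r "rest:https://repo-server:8000/$host" forget --keep-daily 7 --keep-weekly 5 --prune
    done
fi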

I think it’s unlikely that restic will support this as a first-class feature because, frankly, it’s out of scope.

Speaking generically, a backup tool is given a source and a destination and it makes sure the source exists in the destination in some useful way. You might say that tar and cp fit this description and could be classified as backup tools. To that, I say: yes, I agree. They are (in a way) backup tools. Crude, perhaps, but workable.

Gaining access to a remote system’s content is a different task better suited for a different tool, like NFS, sftp, or rclone. You can combine these with a backup tool to achieve what you want.

I do think it might make sense to allow specifying an rclone remote as a source, only because this would make the integration substantially simpler (rclone mount would no longer be necessary). Anything beyond that is making restic responsible for too much, in my opinion.
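
For comparison, the current mount-based workaround looks roughly like this (remote name, mount point, and repo are placeholders):

# expose the rclone remote as a read-only local directory
rclone mount customer-vm-sftp:/ /mnt/customer-vm --read-only --daemon

restic -r b2:my-bucket:customer-vm backup /mnt/customer-vm

fusermount -u /mnt/customer-vm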

Put another way, restic is not designed to be a complete enterprise backup solution, but you could certainly use it as part of one.

Also: https://en.wikipedia.org/wiki/Unix_philosophy#Do_One_Thing_and_Do_It_Well

I took the liberty of marking my reply as the solution because from what I can tell it’s the closest to actual suggestions on how to back up a system from another system (without running restic on the to-be-backed-up system). Feel perfectly free to change this if you wish!