Central Management Server

Den · December 25, 2018, 5:55pm

Is the following setup possible with restic?

Central backup server where all clients will back up to
Each client has their own key and also the central server adds its own key to each repo
Each client backs up at specific interval
But the central server manages a weekly purge and restic check process on each of the clients’ repos, which is possible because the central server has a key

matt · December 25, 2018, 7:56pm

Yes; a restic repo can be accessed through multiple keys (but they don’t have any permission management – your server would have to do that – and keys aren’t hierarchical).
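To illustrate the multi-key setup, here is a minimal sketch of adding a second key for the central server to one client's repository. The function name `add_server_key`, the `RESTIC` override variable, and all paths are hypothetical. `key add` and `--password-file` are existing restic options; `--new-password-file` is assumed to be available in the installed restic version and may be missing in older releases.

```shell
#!/usr/bin/env bash
# Sketch: add a second key (for the central server) to an existing repository.
# ASSUMPTIONS: the function name, the RESTIC override, and all paths are
# hypothetical; --new-password-file may be missing in older restic releases.

RESTIC="${RESTIC:-restic}"          # override to test with a stub

add_server_key() {
    local repo=$1 existing_pw_file=$2 server_pw_file=$3
    # Opening the repository requires an existing key, so this has to run
    # with access to a passphrase that already unlocks the repository.
    "$RESTIC" -r "$repo" --password-file "$existing_pw_file" \
        key add --new-password-file "$server_pw_file"
}

# Example call (placeholder paths):
# add_server_key /srv/restic/client1 /root/client1.pw /root/server.pw
```

Since keys carry no permissions, the new key can do everything the client's key can; any restriction has to come from how the repository is served.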

Dj0k3 · December 26, 2018, 3:18pm

Yes, and it is pretty simple. Because keys carry no permissions, every key works essentially the same way; there is no master or limited-access key. However, you could serve the repo to clients through a REST server in append-only mode, so that the server is the only one that can actually delete data.

cdhowie · December 26, 2018, 7:20pm

"Central backup server where all clients will back up to": Yes, just use the restic REST server.

"Each client has their own key and also the central server adds its own key to each repo": Yes, this is possible. Note that the server would have to add a key to each repository. The passphrase can be the same, but the key would be different.

"Each client backs up at specific interval": The server cannot enforce this or otherwise command clients to back up.¹ The clients would have to be configured to do this, for example using cron on *nixes or the task scheduler on Windows.

"But the central server manages a weekly purge and restic check process on each of the clients’ repos": Yes, the server can do this since it has a key in each repository. Note that the prune/check schedule needs to be coordinated with the clients’ backup schedule, since prune and check both require an exclusive lock on the repository. If the server has the repository locked when a backup starts, the backup will fail. (Hmm, @fd0, would it be possible for restic backup to retry locking with backoff so that backups are delayed instead of aborted?)

¹ Technically, this can be done, but it’s not supported as a native restic feature. For example, you could have the backup server command a client to run a backup script over SSH.
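The weekly server-side maintenance described above could be sketched roughly as follows. This assumes every client repository is a subdirectory of one base directory and that the server's key in each repository uses the passphrase stored in a single password file. The function name `maintain_repos`, the directory layout, the `RESTIC` override variable, and the retention value are illustrative assumptions; `forget --prune`, `check`, `--keep-weekly`, and `--password-file` are existing restic options.

```shell
#!/usr/bin/env bash
# Sketch: run forget/prune and check on every repository below a base directory.
# ASSUMED layout (hypothetical): <base_dir>/<client-name>/ is one repository.

RESTIC="${RESTIC:-restic}"          # override to test with a stub

maintain_repos() {
    local base_dir=$1 password_file=$2 keep_weekly=${3:-8}
    local repo failed=0

    for repo in "$base_dir"/*/; do
        [[ -d $repo ]] || continue          # glob matched nothing
        repo=${repo%/}                      # drop the trailing slash
        echo "Maintaining $repo"
        # prune and check need an exclusive lock, so schedule this outside
        # the clients' backup windows.
        if ! "$RESTIC" -r "$repo" --password-file "$password_file" \
                forget --prune --keep-weekly "$keep_weekly"; then
            echo "forget/prune failed for $repo" >&2
            failed=1
            continue                        # skip check for this repository
        fi
        if ! "$RESTIC" -r "$repo" --password-file "$password_file" check; then
            echo "check failed for $repo" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Example call from a weekly cron job (placeholder paths):
# maintain_repos /srv/restic /root/server.pw 8
```

A failure in one repository does not stop the loop; the function returns non-zero at the end if any repository failed, so a scheduler can still report the run as failed.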

westphaldp · December 26, 2018, 8:21pm

To deal with this, I’ve been using loops like the following:

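LOCK_WAIT=1800    # total time (seconds) to allow for the operation to start
LOCK_DELAY=60     # delay (seconds) between attempts

# SECONDS is bash's built-in count of seconds since the shell started.
EXPIRE=$(( SECONDS + LOCK_WAIT ))
# Retry until restic succeeds or the deadline passes. eval re-expands any
# variables embedded in RESTIC_ARGS and RESTIC_FORGET_ARGS.
until [[ $EXPIRE -le $SECONDS ]] || eval "restic forget $RESTIC_ARGS $RESTIC_FORGET_ARGS"; do
        echo "Waiting for lock."
        sleep "$LOCK_DELAY"
done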

Granted, this retries on any failure that causes restic to return a non-zero exit status, and it does not provide a backoff, but it has been working well for me.
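For comparison, a variant of the loop above with exponential backoff might look like the sketch below. The function name `retry_with_backoff` and its parameters are hypothetical. Like the original loop, it retries on any non-zero exit status, not only on lock conflicts. It passes the command as separate words instead of using eval, so it assumes the arguments need no further expansion.

```shell
#!/usr/bin/env bash
# Sketch: retry a command with exponential backoff until a deadline passes.
# Usage: retry_with_backoff TOTAL_WAIT FIRST_DELAY MAX_DELAY command [args...]
# All times are in whole seconds.

retry_with_backoff() {
    local total_wait=$1 delay=$2 max_delay=$3
    shift 3
    local deadline=$(( SECONDS + total_wait ))

    until "$@"; do
        # Give up if the next sleep would end after the deadline.
        if (( SECONDS + delay > deadline )); then
            echo "Giving up after ${total_wait}s: $*" >&2
            return 1
        fi
        echo "Command failed; retrying in ${delay}s." >&2
        sleep "$delay"
        # Double the delay, capped at max_delay.
        delay=$(( delay * 2 ))
        (( delay > max_delay )) && delay=$max_delay
    done
    return 0
}

# Example call (the repository path is a placeholder):
# retry_with_backoff 1800 15 300 restic -r /srv/restic/client1 forget --keep-weekly 8 --prune
```

With the example values the waits would be 15, 30, 60, 120, 240, and then 300 seconds each until the 1800-second budget is used up.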

cdhowie · December 26, 2018, 8:50pm

@westphaldp Unless I’m missing something, the eval in that script is redundant.

westphaldp · December 26, 2018, 9:29pm

@cdhowie I didn’t include what RESTIC_ARGS and RESTIC_FORGET_ARGS are defined as. They may include variables that can’t be evaluated until this point. eval is handling that.

Den · December 26, 2018, 9:50pm

Very interesting. Thanks everyone for your replies!

@westphaldp So is there a way I could modify the script to only retry restic backup if it returns a specific error code indicating that the lock is in use?

westphaldp · December 26, 2018, 10:04pm

From a shell script, it is possible if restic returns clearly defined exit codes. I’m not sure what the state of that is. @cdhowie, do you know what restic's exit codes look like?

Otherwise, it would require processing the text output of restic to determine why it failed.
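A rough sketch of that text-processing approach follows. It assumes restic's error message contains the phrase "already locked" when another process holds the lock; that wording is an assumption and may differ between versions, so the pattern is kept in a variable. The function names, the `LOCK_PATTERN` and `LOCKED_STATUS` variables, and the marker value 75 are all hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: retry only when a failure looks like a lock conflict.
# ASSUMPTION: restic's error text contains "already locked" when another
# process holds the lock; the exact wording may differ between versions.

LOCK_PATTERN="${LOCK_PATTERN:-already locked}"
LOCKED_STATUS=75                    # arbitrary marker meaning "locked"

# Runs the given command. Returns 0 on success, LOCKED_STATUS if it failed
# and its stderr matched LOCK_PATTERN, and the command's own status otherwise.
run_detect_lock() {
    local err_file status=0
    err_file=$(mktemp) || return 1
    "$@" 2> "$err_file" || status=$?
    cat "$err_file" >&2             # pass messages through once the command ends
    if (( status != 0 )) && grep -qi -- "$LOCK_PATTERN" "$err_file"; then
        status=$LOCKED_STATUS
    fi
    rm -f "$err_file"
    return "$status"
}

# Usage: retry_while_locked TOTAL_WAIT DELAY command [args...]
# Retries only on lock conflicts; any other failure is returned immediately.
retry_while_locked() {
    local total_wait=$1 delay=$2
    shift 2
    local deadline=$(( SECONDS + total_wait )) status
    while true; do
        status=0
        run_detect_lock "$@" || status=$?
        (( status == LOCKED_STATUS )) || return "$status"
        (( SECONDS >= deadline )) && return "$LOCKED_STATUS"
        echo "Waiting for lock." >&2
        sleep "$delay"
    done
}

# Example call (the paths are placeholders):
# retry_while_locked 1800 60 restic -r /srv/restic/client1 backup /home
```

Matching on message text is fragile: a change in wording silently turns lock conflicts into ordinary failures, and a command that itself exits with status 75 would be mistaken for a lock conflict.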

cdhowie · December 26, 2018, 11:22pm

I would guess based on this open issue that right now you can’t tell failures apart:

github.com/restic/restic

Return different exit codes for different failures

opened 11:55AM - 10 May 17 UTC

unclesamwk

category: user interface type: feature suggestion

Hey, another request error/exit code handling. Hey, nice feature is if restic abort answer with an error code more than 0 or 1.

rsync example:

0 Success
1 Syntax or usage error
2 Protocol incompatibility
3 Errors selecting input/output files, dirs
4 Requested action not supported: an attempt was made to manipulate 64-bit files on a platform that cannot support them; or an option was specified that is supported by the client and not by the server.
5 Error starting client-server protocol
6 Daemon unable to append to log-file
10 Error in socket I/O
11 Error in file I/O
12 Error in rsync protocol data stream
13 Errors with program diagnostics
14 Error in IPC code
20 Received SIGUSR1 or SIGINT
21 Some error returned by waitpid()
22 Error allocating core memory buffers
23 Partial transfer due to error
24 Partial transfer due to vanished source files
25 The --max-delete limit stopped deletions
30 Timeout in data send/receive
35 Timeout waiting for daemon connection

For scripting restic its needed. thanks greetings sam

Den · December 29, 2018, 1:55pm

@westphaldp Did you use LOCK_WAIT=1800 in your script because restic considers a lock to be stale after 30 minutes?

westphaldp · January 2, 2019, 4:24am

@Den I wasn’t aware of any lock timeout, nor have I noticed that behavior on my systems. For my timeout, 1800 seconds just seemed like a good place to start. It can vary depending on system/repository.