Restic CLI change stopped backups and the importance of checking your backups regularly

tomwaldnz · March 13, 2023, 2:05am

I’m going to start by saying this is not a problem with restic. I was using restic incorrectly, and I wasn’t monitoring my backup script output or checking the backup repository. It’s really just a heads up for others.

I discovered today that backups for a couple of my servers haven’t run successfully for about two months. Around that time I likely upgraded to Restic 0.15, which I suspect parses the command line slightly differently from the previous versions of Restic.

Previously this command worked fine.

restic --repo s3:s3.amazonaws.com/s3-bucket-name --no-scan backup /var/www

With v0.15, running that gives the error ‘unknown command “/var/www” for “restic”’. When you put the “backup” command at the start of the line, or even just before the “--no-scan”, it works fine.

restic backup --repo s3:s3.amazonaws.com/s3-bucket-name --no-scan /var/www

The docs say to do it the second way, so it’s a PEBKAC error, but I wonder if anyone else was caught out by this. Because it’s a CLI tool and I don’t monitor the output of the script, any failures are silent.
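One way to stop failures like this from being silent is to have the backup script check the exit status of each command it runs. The following is only a minimal sketch, not part of my original script: the function name run_checked and the LOG_FILE path are hypothetical, and it assumes nothing more than that the command exits non-zero when it fails.

```shell
#!/bin/sh
# Hypothetical log location (an assumption for this sketch); override via the environment.
LOG_FILE="${LOG_FILE:-/tmp/backup-wrapper.log}"

# run_checked CMD [ARGS...]
# Runs the command, records its exit status in LOG_FILE, and prints the
# captured output to stderr if the command failed.
run_checked() {
    # Capture stdout and stderr so the error text is not lost.
    if output=$("$@" 2>&1); then
        status=0
    else
        status=$?
    fi
    printf '%s exit=%s cmd=%s\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$status" "$*" >> "$LOG_FILE"
    if [ "$status" -ne 0 ]; then
        printf 'BACKUP FAILED (exit %s): %s\n%s\n' "$status" "$*" "$output" >&2
    fi
    return "$status"
}
```

In a cron script this could be called as `run_checked restic backup --repo s3:s3.amazonaws.com/s3-bucket-name --no-scan /var/www || exit 1`, so that a non-zero exit ends the script with an error that cron or another scheduler can report.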

I do restore tests every six months, but I should probably do them more regularly - even though these personal servers aren’t critical.

fd0 · March 13, 2023, 6:41am

Hey, thanks for the heads up! We’re trying to keep the command line as compatible as possible when doing releases, so what happened to you should not happen…

I doubt that this command line worked before. The --no-scan option is specific to the backup command and was only introduced in the 0.15.0 release.

Is it maybe possible that somebody saw that in the release changelog and added the option at the wrong place? I’m only suggesting this hypothesis because a) it’s likely and b) it totally could have happened to me

tomwaldnz · March 13, 2023, 7:15am

That’ll be it! I probably half read the documentation and added “--no-scan” after the 0.15.0 release without testing it properly.

It’s still a good reminder to test your backups

damoclark · March 30, 2023, 1:32am

Thanks for sharing your cautionary tale. Always a good reminder to monitor backups.

I have been integrating resticprofile into my restic backup setup across a number of hosts. It’s a very good companion to restic.

Apart from providing declarative config files and scheduling, it also has a really neat feature called status-file. For each profile that performs tasks (e.g. backups, forgets, checks, prunes), it loads this JSON-based status file and merges in the result of the current action with a timestamp. The file therefore holds the current state of all actions performed by resticprofile in one place. This makes the file easy to monitor: you can see not only the outcome of the most recent action of each type, but also when each profile action last ran. In other words, you can also detect when backups aren’t being run at all.

I have coupled this with resticprofile’s send-finally option which sends the status file to a centralised monitoring server. Then at a fixed interval, the server scans all the status files of all my hosts and detects: a) errors or failures for profile activities; and b) activities that haven’t run for a specified period.
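As a rough illustration of check (b) only, here is a minimal sketch of a staleness scan. It is not my actual server code: the function names and the one-.json-file-per-host layout are hypothetical, and it uses each file's modification time as a stand-in for the timestamps inside the status file, which it does not parse. It assumes a find that supports -mmin (GNU and BSD find do).

```shell
#!/bin/sh
# is_stale FILE MAX_AGE_MINUTES
# Returns 0 (true) when FILE is missing or was last modified more than
# MAX_AGE_MINUTES ago, and non-zero otherwise.
is_stale() {
    file=$1
    max_age=$2
    # A missing status file counts as stale: the host never reported in.
    [ -e "$file" ] || return 0
    # find prints the path only when its mtime is older than max_age minutes.
    [ -n "$(find "$file" -mmin +"$max_age" -print 2>/dev/null)" ]
}

# report_stale DIR MAX_AGE_MINUTES
# Prints one line per stale *.json file in DIR (hypothetical layout: one
# status file per host). Returns 1 if any stale file was found, else 0.
report_stale() {
    dir=$1
    max_age=$2
    found=0
    for f in "$dir"/*.json; do
        [ -e "$f" ] || continue   # the glob matched nothing
        if is_stale "$f" "$max_age"; then
            printf 'STALE: %s\n' "$f"
            found=1
        fi
    done
    return "$found"
}
```

For a two-day threshold, the server could run something like `report_stale /path/to/status-files 2880` at a fixed interval (the directory is a placeholder). Check (a), errors recorded inside the status files, would need a JSON parser and is not covered by this sketch.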

I’ve tested my setup and it works really well. Within 2 days it detected an edge IoT device that had become disconnected from the network and hadn’t performed its backup. Just sharing with the community.