Wrapper scripts and error notification

I’m interested in running restic on a bunch of servers for backing up local data (to B2 or similar). I’ve found a few example wrapper scripts, but none of them have a very robust reporting mechanism (they tend to just send output to stdout and depend on system email).

I’m interested in how others deal with backup logs and how you get notified if an error has occurred. Ideally I’d like logs to be centrally collected and get Slack notifications if there is a failure that needs attention. Has anyone already done this?

We run restic in silent mode, which outputs nothing if there were no errors. If there is any output, cron emails it to a central mailbox that all sysadmins read.
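For anyone who wants the same "silent unless something went wrong" behavior from a wrapper, here is a minimal sketch in Python. It assumes restic's `--quiet` flag and that the repository and password come from the environment (for example `RESTIC_REPOSITORY` and `RESTIC_PASSWORD_FILE`). The path `/srv/data` and the function names are made up for illustration and are not taken from our actual setup.

```python
import subprocess
import sys


def run_quietly(cmd):
    """Run cmd and return (exit_code, report).

    The report is empty only when the command exits 0 and prints nothing,
    so a cron job that prints the report stays silent on success.
    """
    try:
        proc = subprocess.run(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
        )
    except FileNotFoundError as exc:
        # A missing binary is a failure worth an email, too.
        return 127, f"could not start {cmd!r}: {exc}"
    output = proc.stdout.strip()
    if proc.returncode == 0 and not output:
        return 0, ""
    return proc.returncode, f"{cmd!r} exited with status {proc.returncode}\n{output}"


def main():
    # Hypothetical backup path; adjust to your own data directory.
    code, report = run_quietly(["restic", "backup", "--quiet", "/srv/data"])
    if report:
        print(report)  # cron mails anything written to stdout
    return code

# In the deployed script, finish with: sys.exit(main())
```

Anything the script prints ends up in the cron mail, and the non-zero exit code is kept so other tooling can still see the failure.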

Using system email means that we can rely on the retry behavior of the system-local MTA to make sure the mail gets delivered even if the first attempt fails because of an intermittent network problem. If you want, you can also have the email go to a Slack channel, provided you are a paid Slack user.

We use Rocket.Chat and have also needed to react to emails by sending a message to a channel. In our case we use AWS SES to process incoming messages. SES stores each message in an S3 bucket and then invokes a Lambda function. The function reads the message out of S3 and posts it to Rocket.Chat via an incoming HTTP integration. S3 lifecycle policies delete the messages after a day. Unless you expect to process tens of thousands of messages a month this way, the AWS usage should cost pennies a month.
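A rough sketch of what such a Lambda function could look like, in case it helps. This is not our production code. It assumes the SES receipt rule stores each message in S3 under its SES message ID with no key prefix, that the webhook accepts a JSON body with a `text` field, and that `MAIL_BUCKET` and `WEBHOOK_URL` are environment variables you define yourself (both names are hypothetical).

```python
import email
import json
import os
import urllib.request
from email import policy


def summarize(raw_message, max_chars=500):
    """Turn raw RFC 822 bytes into a short chat message."""
    msg = email.message_from_bytes(raw_message, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content().strip() if body is not None else ""
    return f"*{msg['Subject']}* (from {msg['From']})\n{text[:max_chars]}"


def handler(event, context):
    """Lambda entry point, invoked by the SES receipt rule."""
    import boto3  # provided by the Lambda Python runtime

    # Assumption: the object key equals the SES message ID (no prefix).
    message_id = event["Records"][0]["ses"]["mail"]["messageId"]
    obj = boto3.client("s3").get_object(
        Bucket=os.environ["MAIL_BUCKET"], Key=message_id
    )
    payload = json.dumps({"text": summarize(obj["Body"].read())}).encode()
    request = urllib.request.Request(
        os.environ["WEBHOOK_URL"],
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Keeping the parsing in `summarize` makes it easy to test locally with a saved message, without touching AWS at all.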

I use a wrapper script to send the status and output to Sensu for monitoring. Since Sensu is a distributed monitoring solution, I can set a TTL on the backup so that not getting another success message within a certain amount of time also triggers an automatic alert.
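Roughly, the idea looks like the sketch below. It assumes a Sensu Go agent with its events API listening on `127.0.0.1:3031` (a Sensu Core setup would use the client socket instead), and the check name `restic-backup` and the 25-hour TTL are illustrative values for a daily backup, not necessarily what I run.

```python
import json
import subprocess
import urllib.request

# Assumed address of the local Sensu Go agent events API.
AGENT_EVENTS_URL = "http://127.0.0.1:3031/events"


def build_event(name, exit_code, output, ttl_seconds):
    """Build a check event; status 0 is OK, 2 is CRITICAL."""
    return {
        "check": {
            "metadata": {"name": name},
            "status": 0 if exit_code == 0 else 2,
            "output": output[-4000:],  # keep only the tail of long logs
            "ttl": ttl_seconds,  # alert if no newer event arrives in time
        }
    }


def report(event, url=AGENT_EVENTS_URL):
    """POST the event to the agent and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(event).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def backup_and_report(cmd, name="restic-backup", ttl_seconds=90000):
    """Run the backup command and send its result to Sensu."""
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return report(build_event(name, proc.returncode, proc.stdout, ttl_seconds))
```

With a TTL a bit longer than the backup interval, a host that silently stops running backups raises an alert just like a failed run does.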