Monitoring the freshness of backups


I’d like to monitor the freshness of my backups, i.e. have Icinga go red if host X has not created a backup in the last 24 hours.
I could use restic snapshots --json and get the information from there, but I’ve been wondering if anyone has a ready-to-use integration already?


No, we don’t have such a feature (yet).

Since this question came up several times in the last couple of days, what would be the requirements for such a thing? For restic, multiple clients can save backups into a single repo, and there may even be different backups (e.g. different directories) for a single client. What would be the semantics and the output of such a command?

So obnam has a nagios-last-backup-age option, which checks the age of the latest backup and produces Nagios-compatible output saying whether the backup is OK, WARNING or CRITICAL (the option accepts two parameters that define when a backup is old enough for a warning, and when it’s critical).

For restic, I could imagine something like restic snapshot-age [--host=HOSTNAME] [--critical-age=AGE] [--warning-age=AGE] DIRECTORY. The output (for me) would be as expected by Nagios/Icinga (basically a human-readable string plus the correct exit code of 0/1/2/3).
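
To make the proposed semantics concrete, here is a minimal shell sketch of the threshold logic such a check would need. The function name check_age and its argument order are made up for illustration and are not part of restic; only the exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) follow the Nagios/Icinga plugin convention.

```shell
#!/bin/sh
# check_age AGE WARN CRIT   (hypothetical helper, all values in seconds)
# Prints a one-line status and returns the matching Nagios/Icinga exit code.
check_age() {
    age=$1
    warn=$2
    crit=$3

    # Anything that is not a non-negative integer counts as UNKNOWN.
    for n in "$age" "$warn" "$crit"; do
        case "$n" in
            ''|*[!0-9]*)
                echo "UNKNOWN - invalid argument: '$n'"
                return 3
                ;;
        esac
    done

    if [ "$age" -ge "$crit" ]; then
        echo "CRITICAL - last backup is ${age}s old"
        return 2
    elif [ "$age" -ge "$warn" ]; then
        echo "WARNING - last backup is ${age}s old"
        return 1
    fi

    echo "OK - last backup is ${age}s old"
    return 0
}
```

For example, check_age 3600 86400 172800 would report OK for a one-hour-old backup with a warning threshold of one day and a critical threshold of two days.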

However, I am not even really sure whether this belongs in restic core or whether it should live as a Perl/whatever script in a contrib area.

Whatever features are implemented should IMO be implemented in a neutral way, not tied to one particular external application. As you described it here, it’s very specific to Nagios. Better to make such a feature generic, with output formatted in a way that makes sense for, and is coherent with, restic overall.

I’ve not used Icinga, so this may not be useful, but I’ve handled this by backing up with a shell script that then does a curl request to my Sensu API. One of the things Sensu has is a TTL for the check, so I get an alert when the TTL expires (i.e. because cron did not run my backup, or my backup failed).

I think the neutral way is already covered by restic snapshots. This is also what restic-tools uses for its monitor command:

I sat down and wrote a simple check using Perl and Monitoring::Plugin:

Sorry for bumping this old thread :wink:

For a monitor shell script I tried the following approach using the popular jq tool:

# restic snapshots --json --path /srv/data | jq -r '.[-1]'
{
  "time": "2019-03-08T18:00:09.792955797Z",
  "parent": "ee047edc6dbfc50dd9b179d471f48079d4a4f0dd31bf60658766b6f9bed9e012",
  "tree": "8653335ec23446702c2f18531e9fbdb4d720b40378c0f88f7a995facb59814f9",
  "paths": [
    "/srv/data"
  ],
  "hostname": "data01",
  "id": "09a799501eb2aa3407c0c85055756275508f9c087222178e33168ea9a05f5b07",
  "short_id": "09a79950"
}

One pitfall is the fractional seconds in the ISO 8601 timestamps, which jq still cannot parse (issue):

# restic snapshots --json --path /srv/data | jq -r '.[-1].time|fromdate'
jq: error (at <stdin>:1): date "2019-03-08T18:00:09.792955797Z" does not match format "%Y-%m-%dT%H:%M:%SZ"

The following seems to work:

# restic snapshots --json --path /srv/data | jq -r '.[-1].time|strptime("%Y-%m-%dT%H:%M:%S.%Z")|mktime'

One can now use the timestamp for further checks in their shell script.
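
As an alternative that sidesteps the jq date parsing altogether, the raw time string can be handed to date(1). The sketch below assumes GNU date (its -d option accepts ISO 8601 timestamps with fractional seconds and with either Z or a numeric UTC offset; BSD/macOS date does not work this way). The helper names iso_to_epoch and latest_snapshot_age are made up for illustration, and the newest snapshot is assumed to be the last array element, as in the '.[-1]' examples above.

```shell
#!/bin/sh
# Convert an ISO 8601 timestamp to seconds since the epoch.
# Assumption: GNU date; fractional seconds are simply truncated by +%s.
iso_to_epoch() {
    date -u -d "$1" +%s
}

# Print the age in seconds of the newest snapshot for the given path.
# Returns 3 (UNKNOWN in Nagios terms) if no timestamp could be obtained.
latest_snapshot_age() {
    ts=$(restic snapshots --json --path "$1" | jq -r '.[-1].time') || return 3

    # jq prints "null" when the snapshot list is empty.
    if [ -z "$ts" ] || [ "$ts" = "null" ]; then
        return 3
    fi

    snap=$(iso_to_epoch "$ts") || return 3
    echo $(( $(date +%s) - snap ))
}
```

A cron job or monitoring plugin could then compare the output of latest_snapshot_age /srv/data against its own warning and critical thresholds.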

I really like

What is considered the best-practice approach for monitoring restic backups?
Calling the snapshot list on the CLI and post-processing the returned JSON?

I just scan the snapshots/ folder for files and get the most recent timestamp. It’s good enough for me.
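
A rough sketch of that idea, assuming a repository on a local filesystem and GNU find (for -printf); the function name newest_snapshot_mtime is made up for illustration. Note that it reports when the newest snapshot file was written to the repository, which is only an approximation of the snapshot time that restic records.

```shell
#!/bin/sh
# Print the modification time (seconds since the epoch) of the newest file
# below REPO/snapshots, or nothing if the directory is empty or missing.
# Assumption: GNU find, whose %T@ prints the mtime with a fractional part.
newest_snapshot_mtime() {
    find "$1/snapshots" -type f -printf '%T@\n' 2>/dev/null |
        sort -n |        # oldest first
        tail -n 1 |      # keep the newest entry
        cut -d. -f1      # drop the fractional seconds
}
```

The age is then $(( $(date +%s) - $(newest_snapshot_mtime /path/to/repo) )), where /path/to/repo is a placeholder for the repository location.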

My backups are initiated via cron and consist of a backup followed by a listing of the changes between the last two snapshots; the output is mailed to me. Normally I don’t have a lot of daily changes, but the listing of changes gives me a chance to see if something was added that shouldn’t have been and, more importantly, if something was deleted that shouldn’t have been.