Monitoring the freshness of backups



I’d like to monitor the freshness of my backups, i.e. have Icinga turn red if host X has not created a backup in the last 24 hours.
I could use restic snapshots --json and get the information from there, but I’ve been wondering if anyone has a ready-to-use integration already?




No, we don’t have such a feature (yet).

Since this question came up several times in the last couple of days, what would be the requirements for such a thing? For restic, multiple clients can save backups into a single repo, and there may even be different backups (e.g. different directories) for a single client. What would be the semantics and the output of such a command?



So obnam has a nagios-last-backup-age option, which checks the age of the backup and produces Nagios-compatible output saying whether the backup is OK, WARNING or CRITICAL (the option accepts two parameters that define when a backup is old enough for a warning, and when it’s critical).

For restic, I could imagine something like restic snapshot-age [--host=HOSTNAME] [--critical-age=AGE] [--warning-age=AGE] DIRECTORY. The output (for me) would be what Nagios/Icinga expects (basically a human-readable status string plus the correct exit code of 0/1/2/3).
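To make the 0/1/2/3 convention concrete, here is a minimal shell sketch of the threshold logic such a check would need. The function name check_age, its argument order, and the message wording are made up for illustration; nothing like this exists in restic itself.

```shell
#!/bin/sh
# Hypothetical helper: map a backup age (in seconds) to a Nagios/Icinga
# status line and exit code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN).
# Usage: check_age AGE WARN_THRESHOLD CRIT_THRESHOLD
check_age() {
    age=$1
    warn=$2
    crit=$3

    # Anything that is not a non-negative integer is reported as UNKNOWN.
    for value in "$age" "$warn" "$crit"; do
        case "$value" in
            ''|*[!0-9]*)
                echo "UNKNOWN - age and thresholds must be non-negative integers"
                return 3
                ;;
        esac
    done

    if [ "$age" -ge "$crit" ]; then
        echo "CRITICAL - last backup is ${age}s old"
        return 2
    elif [ "$age" -ge "$warn" ]; then
        echo "WARNING - last backup is ${age}s old"
        return 1
    else
        echo "OK - last backup is ${age}s old"
        return 0
    fi
}
```

Called as check_age 3600 86400 172800, this would report OK for a one-hour-old backup, WARNING after 24 hours and CRITICAL after 48 hours.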

However, I am not even sure whether this belongs in restic core or whether it should live as a Perl/whatever script in a contrib area.



Whatever features are implemented should IMO be implemented in a neutral way, not tied to one particular external application. As you described it here, it’s very specific to Nagios. Better to make such a feature generic, with output formatted in a way that makes sense for restic and is coherent with it overall.



I’ve not used Icinga, so this may not be useful, but I’ve handled this by backing up with a shell script that then does a curl request to my Sensu API. One of the things Sensu has is a TTL for the check, so I get an alert when the TTL expires (i.e. because cron did not run my backup, or my backup failed).
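A rough sketch of that "report on success, alert on silence" pattern could look like the following. The function names, the MONITOR_URL endpoint and the JSON fields are placeholders chosen for illustration, not a documented Sensu payload; adapt them to whatever your monitoring API actually expects.

```shell
#!/bin/sh
# Placeholder endpoint: replace with your monitoring system's result API.
MONITOR_URL="${MONITOR_URL:-https://monitoring.example.com/results}"

# Send a passing check result carrying a TTL in seconds.
# The JSON field names here are assumptions, not a verified schema.
report_success() {
    curl --fail --silent --show-error -X POST \
        -H 'Content-Type: application/json' \
        -d "{\"name\": \"restic-backup\", \"status\": 0, \"ttl\": $1}" \
        "$MONITOR_URL"
}

# Usage: backup_and_report TTL_SECONDS COMMAND [ARGS...]
# Runs the backup command and reports only if it exits successfully.
backup_and_report() {
    ttl=$1
    shift
    if "$@"; then
        report_success "$ttl"
    else
        # No report is sent, so the TTL on the last result eventually expires.
        echo "backup failed, nothing reported" >&2
        return 1
    fi
}
```

From cron this might be invoked as backup_and_report 90000 restic backup /srv/data, where a TTL of 25 hours leaves some slack around a daily run.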



I think the neutral way is already covered by restic snapshots. This is also what restic-tools uses for its monitor command:



I sat down and wrote a simple check using Perl and Monitoring::Plugin:



Sorry for bumping this old thread :wink:

For a monitor shell script I tried the following approach using the popular jq tool:

# restic snapshots --json --path /srv/data | jq -r '.[-1]'
{
  "time": "2019-03-08T18:00:09.792955797Z",
  "parent": "ee047edc6dbfc50dd9b179d471f48079d4a4f0dd31bf60658766b6f9bed9e012",
  "tree": "8653335ec23446702c2f18531e9fbdb4d720b40378c0f88f7a995facb59814f9",
  "paths": [
    "/srv/data"
  ],
  "hostname": "data01",
  "id": "09a799501eb2aa3407c0c85055756275508f9c087222178e33168ea9a05f5b07",
  "short_id": "09a79950"
}

One pitfall is the fractional seconds in the ISO 8601 timestamps, which jq still cannot parse (issue):

# restic snapshots --json --path /srv/data | jq -r '.[-1].time|fromdate'
jq: error (at <stdin>:1): date "2019-03-08T18:00:09.792955797Z" does not match format "%Y-%m-%dT%H:%M:%SZ"

The following seems to work:

# restic snapshots --json --path /srv/data | jq -r '.[-1].time|strptime("%Y-%m-%dT%H:%M:%S.%Z")|mktime'

One can now use the timestamp for further checks in their shell script.
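As one possible follow-up, the sketch below turns such a timestamp into an age in seconds without relying on jq's date parsing. It assumes GNU date (the -d option) and strips the fractional seconds with sed first; the function name timestamp_age and its optional second argument are invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the age in seconds of an ISO 8601 timestamp.
# Usage: timestamp_age TIMESTAMP [NOW_EPOCH]
# Assumes GNU date; NOW_EPOCH defaults to the current time.
timestamp_age() {
    # Drop the fractional part (e.g. ".792955797") but keep the zone suffix.
    ts=$(printf '%s\n' "$1" | sed -E 's/\.[0-9]+//')

    # Convert to seconds since the epoch; return 3 if date cannot parse it.
    then_epoch=$(date -u -d "$ts" +%s) || return 3

    now_epoch=${2:-$(date -u +%s)}
    echo $(( now_epoch - then_epoch ))
}
```

With the snapshot time fetched via restic snapshots --json --path /srv/data | jq -r '.[-1].time', the result of timestamp_age can be compared against a limit such as 86400 to decide whether the last backup is older than 24 hours.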



I really like