How to ensure that the backups are working correctly?

Hi,

My hosting provider is using Restic. A week ago I ran into an issue and wanted to restore some files from the latest available backup. I almost fainted when I learned that it was dated 2021-04-16.

Once the dust had settled, I asked them whether they had a way to monitor that those daily backups ran properly, or any way to automate monitoring of their execution. The answer I got was “That is your responsibility to ensure that the backups are working correctly”. We’re three people here and don’t have time to log in to our server every day to make sure it is done properly.
What I would like is for an email to be sent, every time a backup runs, to an address we use to monitor our server, with some kind of warning when the last run is more than a day old.

By the way, I had an alternate backup through another service, which saved me.

Hi @Corobori and welcome to the Restic community! :slight_smile:

Sorry to hear that you’re having such troubles with your hosting provider in that regard.
As you already described, the solution to your problem is unrelated to restic itself and is more a question of how to operate the server and its services.
Regardless, it is something which has been discussed here many times. :slight_smile:

I would ask you to check out the forum and its existing threads and use the information there.
For example, I just searched for 'notification': Search results for 'notification' - restic forum

If you don’t find your answer in any of these threads or other threads with a different search pattern, I would recommend searching on Google for “systemd email on unit failure” which should give you a good starting point.

It depends how much, if any, control you have over your backups and whether you can run commands, or inspect logs/exit codes from the restic runs. This may be listed in your provider’s documentation or you may need to ask them.

Ideally you want to be monitoring exit codes and emailing the status for every backup (depending on how often they back up). You also want to run restic check every day or so, and restic check --read-data every so often, and report whether they exited cleanly or reported errors.
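As a rough illustration of "monitor the exit code and email the status", here is a minimal Python sketch. It assumes you are able to wrap the restic invocation yourself, which may not be the case on a managed service. The helper names (`run_command`, `build_report`, `send_report`), the addresses and the local SMTP relay are all hypothetical. The exit-code table only lists the codes documented for restic backup that I am sure of; newer restic versions define additional ones.

```python
import smtplib
import subprocess
from email.message import EmailMessage

# Exit codes documented for `restic backup` (newer versions define more).
EXIT_MEANINGS = {
    0: "success",
    1: "fatal error, no snapshot was created",
    3: "incomplete: some source data could not be read",
}


def run_command(argv):
    """Run a command and return (exit_code, combined stdout/stderr)."""
    proc = subprocess.run(
        argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return proc.returncode, proc.stdout


def build_report(label, exit_code, output, sender, recipient):
    """Build a status email; the subject makes failures easy to filter on."""
    status = "OK" if exit_code == 0 else "FAILED"
    meaning = EXIT_MEANINGS.get(exit_code, "unknown exit code")
    msg = EmailMessage()
    msg["Subject"] = f"[{status}] {label} (exit code {exit_code})"
    msg["From"] = sender
    msg["To"] = recipient
    # Only keep the tail of the log so the mail stays small.
    msg.set_content(f"{label}: {meaning}\n\n{output[-4000:]}")
    return msg


def send_report(msg, host="localhost"):
    """Send via an SMTP relay (assumption: one is listening on `host`)."""
    with smtplib.SMTP(host) as smtp:
        smtp.send_message(msg)
```

A daily cron job or timer could then call `run_command(["restic", "backup", "/srv/www"])` (the path is only an example), pass the result to `build_report` and hand that to `send_report`. The same wrapper works for restic check.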

That’s a good point I totally missed. I was under the impression that @Corobori controls all related systems and thus has control over how the backup is made. Let’s see what they respond :slight_smile:

What I (and others I know) do is monitor the snapshots/ folder in the repository and check that new files with recent enough timestamps keep appearing there. I mean the timestamps of the files themselves, which the remote restic client doing the backup has no way of controlling, so they should be reliable as long as the repository server’s clock is correct. Then of course I do regular maintenance on the repositories to check integrity and data as well. But to know that backups are performed continuously, I simply check the existence and timestamps of the snapshot files.
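The timestamp check described above could look roughly like this in Python. It is only a sketch and assumes the repository is reachable as a normal directory (for example on the repository server itself); the function names and the 26-hour threshold are my own choices, not anything restic prescribes.

```python
import time
from pathlib import Path


def newest_snapshot_age(snapshots_dir, now=None):
    """Age in seconds of the newest file in snapshots_dir, or None if empty."""
    now = time.time() if now is None else now
    mtimes = [
        p.stat().st_mtime for p in Path(snapshots_dir).iterdir() if p.is_file()
    ]
    if not mtimes:
        return None
    return now - max(mtimes)


def is_stale(snapshots_dir, max_age_seconds=26 * 3600, now=None):
    """True when no snapshot file exists or the newest one is too old.

    The default of 26 hours leaves some slack for a daily backup that
    starts a little late or takes a while to finish.
    """
    age = newest_snapshot_age(snapshots_dir, now)
    return age is None or age > max_age_seconds
```

Run something like `is_stale("/path/to/repo/snapshots")` from a scheduled job and raise an alert whenever it returns True.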

Thank you for your answer. I searched the forum before posting, but it appears I didn’t use the right keyword: I searched for “monitoring”, whereas in your first post you mentioned searching for “notification”, which seems more appropriate. I will look into it.

@ProactiveServices How much control do I have over my backups? To be honest with you, I never investigated it in detail; my mistake. When I ordered the service I was told that backups were done through Restic. I looked into how to restore files or folders and it appeared fine, but I didn’t ask how the backup was being performed. I never thought that the service might, for whatever reason, stop working. I have asked them for a way to monitor the system and the reply I got was: “There is no option for an automated monitoring tool for a Cloud backup service”. I am still a little surprised that no other client using my provider’s services has needed to monitor how their backups are performed.
What should I ask my provider in order to add the “monitoring exit codes and emailing the status for every backup” process you’re suggesting?


“There is no option for an automated monitoring tool for a Cloud backup service”
…is categorically untrue. It is not a backup service if it is not being monitored. What you have is Schrödinger’s Backup: you cannot know if it succeeded or failed until you try to restore it.

By all means keep asking them for clarification, but it sounds like they’re not interested. I would consider their backup non-existent and see how you can take backups yourself. Do you have SSH access to the service? Can you access your data via an API or method which restic supports, so that you can back it up and, crucially, test it yourself?

This is the first time I hear of a hosting provider that says they make backups of your data, but then goes on to say that it’s your responsibility to verify that the backups run as they should. I mean, what the hell. How are you supposed to do that, when you’re not the one actually running the backups. Seriously, switch to another hosting provider that’s even worth calling a hosting provider :pinched_fingers: There’s plenty of good options out there :slight_smile:


I agree with the other two here: that’s horrible business practice. Luckily the world is full of capable MSPs who get the job done. + you mentioned that you have another copy of the data somewhere else.
Time to do the spring cleaning a bit earlier than expected huh :wink:

Changing provider is something I wouldn’t like to do right now, to be honest. Moving 120+ domains from a VPS requires some planning and time that I don’t have now.

While I was searching for how to configure Restic notifications, my provider’s support wrote to “suggest configuring an S3 browser”. Let’s be honest, I don’t have the knowledge to understand what that involves. When I originally ordered the service I chose the “Managed” option this provider was offering, hoping that I wouldn’t have to deal with this sort of issue.


I mean, to be quite honest: the provider runs a service. If the service fails (making and completing a backup), then it’s the provider’s duty to ensure that they are notified of this (and probably you, the customer, as well, for transparency reasons).
All this doesn’t help you really, but honestly I don’t think any of us will be able to help you other than to look for someone else to do this for you. This really isn’t rocket science and is trivial for them to implement within ~ 30 minutes if they wanted to.

So just to be clear: you are not able to control the invocation of restic?
As you said earlier that you can log in to the server, it now seems that you have some sort of visibility.

Yet another example that the hosting provider (or at least the one you got the reply from) has no idea what they’re talking about.

Moving 12 would be hassle enough! If you have a VPS then you have enough access to the service to run commands on the system. Have you looked around the filesystem to see if the backups are visible? If you can “see” the restic repositories and can run commands, then chances are you’ll be able to address this shortfall yourself.
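If it turns out you can run restic against the repository, one way to get the "warn me when the last backup is older than a day" check is to parse the output of restic snapshots --json. Below is a minimal Python sketch of that idea. The function names are mine, and I am assuming each snapshot entry carries a `time` field in RFC 3339 format, which is what restic prints as far as I know.

```python
import json
import re
from datetime import datetime, timedelta, timezone


def parse_restic_time(value):
    """Parse an RFC 3339 timestamp, normalising the fraction to microseconds.

    restic prints up to nanosecond precision, which older Python
    versions of datetime.fromisoformat() do not accept.
    """
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    value = re.sub(
        r"\.(\d+)", lambda m: "." + m.group(1)[:6].ljust(6, "0"), value
    )
    return datetime.fromisoformat(value)


def latest_snapshot_time(snapshots_json):
    """Return the newest snapshot time, or None if there are no snapshots."""
    snapshots = json.loads(snapshots_json)
    if not snapshots:
        return None
    return max(parse_restic_time(s["time"]) for s in snapshots)


def backup_is_overdue(snapshots_json, now, max_age=timedelta(days=1)):
    """True when there is no snapshot or the newest one is older than max_age."""
    latest = latest_snapshot_time(snapshots_json)
    return latest is None or now - latest > max_age
```

You would feed it the text printed by restic snapshots --json (with the repository and password supplied, for example through the RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE environment variables) and pass `datetime.now(timezone.utc)` as `now`. With a last snapshot from 2021-04-16 this returns True on any later date, which is exactly the situation that went unnoticed here.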

The reply from the hosting provider makes me wonder who manages the password required to access the repository, and also the credentials to access the backup data in the S3 storage. Do you have a copy of these, or does the hosting provider have one? Without the repository password it will be impossible to access any data. (It is really a horrible feeling when you realize that the password for a backup was only stored in the VM that is currently not accessible…)
