Does restic snapshot provide crash consistency?

I wanted to check whether snapshots created by restic are crash consistent or not.

Hi @shubhag and welcome to the restic community! :slight_smile:

Can you describe what you expect from something being crash consistent? What are your expectations?
Do you mean crash consistent during a backup?

Most backup vendors provide crash-consistent snapshots.

From google:

A crash-consistent snapshot is a snapshot that captures a virtual machine's data at the same time, preserving the write order. This means that files that depend on each other are backed up at the same point in time. 

Does restic snapshot provide this guarantee?

This needs to be clarified. There’s a ton of backup software out there, and I find it hard to believe that this statement is true. I suspect you are looking at a limited type of backup software rather than all of them. Perhaps those that are commercially supported and primarily made for backing up virtualized environments?

This suggests that when you say “most backup vendors”, you are looking at vendors of software that is specifically designed to back up virtual machines, and that context is a bit different from the one where you normally use restic.

As long as restic can run on the platform and OS at hand, it can of course back up virtual machine disk images as well, but the crash consistency you are asking about is a more complex matter than just backing such files up.

First of all, restic is file-level backup software. Unlike, for example, Veeam Backup & Replication, it is not designed to back up virtual machine images on a hypervisor, although it can of course do that too given the right circumstances.

In order to take a “crash consistent” copy or backup of a file (e.g. a virtual machine’s disk image, or just any regular file for that matter), the backup software has to make sure that the file doesn’t change while it’s being read. One way to do that is to use filesystem snapshots, where the operating system is asked to create a snapshot of the filesystem, and then no further changes are made to that snapshot of the filesystem, which the backup software can then read at its own pace and be sure to not see changing data. Afterwards, the snapshot is normally deleted.
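A minimal sketch of that create, back up, delete sequence, assuming ZFS and a restic repository on local disk. The dataset `tank/data`, the snapshot name `restic` and the repository path are hypothetical; the `run` parameter only exists so the sequence can be tried without touching a real system.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def fs_snapshot(create_cmd, delete_cmd, run=subprocess.run):
    """Create a filesystem snapshot, then always delete it afterwards."""
    run(create_cmd, check=True)
    try:
        yield
    finally:
        # Runs even if the backup fails, so no stale snapshot is left behind.
        run(delete_cmd, check=True)


def backup_from_snapshot(create_cmd, delete_cmd, backup_cmd, run=subprocess.run):
    """Back up the frozen snapshot contents instead of the live filesystem."""
    with fs_snapshot(create_cmd, delete_cmd, run=run):
        run(backup_cmd, check=True)


# Hypothetical example values: adjust dataset, snapshot name and repo path.
ZFS_EXAMPLE = {
    "create_cmd": ["zfs", "snapshot", "tank/data@restic"],
    "delete_cmd": ["zfs", "destroy", "tank/data@restic"],
    "backup_cmd": ["restic", "-r", "/srv/restic-repo", "backup",
                   "/tank/data/.zfs/snapshot/restic"],
}
```

Calling `backup_from_snapshot(**ZFS_EXAMPLE)` would run the three commands in order. The point is only that restic reads from the snapshot path, which does not change while the backup runs.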

If we take Veeam Backup & Replication as an example (as it is designed for backing up virtual machines), it supports this by communicating with e.g. VMware vSphere, telling it to create a snapshot of each virtual disk it wants to back up. vSphere takes the snapshot and then writes any additional changes to the disk to a different file from the one the snapshot refers to. Veeam B&R then backs up the file referenced by the snapshot, and then asks vSphere to delete the snapshot again.

Is the above crash consistent? Sure, it can be, depending on the definition of “crash consistent”. And FWIW, restic supports filesystem snapshots on Windows (using Windows’ Volume Shadow Copy Service) via the --use-fs-snapshot option to the backup command.
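As a rough illustration, here is how the command line for such a backup could be assembled. The helper name `restic_backup_cmd` and the example paths are made up for this sketch; only `-r`, `backup` and `--use-fs-snapshot` come from restic itself.

```python
import sys


def restic_backup_cmd(paths, repo, use_fs_snapshot=None):
    """Build a restic backup command, adding --use-fs-snapshot on Windows."""
    if use_fs_snapshot is None:
        # The option relies on VSS, so default to it only on Windows.
        use_fs_snapshot = sys.platform == "win32"
    cmd = ["restic", "-r", repo, "backup"]
    if use_fs_snapshot:
        cmd.append("--use-fs-snapshot")
    cmd.extend(paths)
    return cmd
```

For example, `restic_backup_cmd([r"C:\Users\me\Documents"], r"D:\restic-repo", use_fs_snapshot=True)` yields the argument list for `restic -r D:\restic-repo backup --use-fs-snapshot C:\Users\me\Documents`.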

However, this will not guarantee that your database or other types of application specific data/files are consistent and can be used after a restore. Some applications write their data in a way that does not guarantee consistency even if you use filesystem snapshots, and in these cases you need more “application aware processing” (as Veeam B&R happens to call it). This means that (simplified) in the example of Veeam B&R, it asks specific software (for which it supports “application aware processing”) inside the virtual machine to write its data in a consistent way, so that the filesystem snapshot contains the latest consistent write of data from the application. This prevents the situation where the filesystem snapshot might contain inconsistent data from the application. Note that Veeam B&R does not support all software in this regard, so even when there is such a feature in the backup software, you often end up with cases where you might have to implement your own support for specific software you run (be it in a virtual machine or not).

Restic does not have this “application aware” type of processing, since that is simply outside of the scope of restic as it would be a huge undertaking to support various software and coordinate them writing their data in a guaranteed to be consistent way in relation to a snapshot being created. Not to mention a lot of software might not provide a way for a user/client to ask for such action to be taken at all. Therefore, the way to deal with this potential problem is to e.g. dump your database before taking the backup of it. And this is true for a lot of other file-level backup software like restic.
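One way to do the dump-first approach, sketched under assumptions: the dump tool writes to standard output and restic reads it via `--stdin`. The database name `mydb`, the file name `mydb.sql` and the repository path are hypothetical, and `pg_dump` is just one example of a dump tool.

```python
import subprocess


def backup_dump_via_stdin(dump_cmd, backup_cmd):
    """Pipe the output of dump_cmd into backup_cmd; raise if either fails."""
    dump = subprocess.Popen(dump_cmd, stdout=subprocess.PIPE)
    try:
        backup = subprocess.run(backup_cmd, stdin=dump.stdout)
    finally:
        # Close our copy of the pipe and collect the dump's exit code.
        dump.stdout.close()
        dump_rc = dump.wait()
    if dump_rc != 0:
        raise RuntimeError(f"dump command failed with exit code {dump_rc}")
    if backup.returncode != 0:
        raise RuntimeError(
            f"backup command failed with exit code {backup.returncode}")


# Hypothetical database name, file name and repository path.
EXAMPLE_DUMP_CMD = ["pg_dump", "mydb"]
EXAMPLE_BACKUP_CMD = ["restic", "-r", "/srv/restic-repo", "backup",
                      "--stdin", "--stdin-filename", "mydb.sql"]
```

Note that if the dump fails halfway, the backup command may already have stored the partial output, so a failure here should be treated as "this snapshot is not usable" rather than ignored.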

Feel free to correct me if any of what I wrote is wrong. I tried to simplify it a bit and not get into too much detail and specifics about implementations, regardless of software.


:100:

One more small note, just to be clear: What rawtaz is saying is that restic can absolutely back up and restore all kinds of data. You “just” have to build a framework around it, which handles all the things that need to be taken care of so a backup and restore can function safely for this kind of data.

As restic and the repo format is open source, any company could spend some of that sweet sweet VC money and do exactly that heh :wink: hell… why not open source that one too so the whole community could benefit from it.

Well if you take a VM snapshot you can restic backup that, of course. Depending on what’s happening on the machine, deduplication should work anyway. And ZFS or btrfs snapshots of file systems are available as open source as well. I think it rather comes down to deciding on your use case as @rawtaz described in detail above.

Is it worth backing up the whole VM so you can bring it back quickly in case you need it? Or do you have a recipe to reinstall your server automatically if need be? Then all you need is your data and a DB dump and it might even be a quicker restore depending on where you keep your data.

And, also as @rawtaz already said: backing up DBs in a consistent state can be tricky even with the specialest of special software because the software will not understand your setup for you.

Apologies for my ignorance on this subject. I am new to backup and looked at GCP disk backup, but wanted to explore a solution that is cloud agnostic. Hence, I wanted to use Velero along with restic.

I have one follow up question:
Let’s say there are 3 files f1, f2 and f3 present on disk.
If we take a snapshot at a specific moment (t1) using restic, I want to know if the snapshot includes the data of f1, f2, and f3 as they were at t1, ensuring that the backup represents a consistent state of the disk at that specific time. It is required that any writes after t1 are not included for all the files.

Only if they all stay unchanged until restic backup finishes.

As already mentioned in this thread, if you want a consistent backup you have to ensure that all data stays unchanged for the duration of the backup.
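If you cannot freeze the data, you can at least detect that it moved under you. The sketch below is only a heuristic and not something restic does for you: it compares size and modification time of each file before and after an arbitrary `action` (for instance a function that runs the backup). The function names are made up for this example.

```python
import os


def fingerprint(paths):
    """Map each path to (size, mtime in ns), or None if the file is missing."""
    result = {}
    for path in paths:
        try:
            st = os.stat(path)
            result[path] = (st.st_size, st.st_mtime_ns)
        except FileNotFoundError:
            result[path] = None
    return result


def files_changed_during(paths, action):
    """Run action() and return the paths whose fingerprint differs afterwards."""
    before = fingerprint(paths)
    action()
    after = fingerprint(paths)
    return sorted(p for p in paths if before[p] != after[p])
```

An empty result does not prove the backup is consistent (a file can be rewritten with the same size and timestamp), but a non-empty one tells you that f1, f2 and f3 in the snapshot may not all be from the same moment.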

Here is quite a good explanation, with examples, of how to achieve it using filesystem snapshots:

Please note that this is by no means only a restic limitation. It is a general topic that applies to any backup software.

If you go down the filesystem snapshot path, this is unfortunately an area where restic is not the best choice. For now it does not handle filesystem snapshots nicely: backups can be slow and produce tons of unnecessary metadata, which leads to a further performance penalty for later backup operations. It is an old and known problem, discussed e.g. here: Restic 0.10.0 always reports all directories "changed", adds duplicate metadata, when run on ZFS snapshots · Issue #3041 · restic/restic · GitHub. Hopefully it will be fixed one day.


Sidenote: some apps feature some kind of “maintenance mode” for these situations that you can turn on before starting the backup and turn off afterwards. I use that to back up Nextcloud, for instance.