Advice on Backing Up a Drive Containing Btrfs Snapshots

Thanks in advance, everyone, for any advice. Sorry for the long post, but I'm trying to give a good understanding of exactly what I have and what I want to achieve:

I’m looking into using Restic to implement an encrypted backup of data to an untrusted off-site server. However, this data is already stored on a Btrfs drive that has historical snapshots. So I’m looking for some advice on overall backup design and settings for Restic itself considering that Restic also has its own snapshots.

For background this is what I have today:

  • NAS1 = Running RAID6 on Btrfs with snapshots
  • NAS2 = Running RAID6 on Btrfs with snapshots
  • VPS = Off-site configurable server with significant storage

Each NAS is a ReadyNAS with an x86 processor that I believe runs Debian.

The backup process I use today is:

  • NAS1 -> NAS2 via incremental rsync

The reasoning is:

  • NAS1 with RAID 6 and Btrfs snapshots takes care of most concerns such as drive failures etc., and even file versioning problems, because the Btrfs snapshots allow a recovery to a past state. However, this leaves the setup vulnerable to complete hardware failure of NAS1.

  • The rsync to NAS2 with RAID 6 and Btrfs snapshots essentially duplicates NAS1, except the NAS1 snapshots themselves are not synced to NAS2 via rsync. However, the NAS2 has its own Btrfs snapshots, and therefore NAS2 ends up being a complete backup of NAS1, except that there is a slight versioning difference in the snapshot data due to the periodic rsync runs. This setup provides a hot spare of NAS1 as the data can be accessed on NAS2 without any significant recovery time required etc. The slight snapshot versioning difference is an acceptable compromise because I can’t think of a better convenient way to have a hot spare. This then solves the problem of complete NAS1 failure, with quick failover. However, this leaves the setup vulnerable to a disaster situation such as a fire etc. that destroys both NAS devices.

Desired next step to add disaster recovery

  • Run Restic on NAS2 to encrypt and store the data off-site on an untrusted VPS.

Questions

  • Given that the data already has Btrfs snapshots, what will Restic actually store/back up?

    • I think it will probably behave similar to the rsync setup I have between NAS1 & NAS2, and only backup the current live version of the data unless I somehow specifically configure it to include the snapshots, but I’m not sure if this is correct, and exactly how to include/exclude the snapshots with Restic settings.
  • If there is the option to have Restic backup the Btrfs snapshots as part of the job, then:

    • Are there any considerations in the Restic settings to make sure the Restic deduplication works well with the Btrfs snapshots so the data size doesn’t grow exponentially, and/or to handle the time stamps etc., and to make sure it restores as usefully as possible?
    • Is there any reason to have Restic keep any snapshots of its own, and any reason to enable/use one set of snapshots over the other? For example:
      • Is it more resource or time efficient to just rely on the already existing Btrfs snapshots within the Restic backup, and/or would it use less computing power, RAM, upload and/or download time etc.?
      • If I had backed up the Btrfs snapshots themselves with Restic, when restoring from the off-site Restic backup, would the Btrfs snapshots restore correctly to the NAS such that the NAS Btrfs file system will recognize and accept the restored snapshots as its own, and then just move forward working as normal from there with all that efficient snapshot history? Or is this not going to work on the Btrfs file system, and I might as well just restore a snapshot produced by Restic, and start anew on the NAS without any historical snapshots on the NAS itself from that point forward?

Any advice and thoughts on the overall design or the specific Restic settings and/or which folders to backup and restore and why would be greatly appreciated. I should also mention that I’m mostly a Windows user, and have very little experience in Linux so doing things in Linux is always a learning experience.

Thanks in advance!

Restic will back up whatever you give it to back up. If that’s a path in a live filesystem, or a path from a snapshot, same thing as long as it’s presented to restic as a filesystem.

I haven’t used BTRFS snapshots enough to know how you do the same thing as you can easily do with ZFS, but here’s how you’d access a snapshot taken by ZFS - it’s a simple path, nothing more.

Assuming BTRFS has an equivalent way to access your snapshots, what about just telling restic to back up the path of the new snapshot every time you have created a new snapshot? If BTRFS doesn’t have a feature such that you can access the snapshots, tough luck (and crappy FS).

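Deduplication happens at the level of chunks of file content, not whole files or paths, so as long as you present a filesystem to restic backup, identical data that shows up under several snapshot paths is only stored once. Nothing to worry about.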
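To illustrate the idea, here is a toy sketch of content-addressed storage. It is not restic's actual code: it assumes tiny fixed-size chunks for readability (restic picks chunk boundaries based on content), and the file names and contents are made up. The point is only that a chunk already in the repository is not stored again, whichever path it came from.

```python
import hashlib

CHUNK_SIZE = 4  # tiny fixed-size chunks, for illustration only


def store_files(files, repo):
    """Add each file's chunks to repo (sha256 hex -> bytes); return bytes newly stored."""
    added = 0
    for data in files.values():
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            key = hashlib.sha256(chunk).hexdigest()
            if key not in repo:  # only unseen content costs space
                repo[key] = chunk
                added += len(chunk)
    return added


repo = {}
# Hypothetical contents: the live file, an identical copy in one snapshot
# folder, and an older version whose last chunk differs.
live = {"nas1/report.txt": b"AAAABBBBCCCC"}
same = {"snapshot/2020/nas1/report.txt": b"AAAABBBBCCCC"}
older = {"snapshot/2019/nas1/report.txt": b"AAAABBBBDDDD"}

first = store_files(live, repo)    # everything is new
second = store_files(same, repo)   # nothing new to store
third = store_files(older, repo)   # only the changed chunk is new
print(first, second, third, len(repo))
```

Backing up the live folder and then a snapshot folder with mostly the same files should behave along these lines: the second run mostly finds data it already has.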

Restic will keep snapshots; every backup run you do becomes a snapshot. That’s not something you can, need, or should want to avoid.

I’m getting the feeling that you’re saying you can back up the snapshots themselves (as files?). I personally wouldn’t do that. I would give restic a representation of the files as a filesystem, not some binary blob being a snapshot. The reason for this is that then I can browse and restore parts of the repository when I need to, instead of having to go through hoops to restore a snapshot on a very specific filesystem and then go there to get my data.

In summary, keep it simple. If you can access the snapshots in your BTRFS by path, back those up.

I think perhaps some of my detailed questions above relate to how I see these Btrfs snapshots on my systems, and lack of understanding of what Restic will do with them.

For example, the NAS2 folder “b”, to which I rsynced all the original data from NAS1, contains two folders: “nas1” and “snapshot”.

The folder named “nas1” here is the folder from the NAS1 that I asked rsync to send to the NAS2, and contains the last live version of the data on the NAS1 that rsync sent. Somehow the rsync process does not send the NAS1 Btrfs snapshots themselves to NAS2, which is likely due to how the Btrfs filesystem and/or rsync works. The folder “snapshot” displays all of the snapshots that NAS2 took of its own “b” folder, and this “snapshot” folder contains numerous time stamped named folders.

Each of these timestamped folders within the top-level “snapshot” folder contains two folders identical to what is contained within the “b” folder itself (i.e. “nas1” and “snapshot” again):

The “nas1” folder in this timestamped named folder contains the actual data from the NAS1 that was live as of the timestamp.

The “snapshot” folder in this timestamped named folder contains nothing.

So, if I ask Restic to back up the NAS2 “b” folder, which contains both the last live “nas1” folder and the “snapshot” folder containing all the actual NAS2 Btrfs snapshots, then I believe that Restic is likely going to back up the whole “b” folder including all the NAS2 Btrfs snapshots. Conversely, if I ask Restic to back up the “nas1” folder only, then I doubt it will include the NAS2 Btrfs snapshots, but it will back up the last live version of the data and make its own snapshots of that data that can be pruned as desired.

Ideally, when I restore the Restic backup, it would restore the last live version of the data and all the NAS2 Btrfs snapshots, because that would be very convenient, but only if the Btrfs file system actually handles these restored snapshots as its own snapshots. So I am interested in asking Restic to back up the folder “b” if it makes sense. But then come the questions I ask above about what the Btrfs file system would do with this restored data, and Restic process efficiency etc.

Adding to my concerns is the fact that when I ask Windows how large the NAS2 Btrfs “snapshot” folder is, it tells me it is many TB. However, NAS2 itself reports that the “nas1” folder is 411GB, and that all the snapshot data on the NAS2 system is only 246GB. So I obviously don’t want any situation where the NAS2 Btrfs snapshots are treated as orders of magnitude larger than they really are, the way Windows reports them, in either the Restic backup or the restore process.

Let’s simplify this a bit.

  1. You want to have a local (AKA an on-site) backup of NAS1. You are currently doing this by syncing, which isn’t really backup, because it doesn’t feature things like snapshots, but your NAS2 complements the sync by making snapshots of the synced data, so effectively in the end you have a copy of the data from NAS1 and you can restore that data to different points in time. This is all good.

  2. You want to have a non-local (AKA off-site) backup of the data from NAS1. You feel confident that the on-site copy is solid because you trust rsync and you think BTRFS on NAS2 will keep that copy of the data in good health. All of this is fine.

If to solve #2 you simply use restic to make off-site backups of the nas1 folder on NAS2, at regular intervals, you will solve the problem. You will have an off-site copy of the data, and you will be able to restore the files to the points in time that you took the backups (since restic produces its own snapshot for every time you back up).
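As a minimal sketch of what that could look like, the helpers below just build the two restic command lines: one to back up the nas1 folder, one to thin out old restic snapshots. The repository URL, the source path and the retention numbers are placeholders I made up, not recommendations; restic will also need the repository password, for example via the RESTIC_PASSWORD_FILE environment variable.

```python
import shlex


def backup_command(repo, source):
    """Command line for one restic backup run of a single folder."""
    return ["restic", "-r", repo, "backup", source]


def forget_command(repo, daily, weekly, monthly):
    """Command line that applies a retention policy and prunes unreferenced data."""
    return [
        "restic", "-r", repo, "forget",
        "--keep-daily", str(daily),
        "--keep-weekly", str(weekly),
        "--keep-monthly", str(monthly),
        "--prune",
    ]


# Hypothetical values: adjust to the real VPS and the real path on NAS2.
REPO = "sftp:backup@vps.example.com:/srv/restic-repo"
SOURCE = "/data/b/nas1"

for command in (backup_command(REPO, SOURCE), forget_command(REPO, 7, 4, 12)):
    print(shlex.join(command))
```

Run from cron (or whatever scheduler the NAS offers), each backup run becomes one restic snapshot, and the forget run decides how many of them are kept.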

What this hasn’t solved is your request to also save a copy of the snapshots folder on NAS2, and be able to restore backups of that folder back into e.g. NAS2 in such a way that the restored data becomes BTRFS snapshots again.

My question is: why do you need that last part? I can’t think of a reason why you’d need the BTRFS snapshots if you have restic snapshots of the same data (they’re both snapshots of the nas1 folder).

Very good question. Let me explain:

  1. If I had always had Restic running for the off-site backup in combination with the NAS1 & NAS2 rsync backup with Btrfs snapshot process described, you are 100% correct that there is no need, from a data preservation perspective, to have the off-site backup contain the NAS2 Btrfs snapshots, as Restic effectively would have the same data to provide.

    However, I now have all of these NAS2 Btrfs snapshots that allow me to roll-back files for a number of years of history. So if I just now start using Restic to backup only the “nas1” folder, I will only then have the ability to roll the system back to {Now} if an off-site Restic recovery is needed in the future.

    However, if I back up the NAS2 Btrfs snapshots, then I will have many years of roll-back history from {Now} into the past intact in the case of an off-site Restic restore in the future. (Makes me think a Btrfs snapshot to Restic snapshot converter would be interesting.)

  2. The second reason to have the NAS2 Btrfs snapshots available is that end users can easily access the NAS2 Btrfs snapshots Read Only on their Windows machines via a mapped network drive (if I allow), in their normal file explorer windows. They can then search through the historical timestamped folders and find and restore specific versions of files if needed with no intervention from me.

    So in the case of an off-site Restic restore, if I only choose to back up the “nas1” folder, then that user ability is lost from that point forward. Not that the data is gone, but I think it will be significantly less efficient to have me fetch specific Restic snapshots for them to hunt within, and then, when they find that the first specified date is still not what they want, have me fetch a snapshot from an earlier date, etc. That is a lot less workable.

So that is why having both the data and the Btrfs snapshots to be restored seems more ideal right now. But I admit that at some point, maybe many years in the future, the value of my point 1, will diminish, but I think point 2 will remain.
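Thinking about the “converter” idea from point 1, a rough sketch might walk the timestamped folders oldest first and back up each one as its own Restic snapshot, using restic’s --time option so the Restic snapshot carries the original date. Everything specific here is an assumption on my part: the folder name format, the paths, and the tag are hypothetical, and I have not tested this against a real repository.

```python
import shlex
from datetime import datetime

# Hypothetical folder name format; adjust to whatever the NAS really uses.
FOLDER_FORMAT = "%Y_%m_%d__%H_%M_%S"


def conversion_commands(repo, snapshot_root, folder_names, subfolder="nas1"):
    """Build one 'restic backup' command per timestamped folder, oldest first."""
    stamped = []
    for name in folder_names:
        try:
            stamped.append((datetime.strptime(name, FOLDER_FORMAT), name))
        except ValueError:
            continue  # skip folders whose names do not match the format
    commands = []
    for taken, name in sorted(stamped):
        commands.append([
            "restic", "-r", repo, "backup",
            "--time", taken.strftime("%Y-%m-%d %H:%M:%S"),
            "--tag", "from-btrfs",
            f"{snapshot_root}/{name}/{subfolder}",
        ])
    return commands


# Made-up example names; on the NAS these would come from listing the folder.
names = ["2021_06_01__00_00_05", "2019_01_01__00_00_05", "not-a-snapshot"]
for command in conversion_commands("/srv/restic-repo", "/data/b/snapshot", names):
    print(shlex.join(command))
```

Each Restic snapshot made this way would record a different source path, so I would want to check how that looks when browsing or restoring before relying on it.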

Okay, thanks for explaining.

So, your snapshots can be accessed as regular files and folders, from what I understand (since you say your users can browse them over the network). If you point restic to back up both the nas1 folder and the snapshots folder, I think you’ll be fine on the backup phase. Restic will of course deduplicate the data and not transfer or store the same set of bytes multiple times.

However, the remaining question is the restore… BTRFS snapshots are a BTRFS thing. Restic doesn’t know about them or have any special feature for them. If you were to restore e.g. the nas1 folder and the snapshots folder, I would presume that you will end up with those TBs of data, because what restic sees when it backs them up, and hence also restores when you ask it to restore, is multiple folders containing the same data. It will restore that, the way it sees it when it backs up.
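To put rough numbers on that, here is a back-of-the-envelope sketch using the two figures from earlier in the thread (411GB for nas1, 246GB of snapshot data). The snapshot count and the size each snapshot folder appears to have are placeholders I made up, since the thread doesn’t give them.

```python
def restored_size_gb(live_gb, snapshot_count, apparent_snapshot_gb):
    """Size if every snapshot folder is written back as a full, independent copy."""
    return live_gb + snapshot_count * apparent_snapshot_gb


def shared_size_gb(live_gb, snapshot_unique_gb):
    """Size when unchanged data is shared between snapshots, as NAS2 reports it."""
    return live_gb + snapshot_unique_gb


# 411 and 246 come from the thread; 50 snapshots of about 400GB each is hypothetical.
full_copies = restored_size_gb(411, 50, 400)
shared = shared_size_gb(411, 246)
print(f"restored as plain folders: about {full_copies} GB")
print(f"stored as Btrfs snapshots: about {shared} GB")
```

With those assumed numbers a plain restore lands around 20TB, against well under 1TB on disk today, which matches the kind of gap Windows is showing.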

I guess the snapshots folder is read-only? I’d be surprised if there was a way to put data into snapshots in btrfs in a “filesystem” way (copying stuff into the snapshots folder).

I haven’t heard of ways to restore snapshots, so honestly I don’t have any good ideas on how to do that. I would probably keep the on-site backup for this, and use the off-site one as more of a “worst case scenario” backup, accepting the loss of BTRFS snapshots and history accessible that easily by users in the event that I need to restore from the off-site backup. My users wouldn’t have a problem with it, but I understand this might not suit everyone.

On a related note, I’m wondering if you can share a mounted (see restic mount) repo to your users. I’m thinking it should be possible.

Unless someone else knows a way to restore BTRFS snapshots (even with some other tool than restic), I think you’re out of luck with that. The closest you’d get is to simply mount or build a simple web GUI for restic so your users can browse either of those.

Please note that there might be ways in BTRFS to restore snapshots, I never use it so I don’t know. Check the docs for it. If e.g. it’s possible to grab a snapshot from BTRFS as a stream of data, you can perhaps feed that into restic using --stdin, and then if there’s a way to restore a snapshot in BTRFS you could use restic dump and feed the output directly into BTRFS.
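A hedged sketch of that stream idea, in case someone wants to try it: the helpers below only build the two shell pipelines as strings, they do not run anything. The paths, the repository location and the stream file name are hypothetical, and I haven’t tested this. As far as I know btrfs send only works on read-only snapshots, and sending each snapshot in full like this would probably not make the received snapshots share data again; that would need incremental sends (btrfs send -p), which this sketch does not attempt.

```python
import shlex


def stream_backup_pipeline(repo, snapshot_path, stream_name):
    """Pipeline that feeds a btrfs send stream into restic via --stdin."""
    send = ["btrfs", "send", snapshot_path]
    store = ["restic", "-r", repo, "backup",
             "--stdin", "--stdin-filename", stream_name]
    return f"{shlex.join(send)} | {shlex.join(store)}"


def stream_restore_pipeline(repo, restic_snapshot, stream_name, target_dir):
    """Pipeline that dumps the stored stream from restic into btrfs receive."""
    dump = ["restic", "-r", repo, "dump", restic_snapshot, stream_name]
    receive = ["btrfs", "receive", target_dir]
    return f"{shlex.join(dump)} | {shlex.join(receive)}"


# Hypothetical names throughout.
print(stream_backup_pipeline(
    "/srv/restic-repo", "/data/b/snapshot/2021_06_01__00_00_05", "b-2021_06_01.btrfs"))
print(stream_restore_pipeline(
    "/srv/restic-repo", "latest", "b-2021_06_01.btrfs", "/data/restored"))
```

The downside is the one I mentioned before: what sits in the repository is an opaque stream, so you can’t browse or restore single files from it with restic.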

Another approach could be to do something like ZFS send|receive to send snapshots from your NAS to an off-site server. Assuming BTRFS supports this in a way that’s useful to you. This would be an alternative to using restic, the idea would be to send the BTRFS snapshots off-site.

I appreciate the thoughts. This is of course complex.

I agree that it looks like backing up the “snapshot” folder might result in terabytes of restored data if Btrfs doesn’t just accept those original Btrfs snapshots as its own snapshots when they appear back on the NAS after a Restic restore. If Btrfs doesn’t treat them like snapshots, then the NAS doesn’t have the capacity to take all those full timestamped copies of the “nas1” data back, so I’m not sure exactly what it would do on an attempted restore.

With regard to your comments on trying a native Btrfs solution to back up the Btrfs snapshots to another NAS, Btrfs does have commands that apparently send snapshots as a stream of data:

btrfs send and btrfs receive

And the ReadyNas does have what I think is a poorly documented feature called ReadyDR which I believe does something along these lines.

However, I already mostly solved the NAS1 to NAS2 on-site backup, which creates the NAS2 Btrfs snapshots that give a hot backup on NAS2, instead of doing something complex with something like the ReadyDR feature that I understand requires a restore action resulting in down-time etc.

The crux of my issue now is that I really need to get the data encrypted and backed up off-site to an untrusted server. I need to be in a position to not have to trust the off-site server, which means sending the snapshots there natively doesn’t seem like a good solution, but Restic does seem to be built for exactly this situation.

It’s just these remaining detailed specifics of the interaction of Restic and Btrfs that I don’t know enough about.

Perhaps some users who are using Btrfs and backing up snapshots will see this and comment on their experience to know if what I’m looking for is possible, or if I will have to compromise and just backup the “nas1” folder and lose the existing history and convenience of the Btrfs snapshots upon an off-site restore.

I’d aim for just backing it up without the BTRFS snapshots. If your users need to go back in time to find deleted files so often, perhaps you are having other problems that need to be solved.

The “other problems” making the Btrfs snapshots valuable are the people changing their minds about modifications that they made to a file X months ago, and wanting to revert :grinning:

I’m probably going to look at implementing the “nas1” only backup method at first, but am holding out for any info I can find on a more “optimal” solution that integrates the Btrfs snapshots, and/or will spend some time testing the various methods when I get more time, to see if I can get it to work.