I’ve seen a few posts hinting at this possibility but I don’t know if there are any footguns. I’m currently running ZFS on a few machines and have snapshotting set up for local restore. Now I want to set up offsite backups. ZFS-based solutions are all very expensive so I’d like to use something like B2/S3.
I know I can just back up the filesystems normally, and maybe I should, but is there any benefit to backing up the /.zfs/snapshot directory? I don’t think I care that ZFS snapshots aren’t Restic snapshots. My aim is to capture the filesystem in a consistent state, then restore the snapshots into new zpools on recovery. I know snapshots can still miss writes, but my hope is that using the ZFS snapshots as the backup source will minimize the likelihood of that inconsistency.
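Concretely, what I'm picturing is something like the following sketch. Dataset, mountpoint, and repo names are all placeholders for my setup, and `DRY_RUN=1` (the default here) just prints the commands instead of running them:

```shell
#!/bin/sh
# Rough sketch of one backup run: snapshot the dataset, back up the
# snapshot's read-only view under /.zfs/snapshot, then drop the snapshot.
# DATASET/MOUNTPOINT/REPO are placeholders; with DRY_RUN=1 the commands
# are only printed, not executed.
set -eu

DATASET="tank/data"
MOUNTPOINT="/tank/data"
REPO="/path/to/restic-repo"
SNAP="restic-$(date +%Y%m%d)"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

run zfs snapshot "${DATASET}@${SNAP}"
run restic -r "$REPO" backup "${MOUNTPOINT}/.zfs/snapshot/${SNAP}"
run zfs destroy "${DATASET}@${SNAP}"
```

The idea is that restic only ever sees the frozen snapshot directory, never the live filesystem.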
I read that at one point this required RESTIC_FEATURES=device-id-for-hardlinks. Is that still the case, and are there any known issues with this alpha feature?
Anything else I should be aware of? I’ve been looking into z3 for this, but I’m more familiar with Restic and would rather use it if possible, and assuming there aren’t any disadvantages.
I am actually puzzled that people run their backups (whether with restic or any other program) without using filesystem snapshots :) I use them on my laptop (APFS) and on servers (ZFS/BTRFS). No major issues with using restic for such backups.
There is one problematic restic behaviour with snapshots (how much it matters depends on data size): as the device ID changes (it is different for every snapshot mount), restic thinks that all data changed and has to plough through all of it. Of course deduplication then kicks in and only extra metadata is saved. It was not an issue for me, but I can easily imagine it being a real problem, especially for big datasets.
One minor annoyance is that the leading path prefix still can't be stripped off, so my /Users/kptsky directory is backed up as /snapshot/mountpoint/Users/kptsky. It is a work in progress, though.
BTW, the above made me start using rustic (a restic-repo-format-compatible alternative written in Rust), where there is no device ID issue and path stripping is already implemented. So at the moment I am actually using both, as I'm too lazy to migrate everywhere I have restic running.
As for hardlinks, I do not use them at all, so I am not sure why RESTIC_FEATURES=device-id-for-hardlinks might be required. As I understand it, hardlink support is not really built into the restic repo format but is only achieved by some heuristics during restore. So I consider it a bit risky to rely on.
You’re using rustic at least in some places to back up ZFS snapshots?
You’re just backing up the /.zfs/snapshot directory?
You’re not doing anything different/unusual to account for ZFS snapshots interacting with restic/rustic?
If so, thanks, knowing that is helpful. Glad I don’t need another tool. One last question: what is restoring like? Do you just restore the snapshots like normal to their correct location and ZFS detects them as snapshots? Or is there more to it?
I use both - restic and rustic. Both work. But rustic has better options, IMO, when working with filesystem snapshots.
No. I back up the latest ZFS snapshot's content only.
Not sure what you mean by it. I use ZFS snapshots to ensure restic backup consistency.
That is not how it works. ZFS snapshots and restic snapshots are two different things. You can NOT restore a ZFS snapshot. You can only restore files and directories.
If your goal is to be able to restore ZFS snapshots, then indeed you need something like z3, which effectively provides ZFS replication using raw snapshot images saved to the cloud. You could achieve it with restic too by sending snapshots to files you can back up like anything else. To do this right would require a bit of a wrapper and some ZFS logic, as most likely you want it to be incremental. The biggest problem with this approach (the same with z3) is that you cannot list or restore individual files, only full snapshots.
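For illustration, such a wrapper could be built around restic's `--stdin` mode, which stores a piped stream as a single file in the repository. A rough, untested sketch — dataset and snapshot names are placeholders, and the command strings are only printed here, not executed:

```shell
#!/bin/sh
# Sketch: store raw ZFS send streams in restic via --stdin. You get whole
# snapshots only - individual files inside them cannot be listed or
# restored. All names below are placeholders.
set -eu

DATASET="tank/data"
REPO="/path/to/restic-repo"
PREV="daily-2024-05-01"   # previous snapshot (base for incremental send)
CURR="daily-2024-05-02"   # snapshot to back up

# Full stream of the current snapshot:
FULL_CMD="zfs send ${DATASET}@${CURR} | restic -r ${REPO} backup --stdin --stdin-filename ${CURR}.zfs"

# Incremental stream relative to the previous snapshot:
INCR_CMD="zfs send -i ${DATASET}@${PREV} ${DATASET}@${CURR} | restic -r ${REPO} backup --stdin --stdin-filename ${PREV}..${CURR}.zfs"

echo "$FULL_CMD"
echo "$INCR_CMD"

# To restore, dump the stored stream back into zfs receive, e.g.:
#   restic -r "$REPO" dump latest "${CURR}.zfs" | zfs receive tank/restored
```

The ZFS logic I mentioned (tracking which snapshot was the last one sent, pruning old streams, verifying receive on the other side) is exactly the part you would have to script yourself.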
restic knows nothing about ZFS structure, including its snapshots. It is not a filesystem-level backup, only a content-level one.
Not sure what you mean by it. I use ZFS snapshots to ensure restic backup consistency.
Sorry, I see how that was unclear. There’s nothing you need to do specifically for restic to indicate that it’s backing up ZFS?
So looking at what you’re doing, it seems like you’re backing up the latest ZFS snapshot only and letting Restic/Rustic handle the rest. Is that correct? When looking at the snapshot structure, it looked like it’d be pretty easy to just take a snapshot and back that up like a normal filesystem, but I wasn’t sure if there were any gotchas. Looks like there aren’t, though.
“store deviceID only for hardlinks to reduce metadata changes for example when using btrfs subvolumes. Will be removed in a future restic version after repository format 3 is available”
This is exactly what fixes the metadata duplication problem due to changing device ids. The only reason why it is not enabled by default is that it would result in storing duplicate metadata for all repositories once.
@kapitainsky That’s still not correct. This is like the fourth or fifth time I have to tell you that the device id was never used for detecting changed files. So please stop repeating it over and over again.
Hardlinks are properly identified using the device ID + inode, except in some NFS-related weird corner cases. That's 100% accurate.
As I was interested in using ZFS snapshots I have followed this (still open) issue for some time:
The root cause was identified as the device ID changing between different ZFS or BTRFS snapshots, with various creative solutions suggested, like --force-device-id or --map-device-id, but without a final conclusion yet.
The changing device ID causes the duplicate metadata (the generated tree blobs change). But the change detection (which only exists for files, not folders) during the backup does not use the device ID.
Looking at it from the perspective of what happens during a backup:

- The archiver traverses the backup directories, and before backing up a file it checks whether certain metadata has changed (the device ID is not considered here). The change detection never looks at directories.
- When all files in a directory have been backed up, the corresponding tree blob is generated. Note that the metadata of a file is always updated, even if its content doesn't change; this is necessary, for example, to correctly handle changed file permissions. If the tree blob is exactly like that from the previous backup, it gets deduplicated. Otherwise, for example with a changed device ID, the tree blobs won't match and are saved again.
- The feature flag just drops the device ID for all files that are not hardlinked. As a result, if there are no hardlinks, the tree blobs won't differ any more and get properly deduplicated again.
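To make the dedup mechanics concrete, here is a toy illustration — this is not restic's actual tree blob format, just the principle: a blob is identified by the hash of its serialized content, so any field that changes between runs defeats deduplication.

```shell
#!/bin/sh
# Toy illustration (NOT restic's real tree format): serialize a directory
# entry with and without a device ID field and hash it. A device ID that
# changes per snapshot mount changes the hash, so the "tree blob" gets
# stored again; dropping the field keeps the hash stable.
set -eu

meta_with_dev()    { printf 'name=%s mode=0644 mtime=%s dev=%s' "$1" "$2" "$3"; }
meta_without_dev() { printf 'name=%s mode=0644 mtime=%s' "$1" "$2"; }

# Same file, two snapshot mounts with different device IDs (41 vs 42):
H1=$(meta_with_dev file.txt 1700000000 41 | sha256sum | cut -d' ' -f1)
H2=$(meta_with_dev file.txt 1700000000 42 | sha256sum | cut -d' ' -f1)

# Same file, device ID dropped from the serialization:
H3=$(meta_without_dev file.txt 1700000000 | sha256sum | cut -d' ' -f1)
H4=$(meta_without_dev file.txt 1700000000 | sha256sum | cut -d' ' -f1)

echo "with dev id:    $H1 vs $H2"   # hashes differ -> blob saved again
echo "without dev id: $H3 vs $H4"   # hashes match  -> deduplicated
```

The real feature flag applies the same idea selectively: the device ID is kept only where it is actually needed, i.e. for hardlink detection.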