Backup from zfs snapshot

I won’t be adding restic commands, as restic is working just fine for regular backups, but I am looking for a way to back up my use case. By the way, this is a home server, so there is no disk array, second host, or anything enterprise level. Even if you don’t know the topic, it might be worth reading, as you might learn something new about jails and ZFS; otherwise you can skip to the TLDR.

I am using a FreeBSD feature called jails, a very old and mature feature (25 years in production) that, for the sake of understanding, you can compare with Docker, but with the internal file system fully exposed to the host: it is just a sub-directory (in my case a ZFS file system).

The jails reside in /usr/jails/jail-name (I will use the jail name test in all further explanation), which is the root of the jail. So /usr/jails/test/etc, /usr/jails/test/dev, /usr/jails/test/usr,…

As an optimization, there are so-called thin jails, or read-only base jails.

Instead of having a full FreeBSD root inside /usr/jails/test, a separate directory /usr/jails/base is created that holds all the system files.

A read-only ZFS snapshot of base is created, and a ZFS clone of that snapshot is made on top of it. The missing jail-specific directories, such as /usr/local, are then created within the clone. You get the idea.
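The snapshot-and-clone step can be sketched roughly as follows. This is only an illustration: it assumes the zroot/usr/jails/... dataset layout, the snapshot name `release` and the helper `thin_jail_commands` are made up for the example, and the function only builds the `zfs` command lines instead of running them.

```python
def thin_jail_commands(jail, base="zroot/usr/jails/base", snap="release",
                       parent="zroot/usr/jails"):
    """Return the zfs commands that would create a thin jail from the base.

    Dataset names and the snapshot name are assumptions for illustration.
    """
    snapshot = f"{base}@{snap}"
    return [
        # Freeze the base system; ZFS snapshots are always read-only.
        ["zfs", "snapshot", snapshot],
        # The clone starts out sharing every block with the snapshot.
        ["zfs", "clone", snapshot, f"{parent}/{jail}"],
    ]


for cmd in thin_jail_commands("test"):
    print(" ".join(cmd))
```

Creating the jail-specific directories inside the clone is left out here, since how that is done depends on the jail setup.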

This has several huge benefits:

  • de-duplication of identical system files between jails: a jail can be in the range of MBs, while a full system is in the range of GBs
  • the system is fully protected from changes
  • very fast creation of new jails (no need to copy the whole base system)
  • each jail’s specifics are isolated in its clone; for example, a jail that handles email only has the postfix/dovecot binaries, the mail, and its configuration
  • fast upgrades: all jails are upgraded to a new system version by upgrading only the base

Now, /usr/jails/test is a ZFS file system, so it can snapshot its data. As with every other ZFS snapshot, the result is exposed through the hidden .zfs “directory” at the root of the file system, in this case /usr/jails/test/.zfs (actually zroot/usr/jails/test/.zfs, but never mind).

The .zfs directory contains snapshot/, and its content is the exact state of the file system at the point in time the snapshot was taken. Since snapshots are copy-on-write, taking one is instantaneous.
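To make the layout concrete, here is a small sketch of where a snapshot’s files show up. It assumes the dataset zroot/usr/jails/test is mounted at /usr/jails/test and uses the snapshot name last_snapshot; note that on ZFS the directory under .zfs is named `snapshot`. Both helpers are hypothetical, and nothing is executed.

```python
from pathlib import PurePosixPath


def snapshot_dir(mountpoint, snapshot_name):
    """Path under which a ZFS snapshot's files are visible."""
    return PurePosixPath(mountpoint) / ".zfs" / "snapshot" / snapshot_name


def snapshot_command(dataset, snapshot_name):
    """Command line that would take the snapshot (not executed here)."""
    return ["zfs", "snapshot", f"{dataset}@{snapshot_name}"]


print(" ".join(snapshot_command("zroot/usr/jails/test", "last_snapshot")))
print(snapshot_dir("/usr/jails/test", "last_snapshot"))
```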

Now to restic.

If I backed up /usr/jails/test with restic directly, there would be a huge disadvantage: the base system would be transferred into the backup in full, for each jail, producing GBs of data in the repository instead of MBs (2-4GB/jail vs 100-800MB/jail). Backing up the base is completely unnecessary, as it is fully reproducible from a FreeBSD installation.

There are a few options I have:

  • meticulously tracking the changes to the base on each upgrade and extending the exclude list accordingly; I don’t like this, as it is error-prone
  • stopping the jail every time a backup is made, so that only non-base content is exposed; this takes time, as the virtualized network stack needs to be torn down and re-established, which makes a zero-downtime setup impossible
  • creating a snapshot of /usr/jails/test; since this doesn’t include the base jail files, the snapshot contains only the content of the jail

Now, I would prefer the third option, as it is fast, reliable, and only backs up the needed files, regardless of what changed in the base system.

TLDR:

What I have an issue with is that I would like to restore to the original file location (disaster recovery). For this I would need to give restic the path of a snapshot as the backup source (/usr/jails/test/.zfs/snapshot/last_snapshot), but have it stored in the backup under the path the snapshot was taken of (/usr/jails/test), for fast recovery.

How to achieve this?

If I understood correctly, you want to back up the path /usr/jails/test/.zfs/snapshot/last_snapshot but have it appear as though you backed up /usr/jails/test?

In other words, you want to change what appears in the “paths” portion of the restic snapshot metadata?

If so, this unfortunately isn’t currently possible from within restic. The feature request for it is tracked in restic/restic issue #2714, “Add an option to restic backup to override the snapshot's "paths" field”.

Others have worked around this by using chroot (e.g. the thread “Creating BTRFS snapshot before runnning restic backup”), but that obviously requires more effort.

One small aside, regarding the base system being transferred in full for each jail:

It wouldn’t be quite that bad. Assuming you’re backing up multiple jails to the same restic repository, the base system would be identical between the jails, and would de-duplicate fully.
So you would end up with one large 2-4GB initial snapshot for the first backup of the first jail, but then only the data individual to each additional jail would be added after that.
Obviously this is still worse than not backing up the base system, but not as bad as a copy of the base system per-jail 🙂
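As a back-of-the-envelope illustration of that difference, here is a rough sketch. The jail count and the exact sizes are made-up values within the ranges quoted above, the function name is hypothetical, and compression, metadata, and later changes are ignored.

```python
def repo_size_mb(jails, base_mb, per_jail_mb, deduplicated):
    """Rough repository size for `jails` full-tree backups.

    Ignores compression, metadata and later changes; numbers are illustrative.
    """
    # With de-duplication the identical base is stored once, not per jail.
    base_copies = min(jails, 1) if deduplicated else jails
    return base_copies * base_mb + jails * per_jail_mb


# Example: 5 jails, 3000 MB base, 500 MB of jail-specific data each.
print(repo_size_mb(5, 3000, 500, deduplicated=False))  # 17500
print(repo_size_mb(5, 3000, 500, deduplicated=True))   # 5500
print(repo_size_mb(5, 0, 500, deduplicated=True))      # 2500, base excluded
```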

Hmm, I see. Is there some reason why this hasn’t been implemented yet? I haven’t checked restic’s implementation yet, but in theory it would just require replacing the include path with a replacement path for everything that is backed up from that include path, possibly even using a regexp. Did the project owners reject it for some legitimate reason, or has it just not been implemented?

I am asking because I am sure I can implement it, but I won’t even try if it won’t be accepted into the source tree for architectural or other reasons. I was a long-term Omniback/Data Protector (now something else) developer 🙂

If I remember correctly (and from skimming the comments in #2714), there was some debate about what the solution should look like, and whether rewriting the paths really was the best way to solve the problem. There were also developer concerns that allowing arbitrary paths to be set would be problematic, so perhaps stripping the leading paths but keeping the same relative structure might be better.

That eventually led to a discussion on whether it would be better to move away entirely from using paths as part of what uniquely identifies each snapshot dataset, which in turn led to the suggestion/discussion in restic/restic issue #4026, “Let's name our snapshots and define them collectively as a 'backup-set' - a means of clearly identifying the 'what', with the 'where' and 'when' in our restic repositories”.

Some discussion and an implementation for rewriting the paths can be found in the closed restic/restic PR #3200 by aawsome, “backup: Add options --set-path and --set-path-from”.

I think it would be worth adding your use case to the discussion in #2714, primarily because if I understood correctly, your use case would not be satisfied by the “strip-prefix” approach discussed/proposed previously? (which appears to be on the roadmap for 0.18/0.19).

As to whether writing a PR/implementation is worth your time, given the blocker seemed to be deciding on the best approach, I would personally recommend getting feedback before committing any development time. Although equally I suppose it might not be too much work to fork/resurrect #3200 and rebase it, which would give you a solution you could use “right now”.

Yes, just stripping the prefix wouldn’t cut it; I need more of a “replace prefix”. I have been looking at the code for 20 minutes 🤪, it is well written and the VSS support was a pleasant surprise. Now investigating tree usage…
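To spell out the difference between the two semantics, here is a minimal sketch in Python. It is not restic code and says nothing about how restic would implement it; the function names and the example file path are hypothetical.

```python
from pathlib import PurePosixPath


def strip_prefix(path, prefix):
    """Drop `prefix`, keeping only the relative structure below it."""
    return PurePosixPath(path).relative_to(prefix)


def replace_prefix(path, old, new):
    """Re-root `path` from `old` onto `new`, keeping the relative structure."""
    return PurePosixPath(new) / PurePosixPath(path).relative_to(old)


SNAP = "/usr/jails/test/.zfs/snapshot/last_snapshot"
src = SNAP + "/usr/local/etc/postfix/main.cf"  # hypothetical file in the jail

print(strip_prefix(src, SNAP))
print(replace_prefix(src, SNAP, "/usr/jails/test"))
```

Stripping alone leaves a relative path with no record of where it belongs, while replacing the prefix yields the original absolute location under /usr/jails/test, which is what a restore in place would need. Both helpers raise `ValueError` if the path is not below the given prefix.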