How to improve inclusion / exclusion?

Hi all,

After a few hours of writing inclusion/exclusion rules and reviewing their effects, I managed to get it to do what I expect, hopefully.

However, that was tedious, and I expect it to be a maintenance nightmare. As soon as I move a directory, I will have to go through tens of rules and rewrite them in batch. If I forget to do so, that data may well be missing from the next snapshots.

What would have helped me a lot is a way to place local directives: a special file that can be put in each directory where it makes sense, containing a set of rules scoped to that directory.

Ideally, this file would have sections with identifiers like [local-HDD], [remote-S3], etc., so that when restic parses it, it applies the rules from the section matching its current profile, as specified with a switch. There are, for instance, big files that I back up to a local HDD but never send to the remote copy, and I suppose this is a common pattern.

Having these files located alongside the data in the filesystem would be very convenient, both for writing them (scoped locally) and for moving directories around without the risk of them going missing from the next snapshot (a failure that, unfortunately, is quite likely to happen some day with my current setup).

If it is not possible already, this is a feature suggestion. :grinning:

Regards

You probably read this. I think restic can’t (yet) do what you’re looking for but I agree that having pages of in/excludes will lead to problems eventually. Complexity is our enemy.

What you could do is place .no-cloud-backup marker files in the directories you don’t want in one target, and differently named marker files in the directories you don’t want in the other target. Additionally, you can exclude combinations of file types and sizes.
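As a rough illustration of the marker-file approach, here is a minimal Python sketch that builds one restic command line per target. It relies only on the restic options --exclude-if-present and --exclude-larger-than. The repository locations, the second marker name (.no-local-backup), the size limit, and the helper names are all hypothetical placeholders, not something from this thread.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Profile:
    repo: str                       # restic repository location (hypothetical)
    marker: str                     # a directory containing this file is skipped
    max_size: Optional[str] = None  # optional limit for --exclude-larger-than


# Hypothetical profiles; adjust repositories, marker names, and limits.
PROFILES = {
    "local-HDD": Profile(repo="/mnt/backup-hdd/restic", marker=".no-local-backup"),
    "remote-S3": Profile(
        repo="s3:https://s3.example.com/my-bucket",
        marker=".no-cloud-backup",
        max_size="500M",
    ),
}


def backup_command(profile: Profile, sources: List[str]) -> List[str]:
    """Return the argv for a restic backup run using this profile."""
    cmd = ["restic", "-r", profile.repo, "backup",
           "--exclude-if-present", profile.marker]
    if profile.max_size:
        cmd += ["--exclude-larger-than", profile.max_size]
    return cmd + list(sources)


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed.
    for name, profile in PROFILES.items():
        print(name, "->", " ".join(backup_command(profile, ["/home/me"])))
```

The markers travel with the directories when they are moved, but they remain exclusion-only: nothing here guarantees that a directory stays included.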

One general tip from me: you don’t do all this to make backups. You do it to be able to restore when s__t hits the fan. Imagine it does and you need to come up with a way of getting back to work with everyone running around you in circles screaming. Make sure you quickly understand where you stored what!

Thank you. Yes, I carefully read the doc, and did all that. --exclude-if-present is convenient because it is local, but it is limited in expressiveness, so it has to be combined with other centralized directives. Also, it is an exclusion directive, which does not help with ensuring that my valuable data stays in the positive filter when I reorganize files a few months down the road.

I live in a remote area with low upload bandwidth, so cherry-picking what I back up remotely is a necessity. I cannot just run restic backup / and call it a day. :grin:

I think I will mature the design a bit on my side. All suggestions are very welcome. It can start as a third-party external tool that would be used to feed --files-from. Maybe it already exists; if you know about such a tool, please let me know.
Basically, what I have in mind is a syntax to express inclusion/exclusion in local files, with patterns scoped to the directory that holds the file and deeper-first precedence when resolving the rule chain. If possible, a deeper branch could still be retained inside a filtered-out branch, to give inclusion rules the best chance of surviving when data is reorganized in the filesystem. Multiple profiles should also be possible, to address different backup policies and targets.
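To make the idea more concrete, here is a minimal Python sketch of such an external tool. Everything about the format is an assumption made up for illustration: the rule file name (.backup-rules), the INI layout with one section per profile, the include/exclude keys, and the choice that exclude beats include within one file. Patterns are matched with Python's fnmatch against the path relative to the directory holding the rule file (so * also crosses "/"), which is not the same as restic's own pattern semantics. Files matched by no rule are left out, i.e. the filter is positive.

```python
import configparser
import fnmatch
import os
import sys
from functools import lru_cache
from pathlib import Path

RULE_FILE = ".backup-rules"  # hypothetical name for the per-directory rule file


@lru_cache(maxsize=None)
def load_rules(directory: Path, profile: str):
    """Return (include?, pattern) pairs for one directory, excludes first."""
    path = directory / RULE_FILE
    if not path.is_file():
        return ()
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)
    if not parser.has_section(profile):
        return ()
    rules = []
    for action in ("exclude", "include"):  # exclude wins within one file
        for line in parser.get(profile, action, fallback="").splitlines():
            if line.strip():
                rules.append((action == "include", line.strip()))
    return tuple(rules)


def is_selected(file: Path, root: Path, profile: str) -> bool:
    """Walk from the file's directory up to root; the deepest match decides."""
    directory = file.parent
    while True:
        rel = file.relative_to(directory).as_posix()
        for include, pattern in load_rules(directory, profile):
            if fnmatch.fnmatchcase(rel, pattern):
                return include
        if directory == root or directory.parent == directory:
            return False  # no rule matched: not part of this profile
        directory = directory.parent


def select_files(root, profile: str):
    """List every file under root that the profile's rules select."""
    root = Path(root)
    selected = []
    # No pruning: a deeper rule file may re-include inside an excluded branch.
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()
        for name in sorted(filenames):
            file = Path(dirpath) / name
            if is_selected(file, root, profile):
                selected.append(file)
    return selected


if __name__ == "__main__":
    # Usage (hypothetical script name): python select.py ROOT PROFILE > list.txt
    for path in select_files(sys.argv[1], sys.argv[2]):
        print(path)
```

The resulting list could then be passed to restic backup. Since --files-from interprets its entries as patterns, --files-from-verbatim is probably the safer option for file names containing special characters. Because each rule file only refers to paths below its own directory, moving a directory moves its rules with it.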

Regards.

As @nicnab said, too, I would keep the inclusion and exclusion simple. If my data gets lost, I basically want to have everything back. So I include my home drive and exclude files only for two reasons:

  1. files which are really not important (cache, tmp, trash)
  2. files which are large and can be downloaded or recreated

(For a full picture, I should add that I have my media files on a separate disk and I do a separate backup.)

To give some examples, here is a typical exclude file of mine on a Linux system:

# file types
*~
*.swp

# cache and trash
.cache
.local/share/Trash

# temporary locations (depends on your usage)
/home/*/Downloads
/home/*/tmp

# large and not from me
some-external-git-repo
some.iso
# there could be virtual environments or .local/lib, too

Thank you.
However, as I wrote above, uploading everything is sometimes not an option. In such circumstances, the key is to define precisely what must and must not be backed up remotely, while avoiding the pitfall of losing data when a later filesystem reorganization makes the rules obsolete.