Looking for a little advice on the best way of handling this…
I may or may not (I don’t know) have accidentally deleted some important files.
I have good backups and can restore, but given it’s a large amount of data, I didn’t know if there was an easy way to compare the local directory against the last backup and show anything that is in the backup but isn’t present now.
There haven’t been many (if any) on purpose changes since the last backup, so if there are differences, I’ll know what I need to recover.
Only thing I’ve really thought of so far is running another snapshot and doing a diff between them.
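That “second snapshot plus diff” idea maps onto restic’s real diff command. Below is a rough sketch: the restic calls themselves need a configured repository, so they are shown as comments, and “1a2b3c4d” is a placeholder for the last known-good snapshot ID. The sample text stands in for real restic diff output, which prefixes removed paths with “-”, added with “+”, and modified with “M”.

```shell
#!/bin/sh
# Sketch of the "second snapshot" idea. These calls need a configured
# repository (RESTIC_REPOSITORY etc.), so they are comments here;
# "1a2b3c4d" is a placeholder for your last known-good snapshot ID.
#
#   restic backup /data                      # snapshot the current state
#   restic diff 1a2b3c4d latest > diff.txt   # compare old vs. new

# Sample text standing in for diff.txt (restic prefixes removed paths
# with "-", added with "+", modified with "M"):
sample_diff='-    /data/photos/2021/img001.jpg
M    /data/notes/todo.txt
+    /data/tmp/scratch.txt'

# Files that were in the backup but are gone locally:
missing=$(printf '%s\n' "$sample_diff" | grep '^-' | awk '{print $2}')
echo "$missing"
```

Filtering the diff output for the “-” lines gives exactly the list you describe: things present in the backup but missing now.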
@nett I don’t know, but restic restore --dry-run --delete --overwrite if-changed may produce the result you are looking for (you may need to add some -v to get more verbose output)
Besides this, you can use rustic to do a diff SNAP:/my/path /local/path until this functionality gets added to restic.
I like the idea, but unfortunately, that doesn’t work: restic restore --dry-run --delete --overwrite if-changed -vv
Just lists all files as restored
I will check rustic soon! The comparison page looks promising, although I’m a bit unhappy with the test coverage, and it doesn’t support --files-from
I don’t use it directly with restic; I created a shell script that defines a new environment variable: RESTIC_FILES_FROM_FILE=$HOME/.config/restic/paths.list
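This is not the poster’s actual script, just a minimal sketch of what such a wrapper could look like. Only the variable name and the paths.list location come from the post; everything else is an assumption. --files-from is a real restic flag.

```shell
#!/bin/sh
# Hypothetical wrapper in the spirit described above: the path-list
# location lives in one environment variable and is handed to restic
# via --files-from. The default location matches the post.

RESTIC_FILES_FROM_FILE="${RESTIC_FILES_FROM_FILE:-$HOME/.config/restic/paths.list}"

# Build the argument list; any extra arguments pass straight through.
set -- backup --files-from "$RESTIC_FILES_FROM_FILE" "$@"
cmdline="restic $*"
echo "would run: $cmdline"
# exec restic "$@"    # uncomment to actually run the backup
```

Keeping the list location in one variable means every script and cron job agrees on which file defines the backup set.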
About --files-from see my post Backup family photos and videos - #6 by alexweiss.
My personal advice is to back up paths using includes/excludes instead - at least in most cases. IMO, --files-from is only relevant for edge cases.
It looks like the primary means of including filesystem paths in restic. For example: I want to back up some files from home, but I don’t want to include the whole home directory, as there are sensitive keys in configs, and I don’t want to exclude the keys manually; I would rather list the files in a text file and provide the list with --files-from. Could you advise, did I miss something?
First, if you don’t trust restic to backup sensitive information, you shouldn’t use it.
And yes, of course you can use it in every supported way you want.
I was talking about best practices. It’s a bit like someone saying “Hey, why should I always remove the bullets from the gun? As long as I don’t pull the trigger, nothing’s going to happen!” (and this guy is objectively totally right, but still totally wrong from the best-practice point of view).
In your case this typically goes like this: you carefully maintain your files-from list, check that backups are running, and do restore tests. Everything is fine, and after a couple of weeks/months/years you realize everything is running smoothly. You can focus on other things and just check from time to time that your backup runs completed. Then, after some time passes, your hard drive crashes, but you are able to successfully restore your backup. And then you realize that you forgot to add to files-from that one recently added file/path you are now missing badly…
For backups, it usually hurts much more to miss something needed in the backup than it hurts to have something unneeded in the backup. That’s why I recommend working with excludes instead of includes (and files-from is a kind of include).
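The exclude-based approach could look something like the sketch below. --exclude-file is a real restic flag; the patterns themselves are only examples, not a vetted list.

```shell
#!/bin/sh
# Sketch of the exclude-based approach: back up all of $HOME and keep a
# single exclude file for caches and junk. The patterns are examples only.

cat > /tmp/restic-excludes.txt <<'EOF'
.cache
**/node_modules
**/.terraform
*.tmp
EOF

# The actual backup run (needs a configured repository):
#   restic backup "$HOME" --exclude-file=/tmp/restic-excludes.txt

n=$(wc -l < /tmp/restic-excludes.txt)
echo "$n exclude patterns written"
```

With this shape, a newly created file under $HOME is backed up by default; you only have to maintain the (much shorter and lower-stakes) list of things you never want.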
Thank you Alex, that’s solid, I need to think about it. I’m still learning the backups, and it’s deep.
The justification for the ‘whitelist’ backup approach is that I try to use the ISO Information Classifier, as I describe in the Implementation - Common sense chapter.
The problem is that some metadata just freaks me out. For example, there is a /home/user/.kube dir that holds Kubernetes certificates and caches - effectively a ‘magic wand’ that bypasses every protection on the cluster. Moreover, tools like Terraform create a ‘state dir’, etc. etc.
I do trust restic with Confidential data, but those metadata/configs are SC.
First: “SC” as in “Super Critical”, or what? If they are super critical, how do you plan on backing them up, if not with Restic? And, more to the point, why NOT with Restic?
Second: I also totally agree with using explicit excludes (for caches, temporary files and “junk not worthy of backup”) instead of explicit includes like you seem to like.
SC - Strictly Confidential (the highest category defined in ISO): passwords, tokens, keys, certificates.
how do you plan on backing then up
That’s a very good question; short answer: I don’t. For example:
hard drive failure
$HOME/.kube dir was lost
The recovery will require logging in to the cloud provider (GCP, for example), passing 2FA, requesting cluster info from the console, and authorizing a new client. That is a long, difficult and pretty painful process that unlocks all the security gates once again. But it is much better than having those credentials sit in a backup.
I’m not defending the point, I’m learning. Backing up the entire system is a viable approach - it would be handy in incident investigations. But over the past decade I haven’t performed any investigations on personal computers, so it feels redundant.
I need to invent some safeguard to prevent SC data from leaking into the backup. I’m not sure how to do that, as modern tools tend to put credentials all over the place.
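One possible safeguard, sketched below: a pre-backup scan of the tree for well-known credential locations, refusing to run when anything matches. The pattern list is illustrative, not exhaustive, and the demo tree is throwaway data.

```shell
#!/bin/sh
# Sketch of a pre-backup safeguard: scan the tree about to be backed up
# for well-known credential locations. The name patterns are examples.

scan_for_secrets() {
    # Print every path under $1 matching a known-sensitive name.
    find "$1" \( -name '.kube' -o -name '*.tfstate' \
                 -o -name 'id_rsa' -o -name '*.pem' \) -print
}

# Demo on a throwaway tree:
demo=$(mktemp -d)
mkdir -p "$demo/.kube"
touch "$demo/terraform.tfstate" "$demo/notes.txt"

hits=$(scan_for_secrets "$demo")
if [ -n "$hits" ]; then
    echo "refusing to back up, sensitive paths found:"
    echo "$hits"
    # exit 1   # in a real wrapper, abort before calling restic
fi
rm -rf "$demo"
```

A backup wrapper could run this check first and only invoke restic when it prints nothing; the same patterns could double as restic excludes.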
@nett, you came here for advice. We have given you a recommendation based on what we firmly believe is best practice, and on the fact that Restic backups are always encrypted. I don’t know what else we can do.