When backing up /var, I get the error >>Fatal: unable to save snapshot: nodes are not ordered got "online", last "online"<<. Unfortunately I cannot identify which file causes the problem. What does this error message mean? I could not find it anywhere on the web during my research. I am using restic v0.14.0.
I don't know how the command line and the output will help, but here they are. I tried to anonymize some settings and some of the output:
# restic -r '/backuprepository' --password-file '/keyfile' --exclude-file=/excludefile --verbose --tag manualfirst backup /
repository 6e87f144 opened (repository version 2) successfully, password is correct
created new cache in /cachedir/restic
no parent snapshot found, will read all files
load index files
start scan on [/]
start backup on [/]
scan finished in 5.235s: 85481 files, 1.401 GiB
Fatal: unable to save snapshot: nodes are not ordered got "online", last "online"
It might be that a newer version of restic has some added features, but I’m trying to stick to the Debian package unless an update is definitely required to fix the issue.
Looks like you’ve hit a bug which was fixed in 0.15. It’s the same problem, just reported differently. It’s likely you have filesystem corruption in the source data. Restic >0.14 now reports this as a non-fatal error and continues.
Hmm, interesting. I do use btrfs on the filesystem in question, too. If it is a corruption in the source data, I would be glad to know where exactly. I wonder how the guy from the bug report found the problematic file.
If restic simply continues, I know no more than I do now. And it seems that restic is still not able to create a logfile, am I right? I would prefer to know where the error is and try to solve it. Do you have any idea how to log at least abnormal states?
From what I’ve read on that thread (after realising I had commented there, despite completely forgetting the bug), they simply searched for the file in question. Look for a file or folder named "online" and you may see something strange going on there, e.g. there are two of them. You could also try test backups of individual subdirectories and narrow down the problematic file by trial and error.
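A rough sketch of that search, assuming a POSIX shell with the usual find, ls, sort and uniq. `dup_names` and `scan_dups` are made-up helper names, not anything restic ships. On a healthy filesystem a directory never lists the same name twice, so any output should point at the damaged directory. File names containing newlines would confuse it.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not part of restic.

# Print every name that occurs more than once on stdin (one name per line).
dup_names() {
    sort | uniq -d
}

# Walk a tree without crossing filesystem boundaries (-xdev) and report
# each directory whose listing contains the same entry name twice.
scan_dups() {
    find "$1" -xdev -type d 2>/dev/null | while IFS= read -r dir; do
        dups=$(ls -A1 "$dir" 2>/dev/null | dup_names)
        if [ -n "$dups" ]; then
            printf '%s: %s\n' "$dir" "$dups"
        fi
    done
}
```

With these defined, `scan_dups /var` would walk the filesystem from this thread, and a plain `find /var -xdev -name online` lists every entry with the name from the error message.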
I’m not sure what you mean by “not able to create a logfile”. With 0.14 restic considered this a fatal error, which aborts the backup (hence no more log entries). Since 0.15 it’s been downgraded to a warning, which does not interrupt the backup. It causes exit code 3, the same as other errors where restic cannot read source data.
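If the backup runs from a script, the exit status can be checked afterwards. A small sketch, assuming the exit codes restic documents for `backup` (0, 1 and 3); `explain_backup_status` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: translate the exit status of `restic backup`
# into a human-readable message.
explain_backup_status() {
    case "$1" in
        0) echo "backup completed, snapshot saved" ;;
        1) echo "fatal error, no snapshot was created" ;;
        3) echo "snapshot saved, but some source data could not be read" ;;
        *) echo "unexpected exit status $1" ;;
    esac
}
```

Calling `explain_backup_status $?` directly after the restic command would then tell a fatal abort (as in 0.14) apart from the warning case (0.15 and later).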
This is how I found out that it is located in the /var filesystem. But the backup process is so fast that I could not narrow down the location any further. It happens pretty much at the beginning of the backup, that’s the only thing I could find. And after restic aborts, it does not tell me where it stopped: the progress bar vanishes and it prints its statistics instead, not indicating anything wrong except the Fatal error message. I will try to run an offline fsck later.
By logfile I meant the earlier wish from someone to have restic write a logfile of what it does, but this does not seem to have been implemented yet. At least it could write errors and maybe warnings to some log file or log daemon, instead of overwriting its output and sending only statistics to the console besides an error message whose explanation is missing from the restic docs.
Or maybe I misunderstood - did you mean I should search for a file called ‘online’? So a node represents something to back up? I thought it was a restic-specific term that had nothing to do with a file on my system.
But in fact, I found two files named “online” in the same directory. They belong to lxc, which I don’t currently use, so it didn’t hurt in daily life. This is a surprising issue. And indeed, restic 0.15.1 now skips over this file. I will continue checking what happens in the backup later, because earlier a ‘restic check’ resulted in the same fatal message.
Filesystem bugs create all sorts of fascinating problems. I once had a problem that left many .exe files with the correct file size but actually containing the code of their neighbouring files. They even worked properly!
Thanks. I had tried redirecting stdout with 0.14.0 before posting my question, but the file did not contain what I expected: it listed none of the processed files, and since I didn’t redirect stderr, the error message was not in the log either. But it is good to know that errors go to stderr.
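For anyone who wants both streams on disk, here is a minimal sketch using plain shell redirection. `run_logged` is a hypothetical wrapper name, and the log file names in the usage line are placeholders.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command, keep stdout and stderr in separate
# files, and pass the command's exit status through unchanged.
run_logged() {
    out_log=$1
    err_log=$2
    shift 2
    "$@" >"$out_log" 2>"$err_log"
}
```

Usage would look like `run_logged backup.out backup.err restic -r /backuprepository --password-file /keyfile --verbose backup /var`. As far as I know, passing `--verbose` twice is what makes restic list each processed file on stdout, which may explain the missing file names in the single `--verbose` run.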