Error message when backing up

Hi,

When backing up /var, I get the error >>Fatal: unable to save snapshot: nodes are not ordered got "online", last "online"<<. Unfortunately I cannot identify which file causes this problem. What does this error message mean? I could not find it online during my research. I am using restic v0.14.0.

Please post the command line you’re using and the output from restic. Also, you may wish to upgrade, as there have been many improvements since 0.14.

Hi,

I don’t know how the command line and the output will help, but here they are. I tried to anonymize some settings and some output:

# restic -r '/backuprepository' --password-file '/keyfile' --exclude-file=/excludefile --verbose --tag manualfirst backup /
open repository
repository 6e87f144 opened (repository version 2) successfully, password is correct
created new cache in /cachedir/restic
lock repository
no parent snapshot found, will read all files
load index files
start scan on [/]
start backup on [/]
scan finished in 5.235s: 85481 files, 1.401 GiB
Fatal: unable to save snapshot: nodes are not ordered got "online", last "online"

It might be that a newer version of restic has some added features, but I’m trying to stick to the Debian package unless an update is definitely required to fix the issue.

Looks like you’ve hit a bug which was fixed in 0.15. It’s the same problem but reported differently. It’s likely you have filesystem corruption in the source data. Restic versions after 0.14 report this as a non-fatal error and continue.

Hmm interesting. At least I use btrfs too on the filesystem in question. If it is a corruption in the source data, I would be glad to know where exactly. I wonder how the guy from the bug report found the problematic file.

If restic simply continues, I know no more than I do now. And it seems that restic is still not able to create a logfile, am I right? I would prefer to know where the error is and try to solve it. Do you have any idea how to log at least abnormal states?

From what I’ve read on that thread (after realising I had commented there, despite completely forgetting the bug :slight_smile:), they simply searched for the file in question. Look for a file or folder named “online” and you may see something strange going on there, e.g. there are two of them. You could try performing a test backup with some trial and error to locate the problematic file.
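One way to look for such an entry without repeated test backups is to list every path below the suspect directory and print the ones that show up more than once. This is only a sketch: the function names `duplicate_lines` and `list_duplicate_paths` are made up for illustration, it assumes no file name contains a newline, and it only helps if the filesystem really returns the duplicate entry when the directory is listed.

```shell
#!/bin/sh
# Hypothetical helpers, not part of restic.

# Print each input line that occurs more than once (one copy per duplicate).
duplicate_lines() {
    LC_ALL=C sort | uniq -d
}

# Walk the tree below "$1" and print every path that find reports twice.
# Errors from unreadable directories are discarded.
list_duplicate_paths() {
    find "$1" -print 2>/dev/null | duplicate_lines
}
```

Running `list_duplicate_paths /var` as root prints nothing when no path is listed twice; any line it does print is a candidate for the entry restic complains about.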

I’m not sure what you mean by “not able to create a logfile”. With 0.14 restic considered this a fatal error, which aborts the backup (hence no more log entries). Since 0.15 it’s been downgraded to a warning, which does not interrupt the backup. It causes exit code 3, the same as other errors where restic cannot read source data.
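For scripted backups, the exit status can be checked after the run. The sketch below covers only the statuses relevant to this thread (0 for success, 1 for a fatal error, 3 for a snapshot created although some source data could not be read); `describe_backup_status` is a hypothetical helper name, and newer restic releases may define further codes that end up in the catch-all branch.

```shell
#!/bin/sh
# Hypothetical helper: turn the exit status of "restic backup" into a short message.
describe_backup_status() {
    case "$1" in
        0) echo "ok" ;;
        1) echo "fatal: no snapshot created" ;;
        3) echo "incomplete: some source data could not be read" ;;
        *) echo "other exit status: $1" ;;
    esac
}
```

Called as `describe_backup_status $?` directly after the backup command, this lets a cron job or wrapper script tell a skipped file apart from a backup that did not run at all.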

This is how I found out that it is located in the /var filesystem. But the backup runs so fast that I could not narrow down the file location any further. It happens pretty much at the beginning of the backup; that’s the only thing I could find. After restic aborts, it does not tell where it stopped: the progress bar vanishes and it prints its statistics instead, not indicating anything wrong except the Fatal error message. I will try to run an offline fsck later.

By logfile I meant the earlier wish from someone to have restic write a logfile of what it does, but this seems not to be fulfilled yet. At least it could write errors, and maybe warnings, to a log file or log daemon instead of overwriting its output and sending only statistics to the console alongside an error message whose explanation is missing from the restic docs.

Or maybe I misunderstood - did you mean I should search for a file called ‘online’? So it means a node represents something to back up? I thought it was a specific restic idiom and has nothing to do with a file on my system.

But in fact, I found two files named “online” in the same directory. They come from lxc, which I don’t currently use, so it didn’t hurt in daily life. This is a surprising issue. And indeed, restic 0.15.1 now skips over this file. I will continue checking what happens in the backup later, because earlier a ‘restic check’ resulted in the same fatal message.

Filesystem bugs create all sorts of fascinating problems. Once I had problems that resulted in many .exe files having the correct file size but actually containing the code from their neighbouring files. They even worked properly!

Usually it will quote the full pathname of a file that causes it problems. When I read your first post, I didn’t realise “online” was a file name either.

Restic prints non-error text to stdout and errors to stderr, so if you want to send output and errors to a logfile you can use shell redirection. The --quiet switch suppresses all output except for errors.
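A minimal sketch of such a redirection, written as a small wrapper so that stdout and stderr end up in separate files and the exit status of the command is preserved. The name `run_logged` and the log file paths in the usage note are assumptions for illustration, not anything restic provides.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command, append its stdout to "$1" and its
# stderr to "$2", and return the command's own exit status.
run_logged() {
    out_log=$1
    err_log=$2
    shift 2
    "$@" >>"$out_log" 2>>"$err_log"
}
```

For example, `run_logged /var/log/restic.out /var/log/restic.err restic -r /backuprepository --password-file /keyfile --verbose backup /var` would keep the normal output and the error messages apart, so a message like the "nodes are not ordered" one lands in the second file.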

Thanks, I’ve tried to redirect stdout before posting my question with 0.14.0, but the file content was not what was expected as it didn’t contain any file that was processed, and since I didn’t redirect stderr, the error message was also not in the log. But good to know that errors are in stderr :wink:

Sorry for raising this thread from the dead - but I ran into the same problem:

Fatal: unable to save snapshot: node "online", last "online": nodes are not ordered or duplicate

@alex1 noted that he is using LXD - which I used as well. Then I switched to Incus (the fork) and just found out that there might be a problem with Incus as well.

If /var/lib/incus-lxcfs is included in the backup, the error message is triggered. If I exclude it, everything is fine and the error does not occur:

restic backup --exclude='/var/lib/incus/storage-pools' --exclude='/var/lib/incus-lxcfs' /var/lib
repository 4189a229 opened (version 2, compression level auto)
using parent snapshot 8ea4b972
[0:00] 100.00%  6 / 6 index files loaded

Files:         134 new,     3 changed,  3732 unmodified
Dirs:           41 new,     5 changed,  1114 unmodified
Added to the repository: 14.912 KiB (3.909 KiB stored)

processed 3869 files, 611.048 MiB in 0:01

Just FYI, in case somebody is googling this error message :wink:

Update: On some Incus instances that have been installed from the native distro sources, the folder seems to be named /var/lib/lxcfs instead of /var/lib/incus-lxcfs.

Additional details:

The entire paths /var/lib/lxcfs or /var/lib/incus-lxcfs should be excluded from the backup.

This special file system fuse.lxcfs is regenerated or updated each time the lxcfs daemon is started and does not contain any persistent configuration or user data:

% mount | grep lxcfs
lxcfs on /var/lib/lxcfs type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)

% rclone tree --level 4 /var/lib/lxcfs
/
├── cgroup
├── proc
│   ├── cpuinfo
│   ├── diskstats
│   ├── loadavg
│   ├── meminfo
│   ├── slabinfo
│   ├── stat
│   ├── swaps
│   └── uptime
└── sys
    └── devices
        └── system
            └── cpu
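
Since the directory name differs between installations, the mount table can be used to find whatever lxcfs mount points exist on a given host. The following is a sketch under assumptions: `lxcfs_excludes` is a hypothetical helper, it expects input in the /proc/mounts format, and it matches on the filesystem type fuse.lxcfs shown in the mount output above.

```shell
#!/bin/sh
# Hypothetical helper: read mount table lines (as in /proc/mounts) on stdin
# and print one --exclude option per fuse.lxcfs mount point.
# Note: /proc/mounts escapes spaces in mount points as \040; such paths
# would need unescaping before being passed to restic.
lxcfs_excludes() {
    awk '$3 == "fuse.lxcfs" { print "--exclude=" $2 }'
}
```

On the system shown above, `lxcfs_excludes </proc/mounts` would print `--exclude=/var/lib/lxcfs`, which can then be added to the restic backup command line.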