Fatal: unable to save snapshot: node X already present

I have a NAS device that I’m trying to back up from an NFS client with restic, and I’m getting the following error:

Apr 27 01:00:32 backup-nas_1   | Running restic
Apr 27 01:00:32 backup-nas_1   | open repository
Apr 27 01:00:32 backup-nas_1   | lock repository
Apr 27 01:00:33 backup-nas_1   | load index files
Apr 27 01:00:36 backup-nas_1   | using parent snapshot 0c4473dc
Apr 27 01:00:36 backup-nas_1   | start scan on [/data/backup /data/pictures]
Apr 27 01:00:36 backup-nas_1   | start backup on [/data/backup /data/pictures]
Apr 27 01:14:59 backup-nas_1   | scan finished in 866.336s: 212436 files, 115.031 GiB
Apr 27 01:14:59 backup-nas_1   | Fatal: unable to save snapshot: node "xt_CONNMARK.h" already present

I’ve seen other posts on the forum and GitHub issues about this same error, but they don’t apply in my case: this is not a Windows machine and there are no Unicode characters in the file name.

I also checked the filesystem on the NAS and it’s fine, and there is only one copy of xt_CONNMARK.h on the NAS device:

$ find /volume1/xbmc/backup/mirror/data/ -name xt_CONNMARK.h -ls
23478083    4 -rw-------   1 admin    users         199 Jun  9 22:36 /volume1/xbmc/backup/mirror/data/usr/include/linux/netfilter/xt_CONNMARK.h

The weird thing is that if I run the same command on the NFS client, which is the machine running restic:

root@srv1:~# find /mnt/data/backup/mirror/data/ -name xt_CONNMARK.h -ls
 23478083      4 -rw-------   1 nobody   users         199 Jun  9 22:36 /mnt/data/backup/mirror/data/usr/include/linux/netfilter/xt_CONNMARK.h
 23478083      4 -rw-------   1 nobody   users         199 Jun  9 22:36 /mnt/data/backup/mirror/data/usr/include/linux/netfilter/xt_CONNMARK.h

This totally blew my mind. The client sees two files, but they’re actually the same file (note the inode, 23478083). This has apparently been a kernel bug in the past, but the kernel I’m using is 5.4.0-117, so that issue should have been fixed ten years ago, and I haven’t managed to find a newer bug like that one. Has anybody seen this before? Any help would be greatly appreciated.
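
One check I haven’t tried yet is tracing the raw directory entries the kernel hands to userspace. A sketch, assuming strace is installed (getdents64 is the syscall ls and find use to read directories):

# Trace the decoded getdents64 results while listing the directory;
# ls -f avoids sorting, and -v makes strace print every directory entry.
strace -v -e trace=getdents64 -o /tmp/dents.txt \
    ls -f /mnt/data/backup/mirror/data/usr/include/linux/netfilter/ >/dev/null
# Count how often the name appears in the decoded entries.
grep -o 'xt_CONNMARK\.h' /tmp/dents.txt | wc -l

If the count is 2, the duplicate is already in what the kernel returns, so it comes from the NFS client code or the server’s reply; if it’s 1, the duplication happens above the syscall layer.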

Why does Restic fail completely in this case? Yes, I understand that this is a bit of filesystem weirdness and Restic rightly expects files to be unique, but why not warn the user, ignore the duplicate and move on with the rest of the snapshot?


I guess we could turn the error into a warning and let restic return exit code 3 like in other cases where the source data couldn’t be read properly.

However, even then it would be a good idea to find out why the NFS share lists files twice. Is there maybe a known problem with NFS on the server side? Without looking at the NFS file listing in detail, it’s rather hard to tell whether the error originates on the server or the client.
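
One way to look at the listing in detail would be a packet capture of the directory reads, roughly like this (a sketch; the interface name and the default NFS port 2049 are assumptions about your setup):

# Capture NFS traffic while re-listing the directory, then inspect the
# READDIR/READDIRPLUS replies in Wireshark.
tcpdump -i eth0 -s 0 -w /tmp/nfs-readdir.pcap port 2049 &
ls -f /mnt/data/backup/mirror/data/usr/include/linux/netfilter/ >/dev/null
kill %1

If the duplicate name already appears in the server’s reply, the error originates on the server; otherwise the client is producing it. You may need to drop caches or remount first so the listing isn’t served from the client’s cache.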

The files on the server side look fine: ls and find only see one copy of the file, and fsck is happy.

From the client side, only one copy shows up with ls, but both find and restic see the same file twice. There aren’t any symlink loops that could be affecting this, and find doesn’t follow symlinks by default either. I don’t understand why this happens or how to debug it further. Googling didn’t turn up anything obviously related except for that old kernel bug.
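
For what it’s worth, the next things I plan to try are ruling out a stale client-side cache and toggling READDIRPLUS; a sketch (the export path in the mount command is a guess at my own setup):

# Drop the dentry/inode caches, then re-list and count the entries.
echo 2 > /proc/sys/vm/drop_caches
ls -f /mnt/data/backup/mirror/data/usr/include/linux/netfilter/ | grep -c CONNMARK

# READDIRPLUS has been implicated in NFS listing bugs before; the
# nordirplus mount option (see nfs(5)) disables it for comparison.
umount /mnt/data
mount -t nfs -o nordirplus nas:/volume1/xbmc /mnt/data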

I have exactly the same issue trying to back up an SSHFS-mounted folder on Linux, where the source is a Google Drive “Mirrored” folder located on an encrypted APFS USB drive on a macOS machine.

I am not sure whether my issue is Google Drive specific, as I believe Google Drive allows duplicate files with the same name in the same folder.

See discussion here and here.
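
A quick way to confirm whether the mount really lists some paths twice is to walk it and look for repeated identical paths (a sketch, using my mount point from the log below):

# Any output is a path that appears more than once in the directory walk;
# with Drive-style duplicate names these are byte-identical paths.
find /media/macos_usb -print | sort | uniq -d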

Notwithstanding, I agree that it would be preferable for restic not to fail outright, if at all possible.

See error below:

root@nuc:/home/user# restic backup -n -v /media/macos_usb/ --exclude-file /home/user/scripts/restic/exclude-list-restic-user.txt
open repository
repository 3db2316e opened successfully, password is correct
lock repository
load index files
no parent snapshot found, will read all files
start scan on [/media/macos_usb/]
start backup on [/media/macos_usb/]
scan finished in 75.343s: 105732 files, 494.884 GiB
Fatal: unable to save snapshot: node "AFFIDAVIT—TERMINATION LETTER AGENT.doc" already present

Okay, I simplified things and ran restic directly on the macOS machine. No error was generated.

I think the issue is related to the way the SSHFS mount presented the Google Drive files to restic; specifically, it seems that restic didn’t like the Google Drive shortcuts (created in Google Drive when you add a shared folder to My Drive).

Somehow, when you run restic directly on macOS, it is happier.

Have to say, I love restic! Thanks


As long as both/all nodes with the same name have identical contents, right?

Well, since we access files by name, the filesystem probably always returns the same file instance, given that we use the same filename both times. So we could also just skip the duplicate file without reading it at all. Verifying that the first and second instances of a file are identical would break if the file is modified concurrently.

I just feel that if we have this situation where the filesystem, for some reason, oddly and unexpectedly presents two files with the same name and the same inode, then we can’t trust that when we ask for one and the same path, we get one and the same file. If it messes up the file listings, what’s to say it doesn’t mess up other things too and perhaps present two different files with the same metadata? Hence, to be sure, with an untrustworthy filesystem we should verify.

The case where a file is modified concurrently will probably happen rather rarely, and if it does happen we could quit with a fatal error, just like now, instead of exiting with return code 3. This situation only occurs when 1) the filesystem is behaving weirdly (rare) and 2) a file affected by #1 is modified concurrently, so the frequency of both happening together should be quite low.

In other words, erring on the side of caution :octopus: