All commands return 404 about locks

All commands return something like:

repository 53232826 opened successfully, password is correct
List(lock) returned error, retrying after 552.330144ms: List failed, server response: 404 Not Found (404)
List(lock) returned error, retrying after 1.080381816s: List failed, server response: 404 Not Found (404)
List(lock) returned error, retrying after 1.31013006s: List failed, server response: 404 Not Found (404)
List(lock) returned error, retrying after 1.582392691s: List failed, server response: 404 Not Found (404)
List(lock) returned error, retrying after 2.340488664s: List failed, server response: 404 Not Found (404)
List(lock) returned error, retrying after 4.506218855s: List failed, server response: 404 Not Found (404)
List(lock) returned error, retrying after 3.221479586s: List failed, server response: 404 Not Found (404)
List(lock) returned error, retrying after 5.608623477s: List failed, server response: 404 Not Found (404)
List(lock) returned error, retrying after 7.649837917s: List failed, server response: 404 Not Found (404)
List(lock) returned error, retrying after 15.394871241s: List failed, server response: 404 Not Found (404)
Fatal: unable to create lock in backend: List failed, server response: 404 Not Found (404)

On the server (running rest-server), all repos are missing the locks directory. I checked a few repos, and all are in this fatal state.
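For anyone wanting to check every repo rather than a few, here is a minimal sketch that lists repositories lacking a locks directory. It assumes each repository is a direct subdirectory of the rest-server data root and can be recognised by its config file. The function name find_repos_missing_locks is made up for illustration.

```shell
# List restic repositories under a data root that lack a locks directory.
# Assumption: each repository is a direct subdirectory of the root and is
# recognisable by its "config" file; adjust the glob for nested layouts.
find_repos_missing_locks() {
    root="$1"
    for repo in "$root"/*/; do
        repo="${repo%/}"
        # Skip anything that does not look like a restic repository.
        [ -f "$repo/config" ] || continue
        # Print the repository path only when locks is absent.
        [ -d "$repo/locks" ] || printf '%s\n' "$repo"
    done
}
```

Call it with the data directory as the only argument, for example find_repos_missing_locks /path/to/data (the path is a placeholder).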

Any advice?

Please always include the complete commands and their output, as well as any relevant environment variables, for the rest-server and restic processes you run. This is standard information when describing a problem like this; without it, you’re just making other people work harder to help you. Please also include the versions of both programs.

Pardon, it didn’t occur to me to do that. I had a short moment of panic.

Almost 5 hours of debugging later, it seems that this is the fault of zfs and nfs (and zfs with itself) once again fighting, for reasons not fully determined.

I think I’m able to recover from this state, and at this point I have less reason to believe that rest-server or restic is responsible for the original situation.

rest-server seems to be at fault here. The client can be the latest Windows or Linux build (0.9.6).

I am running Docker image da93e5693693 of rest-server. At some point the locking issue set in. I could effectively fix a repo by going to the server, giving restic the repository password and running a command such as snapshots against the repo via the local: backend. After that was run, everything seemed to work just fine again.

At this time, I have left a few repos in this fatal degraded state, in case there’s any information to be gained from them. I’ll need to figure out how to get the --debug output tomorrow, though I predict there will not be anything useful in it. We would need to capture this transformation as it happens.

It should be enough to manually restore the locks directory using mkdir or something similar; just make sure that the owner and permissions match those of the other folders in the repository.
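The manual fix could look roughly like the sketch below. It assumes GNU coreutils (for the --reference option of chown and chmod) and uses the snapshots directory as the sibling to copy owner and mode from. Both are my assumptions, and restore_locks_dir is just an illustrative name.

```shell
# Recreate a missing locks directory and copy owner, group and mode
# from a sibling directory in the same repository.
# Assumptions: GNU chown/chmod (--reference), "snapshots" as the sibling.
restore_locks_dir() {
    repo="$1"
    ref="$repo/snapshots"
    # Refuse to guess permissions if the reference directory is missing.
    [ -d "$ref" ] || { echo "no reference directory: $ref" >&2; return 1; }
    mkdir -p "$repo/locks" || return 1
    chown --reference="$ref" "$repo/locks" || return 1
    chmod --reference="$ref" "$repo/locks"
}
```

Run it as a user allowed to change ownership on the repository, passing the repository path as the only argument.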

I don’t see why rest-server would delete the locks folder. There aren’t many lines of code that could delete something, and these seem to refer only to files.

I think I found the fatal problem. Various software often just ignores empty directories and does not transfer them.

Since rest-server was accessing the repository through nfs (as was I, manually), the ownership of the directories created by mkdir didn’t match up, so the 404 was probably caused by a permission error (in effect a 403) on the filesystem.
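A quick way to spot this kind of mismatch is to compare the numeric owner, group and mode of locks with those of a sibling directory. This is only a sketch: it assumes GNU stat (the -c option) and picks snapshots as the sibling, and locks_match_sibling is a made-up name.

```shell
# Report whether locks has the same owner, group and mode as a sibling.
# Assumptions: GNU stat (-c), and "snapshots" as the reference sibling.
# Returns 0 on match, 1 on mismatch, 2 if either directory cannot be read.
locks_match_sibling() {
    repo="$1"
    ref=$(stat -c '%u:%g %a' "$repo/snapshots") || return 2
    cur=$(stat -c '%u:%g %a' "$repo/locks") || return 2
    if [ "$ref" = "$cur" ]; then
        echo "ok: locks is $cur"
    else
        echo "mismatch: locks is $cur, snapshots is $ref"
        return 1
    fi
}
```

Numeric IDs are compared on purpose, since user names can map differently on the nfs client and the server.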

I think the best way to go about this is to have the restic init command create an empty file (.keep) in locks.
If this sounds good, I’ll make plans for an MR.

Please be more specific in describing the problem. What does which software attempt to do and what does it see/get instead of what it expects?