We’re developing a small tool, ENACrestic: a simple Qt app that automates running a restic backup every X minutes and a restic forget every Y backups. It runs on Ubuntu.
A concern we have is that when everything is automated behind the scenes, the user may simply shut down (or suspend, or …) their computer while a backup or a forget is running, leaving a lock behind.
We’re thinking about scenarios to solve this transparently.
One scenario would be to run restic unlock whenever we hit this issue, and then try again.
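A minimal sketch of that unlock-and-retry scenario, assuming a hypothetical wrapper around the restic CLI (the run_restic helper and the exact error-text matching are our own assumptions, not ENACrestic code):

```python
import subprocess

def run_restic(args):
    """Run restic with the given arguments; returns (exit_code, stderr)."""
    proc = subprocess.run(["restic"] + args, capture_output=True, text=True)
    return proc.returncode, proc.stderr

def run_with_unlock_retry(args, runner=run_restic):
    """Try a restic command once; if it fails because the repository is
    locked, run 'restic unlock' (which only removes stale locks) and
    retry a single time."""
    code, err = runner(args)
    if code != 0 and "repository is already locked" in err:
        runner(["unlock"])  # safe: only drops locks restic deems stale
        code, err = runner(args)
    return code
```

If the second attempt still fails, the lock was live (another restic process is actually running), so the app can just reschedule the backup.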
unlock removes locks that are older than 30 minutes, or, if created by the local host, locks whose corresponding restic process no longer runs. Thus in most cases calling unlock should do everything you need. Just make sure that you use restic >= 0.10.0; for older restic versions, the timestamp in the lock file is not properly refreshed.
unlock --remove-all is somewhat dangerous, unless you can 100% guarantee that no other restic processes are accessing that repository.
You talk about the timestamp being refreshed.
Does that mean that for restic >= 0.10.0, the timestamp in the lock represents the time when the last restic internal operation was done?
So when the timestamp is older than 30 minutes, that means more than 30 minutes of inactivity by the process that created the lock?
E.g., for a restic forget that runs actively for 52 minutes, the timestamp would always be only seconds old?
As long as restic runs, it refreshes the lock file every five minutes. If a restic process fails, then the latest lock is left behind. When a lock is older than 30 minutes, then either the corresponding process has failed or the host’s clock is wrong.
The lock is absolutely essential to prevent prune and backup from running at the same time. If both commands are active at the same time, this will most likely damage the snapshot created by backup and also affect further snapshots. Thus the design choice was to play it safe and just bail if there is any other lock left.
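The refresh/staleness rule described above (refresh every five minutes, stale after 30) boils down to plain timestamp arithmetic; this sketch is illustrative, not restic internals:

```python
from datetime import datetime, timedelta, timezone

REFRESH_INTERVAL = timedelta(minutes=5)   # a running restic refreshes its lock this often
STALE_THRESHOLD = timedelta(minutes=30)   # unlock may remove locks older than this

def is_stale(lock_timestamp, now=None):
    """A lock older than 30 minutes belongs to a process that has died
    (it missed several 5-minute refreshes) -- or to a host whose clock
    is wrong, which is exactly the corner case discussed below."""
    now = now or datetime.now(timezone.utc)
    return now - lock_timestamp > STALE_THRESHOLD
```

The wide margin (30 minutes versus a 5-minute refresh) is what makes the rule tolerate a few missed refreshes without declaring a live process dead.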
In addition, until a few days ago there were some situations in which one could end up with a lock file older than 30 minutes and a still-running restic process (see Strict repository lock handling by MichaelEischer · Pull Request #3569 · restic/restic · GitHub; actually this is just a 99% fix, I’m not sure whether 100% is even possible). (restic < 0.10.0 also had a refresh bug.) Now the only remaining problem is wrong clocks. That is something against which we can do little in restic (except maybe checking timestamps in the backend?).
Unfortunately, it is far too easy to end up with clocks that are off by a few hours. Ideally, restic doesn’t break a repository in that case. I plan to eventually add some sort of automatic unlock, but probably with a much longer timeout than 30 minutes.
Dual-booting operating systems is an easy way to run into this: Windows defaults to storing local time in the BIOS clock, Linux defaults to storing UTC, and less technical users often change the time to be “correct” without even noticing the timezone is wrong.
You can still easily end up with the time off by some unknown amount.
It might be worth considering using Google’s Roughtime protocol to grab the “real” time and put it into the lock file, then use Roughtime when deciding whether automatic unlock is an option. You could still fall back to the local time if Roughtime was not available, so that there is no network dependency just to get restic to run.
It depends on how frequently this actually comes up, and if there is a common trigger.
Hi, sounds interesting. I am currently working on something equivalent; let me describe it. I would like to automate hourly backups of the /home/user directories on all hosts to a repo on a local SAN and to a backup server via sftp, which uses SSH public-key, password, and OTP authentication. I know that I could configure SSH for passwordless authentication, but OTP gives me a headache: I don’t want to deactivate it, but on the other hand I have no idea how to work it into a script started by a cron job. If you have an idea how to solve that, please let me know. Many thanks in advance,
Uli
Has there been any progress with this?
I have backup scripts which perform a backup, and if successful, perform a forget.
I currently have these lock files:
(Could “restic list lock” be updated to show who has the lock, and when it was created?)
-rw------- 1 media media 154 Mar 27 17:55 3dab1311a2ed637a4e8f57ff5b6ae863cbde128a0056e4197f1daeb46d3074d1
-rw------- 1 media media 157 Mar 30 22:22 808bb784b69e62bffce74e25634b8738e75ea63919b0fd3298b10cd7b2a5b384
Restic forget reports:
repo already locked, waiting up to 0s for the lock
unable to create lock in backend: repository is already locked by PID 9432 on HOST by HOST\user (UID 0, GID 0)
lock was created at 2025-03-30 22:22:35 (230h5m2.50427654s ago)
I’m fairly confident a 230h old lock can be safely nuked.
If a lock-refresh process were in place, and the backup refreshed the lock regularly, then a lock with an old refresh time could be safely removed.
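Regarding the wish above to see who has a lock and when it was created: a lock file’s contents are JSON, and restic cat lock <id> will print them. A sketch of pulling out the “who and when” fields (the field set here is assumed from observed lock files, so treat it as illustrative):

```python
import json

def describe_lock(lock_json):
    """Summarize a restic lock file's JSON body: who created it and when.
    Field names (time, hostname, username, pid) are assumed from observed
    lock files; adjust if your restic version differs."""
    lock = json.loads(lock_json)
    return "{username}@{hostname} (pid {pid}) at {time}".format(**lock)

# Example body, shaped like the output of: restic cat lock <id>
sample = ('{"time": "2025-03-30T22:22:35Z", "exclusive": true, '
          '"hostname": "HOST", "username": "media", "pid": 9432}')
```

Comparing the "time" field against the wall clock is how restic itself decides staleness, which is why a wrong host clock (discussed earlier in the thread) can make a live lock look old.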
Locks are already refreshed every 5 minutes by default. There’s also a special case: a restic operation can continue after its client has been suspended for a longer time, as long as its lock file still exists. I want to hold off on modifying the unlock behavior until I find time to work on the non-blocking prune (probably restic 0.20), as it will introduce further corner cases.