Constantly getting errors, uploading from 2 hosts

I’ve been running Restic without issue for a while, but after adding a second computer that backs up to the same Backblaze repo, I’ve started getting errors.

My Setup
Two Linux computers running Arch, with a systemd timer unit set to run the following commands hourly:

# Make sure restic isn't already running
if [[ -n $(pgrep 'restic' | grep 'restic backup') ]]; then
    echo 'restic is already running...' 1>&2
    exit 0
fi

source /etc/restic/.env
export RESTIC_CACHE_DIR=~/.cache/restic

set -e
set -v

restic unlock
restic backup / --exclude-file /etc/restic/excludes --tag scheduled --no-scan
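
A side note on the guard at the top of that script: pgrep prints only PIDs by default, so piping it into grep 'restic backup' can never match, and the check always falls through. Below is a minimal sketch of a host-local guard using flock(1) from util-linux instead. The helper name run_exclusive and the lock file path in the usage comment are made up for illustration, and this only prevents overlap on one host; it does nothing about the other machine.

```shell
#!/usr/bin/env bash
# Sketch only: run a command while holding an exclusive, non-blocking
# file lock. run_exclusive is a hypothetical helper, not part of restic.
run_exclusive() {
    local lockfile=$1
    shift
    (
        # Try to lock file descriptor 9 without waiting; if another
        # process on this host already holds it, skip this run.
        if ! flock -n 9; then
            echo 'restic is already running...' >&2
            exit 75
        fi
        # The lock is held for as long as this command keeps fd 9 open.
        "$@"
    ) 9>"$lockfile"
}

# Example call (path is an assumption):
# run_exclusive /run/lock/restic-backup.lock \
#     restic backup / --exclude-file /etc/restic/excludes --tag scheduled --no-scan
```

The function returns 75 when the lock is already held, so the caller can tell "skipped" apart from a restic failure; otherwise it returns the exit status of the wrapped command.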

One of the machines runs on the hour, while the other runs at X:30. This is in order to make sure that an exclusive lock can be created for the following command, which only runs on one of the computers:

restic forget --prune --keep-hourly 24 --keep-daily 7 --keep-weekly 4 --keep-monthly 12 --quiet
restic check --with-cache

The Issue
The second computer was only added recently, and I’m not sure whether it’s the cause, but I’ve been seeing the following errors every couple of days:
Fatal: unable to load a tree from the repository: ReadFull(<data/c06684ba17>): Key not found
Or
id 5cb5... not found in repository

Once I run restic repair snapshots --forget, the error goes away until it returns again.

I’d appreciate any help on:

  • Figuring out what might be causing these issues and how to fix it
  • Recommendations or best practices for backing up from multiple hosts (if there are any)

Thanks!

I don’t have any good suggestions, but I would recommend making it clear how you’re using the B2 storage: are you using the b2: method or the s3: method?

That said, I use the s3: backend myself (only a single machine though, not two like you) and I’ve not encountered such errors.

@kakkun which restic version are you using?

Version 0.16.4.

I also realized some mistakes with my setup, including forgetting to stop a timer, which meant that two restic processes were running simultaneously from the same host.

Hopefully, this will fix the issues I’m seeing.

That still shouldn’t cause problems.

The most likely cause is that the repository locking somehow does not work as expected. (Are there any other calls to restic unlock besides the one in the first script?) prune and check require an exclusive lock that prevents concurrent operations from running.

Which command returned that error? That is normally only possible if some command runs concurrently with prune, which should be prevented by the locking mechanism in restic.
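
To illustrate the point about locking, here is a rough sketch of how the maintenance step could wait for the exclusive lock instead of depending on the timers never overlapping. It assumes the global --retry-lock option, which as far as I know is available from restic 0.16.0 onward, so please check restic help on your version first. The function name maintain_repo and the default wait of 15m are made up for illustration. It deliberately contains no restic unlock call, so nothing outside restic's own locking clears locks before prune runs.

```shell
#!/usr/bin/env bash
# Sketch only: forget/prune and check that wait for the repository lock.
# maintain_repo is a hypothetical helper; --retry-lock is assumed to be
# supported by the installed restic version.
maintain_repo() {
    local wait_for=${1:-15m}   # how long to keep retrying for the lock

    # Same retention policy as in the original post; check only runs
    # if forget --prune succeeded.
    restic forget --prune --retry-lock "$wait_for" \
        --keep-hourly 24 --keep-daily 7 --keep-weekly 4 --keep-monthly 12 --quiet &&
    restic check --with-cache --retry-lock "$wait_for"
}
```

Calling maintain_repo 30m from the timer on the one host that does maintenance would make prune wait up to 30 minutes for a running backup to finish rather than relying on the X:00 / X:30 offset.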