Any pointers on "error ... during ... archival" (during backup)?

I have a Python script automating backups (and forgetting/pruning) which has been working on my W10 system for years now. Every 10 minutes a recurrent task runs, creating restic snapshots in a repository on an internal drive (drive E:) and in another repository on an external drive (drive F:). Drive F: only bothers with hourly/daily/weekly/monthly backups, but drive E: also records 10-minute snapshots.

I’m getting ready to move all my machines to Linux (Linux Mint 22.2), so I’ve been adapting this script to work on both W10 and Linux. Things now seem to be working OK on the E: drive repo (in W10) … but something is stopping the F: drive snapshots from happening properly.

I can do a manual backup thus:
> restic -r "F:\Backups\restic\My documents" --verbose -p "D:\My documents\sysadmin\resources\restic\my_documents_pwd.txt" backup --tag hour "D:/My Documents"
… and it works OK.

NB when done programmatically the adding/removal of tags (“hour”, “day”, etc.) is handled after the backup completes. Snapshots with no tags eventually get forgotten/pruned.

But my recurrent script is having problems: this log message shows what parameters I’m passing to subprocess.run():

2025-12-29 16:00:06.450 - [sysadmin] - ERROR restic_backup_task [src\recurrent2\restic_backup_task.py 292]:
ERROR. restic backup.
cmd was |['D:/apps/restic/restic_0.14.0/restic', '-r', WindowsPath('f:/Backups/restic/My Documents'), '--verbose', '-p', WindowsPath('D:/My Documents/sysadmin/Resources/restic/my_documents_pwd.txt'), 'backup', '--json', '--use-fs-snapshot', WindowsPath('D:/My Documents')]|
stderr |{"message_type":"error","error":{},"during":"archival","item":"d:\\"}
stack trace   File "D:\My documents\software 

That message from stderr comes from restic.

Anyway, I’m just wondering if someone has any idea what might be causing restic to output this error: “during .. archival .. item .. d:\\”? There are a couple of posts in this forum mentioning “during .. archival”, but having examined them I don’t think they help much.

I’m wondering whether the fact that I’m using forward slashes as path separators (rather than backslash, the official separator for Windows) might have something to do with it. Previously I was using backslashes, but as part of “unifying” things for use on Linux and W10 I switched everything to forward slashes. In theory W10 should cope with this, so it seems unlikely. Also, it would surely affect snapshots to the E: drive as well, but they’re working fine.

I’m wondering what “item” .. “d:\\” might mean. Could this be some anomalous blob that’s somehow got snarled up in the F: drive repository? What might I do about that?

Later

I decided to rename the directory with the restic repo on the F: drive and initialize a new one. The first (automated, i.e. programmatic) “restic backup” to it appears to have worked OK: “restic snapshots” shows that this first snapshot has been given the tags “hour”, “day”, “week”, “month” and “year”, which is how it’s meant to work.

Later

… Aaagh … next time it had to do a snapshot and tag it “hour” (i.e. one hour later) … that same error occurred in the same circumstances (“restic backup”):

stderr |{"message_type":"error","error":{},"during":"archival","item":"d:\\"}|

… however, I can also see that there are now many snapshots without tags when I run “restic snapshots” on this F: drive repo. But that error definitely occurred on “restic backup”. The error will have caused the “tag management” part of my script not to run. The next thing to try is probably to ignore this error and carry on to the tag-management part of the script. I’d still like to have some idea of what this error means …

Please provide the information you were asked for when writing the initial post.

Also, does the error go away if you remove the --use-fs-snapshot option?

Thanks

• The output of restic version.

restic version
restic 0.14.0 compiled with go1.19 on windows/amd64

• The complete commands that you ran (leading up to the problem or to reproduce the problem).

Shown in first post: I ran (in a Python script) subprocess.run() with the following parameters:

cmd was |['D:/apps/restic/restic_0.14.0/restic', '-r', WindowsPath('f:/Backups/restic/My Documents'), '--verbose', '-p', WindowsPath('D:/My Documents/sysadmin/Resources/restic/my_documents_pwd.txt'), 'backup', '--json', '--use-fs-snapshot', WindowsPath('D:/My Documents')]|

So that list “cmd” was passed as follows:
completed_process = subprocess.run(cmd, capture_output=True, text=True, encoding='utf-8', timeout=timeout)

and an error condition was detected by finding that completed_process.stderr (a string) was not an empty string …
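(Incidentally, a sketch of a possibly more robust detection scheme, not my actual code: since the command is run with --json, restic emits structured JSON messages, so rather than treating any non-empty stderr as fatal one could parse the stderr lines and pick out only the real error messages, combined with checking completed_process.returncode.)

```python
import json

def restic_errors(stderr_text):
    # Sketch (not the script's actual code): with --json, restic emits
    # one JSON object per line; keep only messages whose message_type
    # is "error" instead of treating any non-empty stderr as a failure.
    errors = []
    for line in stderr_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            msg = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore (or separately log) non-JSON lines
        if msg.get("message_type") == "error":
            errors.append(msg)
    return errors
```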

• Any environment variables relevant to those commands (including their values, of course).

None relevant

• The complete output of those commands (except any repeated output when obvious it’s not needed for debugging).

Again shown in first post. In response to my “python backup” command, I got back

{"message_type":"error","error":{},"during":"archival","item":"d:\\"}

… this is the stderr response from the “restic backup” command (completed_process.stderr is a string). It looks like the string representation of a Python dict, but it’s actually JSON emitted by restic itself (presumably because of the --json flag), not a conversion done by me or by subprocess.run().

I believe --use-fs-snapshot makes no difference: I tested a bit yesterday. Also this causes no problems when doing “restic backup” to drive E: … only drive F: (external HD).

But am just retesting now.

Thank you :slight_smile:

I would definitely encourage you to upgrade to the latest restic version first of all - 0.14.0 is quite old by now. Can you do that?

You can either run restic.exe self-update or download the latest official release (version 0.18.1) from here: Releases · restic/restic · GitHub


Thanks, I’ll try that.

Actually, just looking at the logs for today, it seems that the problem is now intermittent. As I say, this job runs every 10 minutes … but this problem (which causes an ERROR-level log message) is not occurring every 10 minutes.

And this in turn means that, when the job runs OK, without errors, the tag-management code then runs … so when I look at the output of “restic snapshots” I’m left with a snapshot tagged “hour” only every 90 minutes or 2 hours or so.

I admit I haven’t understood everything you wrote 100%, but why don’t you just set the tags in the backup command? I don’t see the point of tagging the snapshots afterwards.

Fair question. When I was developing the script I wanted to achieve a situation where at any time I would have:

  • “hour”-tagged snapshots once hourly over the past 48 hours
  • “day”-tagged snapshots once daily over the past 14 days
  • “week”-tagged snapshots once weekly over the past 4 weeks
  • “month”-tagged snapshots once monthly over the past 12 months
  • unlimited numbers of once yearly “year”-tagged snapshots

To determine what the state of play is with each tag I first get the result of “restic snapshots” and examine things, based on the current UTC time, etc.

I then add new tags as appropriate to the latest tag-less snapshot … and remove tags from any tagged snapshots which are “obsolete” (e.g. if I have 49 “hour” snapshots, remove “hour” from the oldest one … though it might still have a “year” tag, for example).

Finally I get a list of all snapshots which now have no tags … and forget+prune them.
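Roughly, the “which tags apply now” decision looks something like this sketch (a simplified, hypothetical reconstruction, not my exact code; it assumes the relevant snapshot is the one taken at the top of the hour and treats Monday as the start of the week):

```python
from datetime import datetime

def tags_for(now):
    # Simplified sketch of the tag-selection step (not the real code):
    # every on-the-hour snapshot gets "hour"; boundary snapshots also
    # get "day", "week", "month" and/or "year" as appropriate.
    tags = ["hour"]
    if now.hour == 0:                  # first snapshot of the day
        tags.append("day")
        if now.weekday() == 0:         # Monday: first of the week
            tags.append("week")
        if now.day == 1:               # first of the month
            tags.append("month")
            if now.month == 1:         # 1 January: first of the year
                tags.append("year")
    return tags
```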

This may not be the most sensible way of managing things, but it’s what I came up with “organically”. And up to now, using “tagless” initial snapshots and managing things this way hasn’t caused any issues (and I’ve no particular reason to think this “during/archival” issue relates to it).

It seems very overcomplicated honestly. You can just perform a backup run at the most frequent interval you need, and set the tags that are appropriate for the time that it runs.

Then you just use one or more forget commands with a policy to remove the snapshots you don’t want. Come to think of it, you probably don’t even need the tags if all they’re about is time.

And finally you now and then run the prune command to get rid of data that is no longer needed in the repository.

This is a separate matter from the problem you’re currently having, so you can revisit it later.


Yes, it is a bit complicated, but nothing too hairy really.

But with your arrangements, do you have the ability to look back 5 months and see what your restic-protected location (e.g. “My Documents” and everything under it) looked like then?

If you’re only backing up every 10 minutes (or hourly) and doing nothing else, doesn’t that mean you either have to set a cut-off length for preservation of snapshots (say 2 weeks or 2 months), or else the repository grows to an enormous volume?

I can even look at the state of my files under “My Documents” 3 years ago if I want to … and sometimes this ability to look a long way back comes in useful. And yet my main “My Documents” repository at the moment is surprisingly small: 4 GB. So it’s practical to store a copy of it online by doing incremental updates, using rsync for example.

You can always look in a snapshot in a repo and see what the state of the files that you backed up at that time was, yes. That is the entire point of restic and how it’s designed.

You can think of a timeline, on which you create backup snapshots. These snapshots contain all the files that you included in your backup at that snapshot’s time.

As long as your repository is healthy, you can list these snapshots, filter them, look into them and get the files they contain.

The backed up data in the snapshots is deduplicated, which means that identical parts of files that were backed up in multiple snapshots will only be stored once in the repository, saving you space. There is also compression in restic, which can save you further space.

Regardless, once the timeline contains snapshots, you can forget snapshots by providing a “policy” to the forget command. This allows you to say that you want to keep e.g. “all hourly snapshots for the last month, one snapshot per day for the last six months, one snapshot per month for the last three years”, and so on.
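For example, that kind of policy can be expressed with restic’s documented --keep-* flags, run from Python like the rest of your script. This is only a sketch: the repo path and retention numbers are placeholders, and --keep-hourly keeps the last N hours that actually have snapshots, so it only approximates “all hourlies for a month” if you back up every hour.

```python
# Sketch only: the policy described above as restic forget flags.
# --dry-run makes restic print what it would remove without touching anything.
cmd = [
    "restic", "-r", "F:/Backups/restic/My documents", "forget",
    "--keep-hourly", "720",   # ~ all hourly snapshots for the last month
    "--keep-daily", "180",    # one per day for six months
    "--keep-monthly", "36",   # one per month for three years
    "--dry-run",
]
# import subprocess
# subprocess.run(cmd, capture_output=True, text=True)
```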

I haven’t read anything so far that suggests you need to tag your snapshots the way you do, or that you need to do this to be able to look into and retrieve files back in time.

You simply back up all you want, and then every now and then remove the things you don’t want to keep down the line.


“all hourly snapshots for the last month, one snapshot per day for the last six months, one snapshot per month for the last three years”

I think I understand. (As may be apparent :slightly_smiling_face:) I never really studied the “policy” part of the manual in depth … I probably should.

Yeah, check out: Removing backup snapshots — restic 0.18.1 documentation

Then experiment with the policies by using the snapshots command to list the snapshots you have, combined with the forget command with the different policy options and most importantly the --dry-run option to make restic only show you what it would do, without actually forgetting any snapshots.

Also note the --group-by and --tag options to the forget command, which control how you filter the snapshots and thereby which snapshots the forget command acts on.
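To see what you’re working with while experimenting, you can also parse the output of `restic snapshots --json` from Python. A small sketch (the "tags" field in restic’s JSON output is a list that may be absent or null):

```python
def count_by_tag(snapshots):
    # Count snapshots per tag from parsed `restic snapshots --json`
    # output; untagged snapshots are grouped under "(untagged)".
    counts = {}
    for snap in snapshots:
        for tag in (snap.get("tags") or ["(untagged)"]):
            counts[tag] = counts.get(tag, 0) + 1
    return counts

# e.g. (repo is your repository path):
# result = subprocess.run(["restic", "-r", repo, "snapshots", "--json"],
#                         capture_output=True, text=True)
# print(count_by_tag(json.loads(result.stdout)))
```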

But upgrading restic first of all is the main priority, and checking if your initial problem persists. It shouldn’t have anything to do with the tags.

Thanks. In fact I’ve already now upgraded to 0.18.1: everything running smoothly.


Do you mean that you can no longer reproduce the initial problem and error message after you upgraded to restic 0.18.1?

Oh no, I can’t conclude that yet. I just mean that the mechanism is working OK: for example, drive E:, where the problem doesn’t occur, is functioning fine, just as before with 0.14.0.

It’ll take a few hours before I can see whether the problem persists with 0.18.1 on the F: drive. But it’s probably best not to spend your precious time on this until I can conclude one way or the other.

2025-12-31 9:00 UTC…

Judging by the logs, switching to 0.18.1 seems to have stopped that error occurring.

Thanks for your help with this. Now going to learn all about “policies”.
