Sudden drop in backup size?

Hi,
I’m using restic to back up family computers using the B2 target. I have some monitoring of the bucket size for each computer, and noticed that on one of the kids’ backups the size dropped from ~60GB to ~20GB with the most recent backup and prune. I’m not expecting it to be anything bad, but I’d like to explain it… any thoughts? Some detail is below.

The setup runs a backup script weekly. The script does, for each directory to be backed up:

  • backup
  • forget --keep-daily 7 --keep-weekly 8 --keep-monthly 12 --keep-yearly 10 --keep-last 7 --prune
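For reference, the two steps above can be sketched as a small wrapper. This is only an illustration, not the poster's actual script: the function names, the repository string, and the path are hypothetical, and it assumes restic picks up its B2 credentials and repository password from the environment. The restic subcommands and the retention flags are the ones quoted above.

```python
import subprocess

# Retention policy exactly as quoted in the post.
KEEP_POLICY = [
    ("--keep-daily", 7),
    ("--keep-weekly", 8),
    ("--keep-monthly", 12),
    ("--keep-yearly", 10),
    ("--keep-last", 7),
]


def build_commands(repo, path):
    """Return the argv lists for the backup step and the forget/prune step."""
    backup = ["restic", "-r", repo, "backup", path]
    forget = ["restic", "-r", repo, "forget"]
    for flag, count in KEEP_POLICY:
        forget += [flag, str(count)]
    forget.append("--prune")
    return [backup, forget]


def run_all(repo, path):
    """Run both steps; check=True raises CalledProcessError on a non-zero exit."""
    for argv in build_commands(repo, path):
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical repository and path, printed rather than executed.
    for argv in build_commands("b2:example-bucket:oink", "/home"):
        print(" ".join(argv))
```

Raising on a non-zero exit status only catches failures that restic itself reports. A process killed by a shutdown or a suspended laptop leaves no exit status to check.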

For the last year, the used space in this bucket has been increasing steadily, from 20GB to the 60GB it was yesterday, until this morning’s backup (presumably the forget --prune) removed most of the increased space.

I have logs that go back at least a year. I can see that the latest prune seems to have done a lot more repacking and deleting than previous ones (they’ve always done some). It did remove a ~year old snapshot - but that can’t have accounted for the ~40GB by itself, since the bucket was half that size a year ago.

One thing that may be relevant - sometimes the repositories gain stale locks (this is a laptop, and it’s possible that it can be rebooted, and certainly go to sleep, during a backup). I do have a crude status page which shows locks and how old they are, and may have decided to remove an old one for this bucket - I don’t have records of that. The backup logs don’t have any errors or warnings to do with stale locks, though, and there have been regular prunes, but not this drastic.

I’d appreciate any theories or things I can check to understand what’s happened!

Thanks,

Chris

You could run check to let restic verify that the repository is intact. Do you have a log of the statistics reported by prune?

Good idea - I’ll try running check.

I do have the prune output. The latest one looked like:

18 snapshots
remove 2 snapshots:
ID        Time                 Host        Tags        Paths
------------------------------------------------------------
1c38f2a0  2022-03-26 09:15:26  oink                    /home
868eabe9  2023-01-08 17:17:35  oink                    /home
------------------------------------------------------------
2 snapshots
[0:00] 100.00%  2 / 2 files deleted
2 snapshots have been removed, running prune
loading indexes...
loading all snapshots...
finding data that is still in use for 18 snapshots
[0:07] 100.00%  18 / 18 snapshots
searching used packs...
collecting packs for deletion and repacking
[0:04] 99.77%  15916 / 15953 packs processed
to repack:        94930 blobs / 4.907 GiB
this removes      71327 blobs / 4.181 GiB
to delete:       658532 blobs / 35.205 GiB
total prune:     729859 blobs / 39.387 GiB
remaining:       542406 blobs / 27.259 GiB
unused size after prune: 1.362 GiB (5.00% of remaining size)
repacking packs
[16:57] 100.00%  1209 / 1209 packs repacked
rebuilding index
[0:29] 100.00%  6465 / 6465 packs processed
deleting obsolete index files
[0:13] 100.00%  98 / 98 files deleted
removing 9625 old packs
[28:13] 100.00%  9625 / 9625 files deleted
done
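As a sanity check on those numbers, here is a rough sketch that parses the summary lines above and confirms they add up. The helper names are made up for illustration, and the GiB to GB conversion assumes the bucket monitoring reports decimal gigabytes, which the thread does not state.

```python
import re

# Summary lines copied from the prune output above.
PRUNE_SUMMARY = """\
to repack:        94930 blobs / 4.907 GiB
this removes      71327 blobs / 4.181 GiB
to delete:       658532 blobs / 35.205 GiB
total prune:     729859 blobs / 39.387 GiB
remaining:       542406 blobs / 27.259 GiB
"""

LINE_RE = re.compile(
    r"^(?P<label>[a-z ]+?):?\s+(?P<blobs>\d+) blobs / (?P<gib>[\d.]+) GiB$"
)


def parse_summary(text):
    """Map each summary label to a (blob count, size in GiB) tuple."""
    stats = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            stats[match.group("label")] = (
                int(match.group("blobs")),
                float(match.group("gib")),
            )
    return stats


def check_summary(stats):
    """Check that 'this removes' + 'to delete' matches 'total prune'."""
    removed_blobs, removed_gib = stats["this removes"]
    deleted_blobs, deleted_gib = stats["to delete"]
    total_blobs, total_gib = stats["total prune"]
    _, remaining_gib = stats["remaining"]
    return {
        "blobs_consistent": removed_blobs + deleted_blobs == total_blobs,
        # Sizes are printed with three decimals, so allow for rounding.
        "size_consistent": abs(removed_gib + deleted_gib - total_gib) <= 0.002,
        "before_gib": total_gib + remaining_gib,
        # Assumption: the monitoring counts decimal GB (10**9 bytes).
        "freed_gb": total_gib * 2**30 / 1e9,
    }


if __name__ == "__main__":
    for key, value in check_summary(parse_summary(PRUNE_SUMMARY)).items():
        print(key, value)
```

The freed amount comes out at a little over 42 GB, in line with the roughly 40GB drop seen on the bucket, so this single prune run accounts for the whole change.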

The one before that didn’t remove any snapshots, but before that:

1 snapshots
[0:00] 100.00%  1 / 1 files deleted
1 snapshots have been removed, running prune
loading indexes...
loading all snapshots...
finding data that is still in use for 18 snapshots
[0:09] 100.00%  18 / 18 snapshots
searching used packs...
collecting packs for deletion and repacking
[0:04] 99.76%  15367 / 15404 packs processed

This is the last line for that backup. Looking back through a few, they’ve been like that for a while - not doing anything past the “packs processed” line. Might that be a symptom of a stale lockfile preventing it from going further?

Thanks!

Prune won’t run at all if there’s a stale lock file. The prune logs look perfectly normal; there’s no indication that there’s something wrong with restic here. Rather, it looks as if something that was stored in the older snapshots is no longer part of the backup.

Edit:
[0:04] 99.76% 15367 / 15404 packs processed looks suspicious. It seems restic was killed before it could finish the prune run. Did it maybe run out of memory? (That said, the repository should be too small for that; restic should need at most a few hundred MB.)

It is possible that it was killed - the backup runs from a timer in the background, and there’s unfortunately nothing to stop the machine from being shut down (it’s a laptop).
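If earlier prune runs really were cut short, unreferenced data would have kept piling up in the bucket until the first run that got all the way through, which would fit the sudden drop. One way to check is to scan the saved logs for runs that started pruning but never reached the final "done" line. The sketch below assumes one log text per forget run and relies only on the "running prune" and "done" lines visible in the output quoted earlier; the function name is hypothetical.

```python
def classify_prune_log(log_text):
    """Classify one forget/prune log as 'no prune', 'complete' or 'interrupted'."""
    lines = [line.strip() for line in log_text.splitlines() if line.strip()]
    # Find the line announcing that prune starts, e.g.
    # "2 snapshots have been removed, running prune".
    start = next(
        (i for i, line in enumerate(lines) if line.endswith("running prune")),
        None,
    )
    if start is None:
        return "no prune"
    # A finished run prints "done" as a line of its own after that point.
    return "complete" if "done" in lines[start + 1:] else "interrupted"


# Shortened excerpts of the two logs quoted in the thread.
COMPLETE_LOG = """\
2 snapshots have been removed, running prune
loading indexes...
collecting packs for deletion and repacking
[0:04] 99.77%  15916 / 15953 packs processed
removing 9625 old packs
[28:13] 100.00%  9625 / 9625 files deleted
done
"""

INTERRUPTED_LOG = """\
1 snapshots have been removed, running prune
loading indexes...
collecting packs for deletion and repacking
[0:04] 99.76%  15367 / 15404 packs processed
"""

if __name__ == "__main__":
    print(classify_prune_log(COMPLETE_LOG))
    print(classify_prune_log(INTERRUPTED_LOG))
```

Running this over a year of logs would show how long the prunes had been stopping early and whether that period lines up with the steady growth of the bucket.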