Could it be possible to remove unchanged snapshots?

We now have --skip-if-unchanged, which prevents empty snapshots from being created, but we still have older empty snapshots from before we started using it.

Could it be possible to remove these somehow? Maybe with a prune option or something?


There’s currently no option to remove duplicate snapshots (besides telling forget which specific snapshots to remove). Feel free to open a GitHub issue with a feature request.

Thanks for the sign-off. Issue is now at Provide ability to remove unchanged snapshots #5157.


I had the same problem. Wrote this: GitHub - oli-h/restic_forget_snapshot_dups: A smarter `forget` for `restic` backup program


@oli-h Looks like it’s 3 years old. Have you been using it the whole time? Does it have any known problems?

Yes, I used it the whole time, i.e. with every restic version from 3 years ago up to the version that introduced the new option --skip-if-unchanged.
Since then I haven’t needed the script any more (note that all ‘duplicate’ backups from before that have already been cleaned up).

Another note: on one of my (server) directories I run a restic backup once per hour. This leads (or rather: led) to a lot of duplicates and was the main motivation for writing this script.

Note also that the script is a “dry run”, i.e. it does not “forget” any snapshot automatically. It just prints the appropriate “restic forget” command. It also prints a chronological list of all snapshots, along with its decision on whether each one is a duplicate of the ‘previous’ one or not.
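
For readers who want to see the general shape of such a dry run, here is a minimal Python sketch. It is not the code from the linked repository. It assumes that two consecutive snapshots of the same host and paths count as duplicates when their root `tree` IDs match, and that the list is already in chronological order. The function names `find_duplicates` and `load_snapshots` are made up for this illustration; the fields `short_id`, `hostname`, `paths` and `tree` are the ones reported by `restic snapshots --json`.

```python
import json
import subprocess


def find_duplicates(snapshots):
    """Return short IDs of snapshots whose tree equals the previous one's tree.

    `snapshots` must be a list of dicts in chronological order (assumption),
    each with the keys "short_id", "hostname", "paths" and "tree".
    The first snapshot of every run of identical trees is kept.
    """
    last_tree = {}  # (hostname, paths) -> tree ID of the latest snapshot seen
    duplicates = []
    for snap in snapshots:
        key = (snap["hostname"], tuple(snap["paths"]))
        if last_tree.get(key) == snap["tree"]:
            duplicates.append(snap["short_id"])
        last_tree[key] = snap["tree"]
    return duplicates


def load_snapshots():
    """Read the snapshot list from restic.

    Requires restic on PATH and the repository and password to be
    configured through the usual environment variables.
    """
    result = subprocess.run(
        ["restic", "snapshots", "--json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

With a configured repository, `print("restic forget", *find_duplicates(load_snapshots()))` would only print the command, leaving it to you to review and run it.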

Question. If this idea of “ignoring duplicates” were combined with a “forget policy” of some kind, for example “keep the last 24 hourly snapshots”, it would mean either

a) you always have 24 hourly snapshots at any one time, but instead of covering the last 24 chronological hours, they extend back over an arbitrarily longer period, and you know that each one is a genuinely new snapshot (i.e. because something has changed)

or

b) you would have a minimum of 1 and a maximum of 24 hourly snapshots extending over the past 24 hours at any one time.

And I believe you could implement this in, for example, a Python script using the “diff” command, although it would be a bit tricky and arguably of little benefit.
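
To make the difference between a) and b) concrete, here is a small Python sketch. It does not call restic at all: each snapshot is a hypothetical `(label, tree_id)` pair, and comparing tree IDs stands in for running “diff” between neighbours. The function names are made up for this illustration.

```python
def drop_consecutive_duplicates(snapshots):
    """Keep only snapshots whose tree differs from the one before.

    `snapshots` is a list of (label, tree_id) pairs, oldest first.
    """
    kept = []
    for label, tree in snapshots:
        if not kept or kept[-1][1] != tree:
            kept.append((label, tree))
    return kept


def policy_a(snapshots, n):
    """a) Drop duplicates first, then keep the newest n changed snapshots (n >= 1)."""
    return drop_consecutive_duplicates(snapshots)[-n:]


def policy_b(snapshots, n):
    """b) Keep the newest n snapshots first, then drop duplicates among them (n >= 1)."""
    return drop_consecutive_duplicates(snapshots[-n:])


# Six hourly snapshots; the data only changed at 02:00 and 05:00.
hourly = [
    ("00:00", "t1"), ("01:00", "t1"), ("02:00", "t2"),
    ("03:00", "t2"), ("04:00", "t2"), ("05:00", "t3"),
]
print(policy_a(hourly, 3))
print(policy_b(hourly, 3))
```

With n = 3, a) keeps three snapshots reaching back to 00:00, while b) keeps only two, both from within the last three hours.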

Does anyone know which approach oli-h’s solution takes? I’m assuming b).