Forget and prune all snapshots more than 90 days old

I currently create restic backups by running the following commands manually (on Linux - specifically Ubuntu 22.04 with the latest restic 0.14.0):

restic cache --cleanup
restic backup -r "${REPO}" --password-file "${PASSWORD_FILE}" --files-from "${INCLUDE_FILE}" --exclude-file "${EXCLUDE_FILE}"

Where:

${REPO}: Full path to backup location. This is always a USB drive mounted under the same filesystem (at /media/username/drive-id).

${PASSWORD_FILE}: Plain text file containing the password for this repository, so I don’t have to type it (this is on a desktop that only allows local logins, and I’m the only user, so I’m not too worried about storing the password in this way).

${INCLUDE_FILE}: Plain text file with one directory per line, containing all the directories I want to back up (I do this selectively as I have some very large directories that I don’t want backed up, such as my Steam games). I use this file with other backup systems such as Borg and tar so that I only have to configure the list in one place.

${EXCLUDE_FILE}: Same as include file, except these are sub-directories which I want to exclude (e.g. I might include /home/user/stuff but exclude /home/user/stuff/big-directory).

This works fine and I’ve been running it for a couple of years, occasionally restoring data without any issues. However, I want to forget and prune (i.e. remove from my backups) any snapshots that are older than 90 days. This is really important for me as I have sensitive data on my desktop and if I receive a request to delete it I need to know that it will be gone from my backups within a given time period.

At the moment I run the following commands to forget (mark for deletion), prune (delete) and check (just in case anything goes wrong):

restic forget -r "${REPO}" --password-file "${PASSWORD_FILE}" --keep-within 90d --prune --verbose
restic check -r "${REPO}" --password-file "${PASSWORD_FILE}"

I usually see some output from the above commands as I back up several times a week, and therefore there should always be something to remove. However, if I run restic snapshots (same repository and password file) there are lots of entries which are more than 90 days old, e.g.:

552a4e83  2019-09-06 10:15:03  morbius 
95e55626  2022-06-27 19:03:37  mondas
7fd6fe15  2022-09-21 18:43:09  mondas 

The morbius snapshots I can understand as they are a different host (my desktop is mondas), however I can’t see why there are entries for my desktop host that are more than 90 days old.

The other strange behaviour is that there is a 90 day gap between where I would expect the snapshots to end and the next set of snapshots. For example, in the above output 7fd6fe15 is within the 90 day window today, so I would expect it to be listed. There’s then a gap of approximately 90 days before 95e55626, and from that date backwards there are snapshots going back as far as 2019.

Is there something else I need to add to ensure that all snapshots older than 90 days are removed? I want to do this across all hosts, because I no longer have access to morbius and therefore can never run restic with that hostname.

You seem to be leaving out some pieces of information, probably in order to not be too verbose. Can you show the full listing/lines of the snapshots you think should not exist?

I’m not sure what you mean by that? I’ve deliberately selected the snapshots closest to where I think the problem is, otherwise the post would go on for pages and pages (I have snapshots back to 2019 - the list is so long it goes beyond my terminal scrollback buffer).

Would the list of directories to be backed up make a difference? This changes over time as I add and remove directories to the include and exclude. I could list them for the snapshots I’ve put in my original post but I would have to anonymise them as some can contain private data such as client or project names.

Very possibly, hence my request for the complete lines for those snapshots.

It would not only make a difference for the forum to answer the question, but most likely also explains why forget doesn’t behave like you expect. forget by default groups snapshots by host and paths and applies the keep/forget policy to each group. So if you change the paths, your --keep-within will be applied to each group. And be aware that the 90 days are always relative to the newest snapshot within the group!

If you want to remove all snapshots older than 90 days (relative to the newest snapshot), use --group-by host (if you only have one host) or --group-by "".
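To make the difference concrete, here is a rough shell sketch that models the grouping rather than calling restic. The snapshot list, the day numbers and the simulate_keep_within helper are all made up for illustration, and the "less than 90 days older than the newest snapshot in the group" rule is a simplified reading of --keep-within 90d, not restic's actual code.

```shell
# Hypothetical snapshot list: host, paths, and the day the snapshot was taken.
# "Today" is day 100; /a was dropped from the include list on day 5.
snapshots='mondas /a,/b 1
mondas /a,/b 4
mondas /b 5
mondas /b 50
mondas /b 100'

# simulate_keep_within GROUP_BY
#   GROUP_BY is "host,paths" (the default grouping) or "" (no grouping).
# Simplified model of --keep-within 90d: inside each group, keep a snapshot
# only if it is less than 90 days older than that group's newest snapshot.
simulate_keep_within() {
  printf '%s\n' "$snapshots" | awk -v group_by="$1" '
    {
      key = (group_by == "") ? "all" : $1 " " $2
      d = $3 + 0
      group[NR] = key; line[NR] = $0; day[NR] = d
      if (!(key in newest) || d > newest[key]) newest[key] = d
    }
    END {
      for (i = 1; i <= NR; i++) {
        verdict = (newest[group[i]] - day[i] < 90) ? "keep" : "forget"
        print verdict, line[i]
      }
    }'
}
```

Running simulate_keep_within "host,paths" keeps both /a,/b snapshots, because the newest snapshot in that group is from day 4 and never gets any newer. Running simulate_keep_within "" measures everything against day 100, so the snapshots from days 1, 4 and 5 are all marked as forget.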

So if I’m understanding this correctly, if I remove a directory from my include files, the snapshots associated with it remain indefinitely and aren’t removed after the retention period has expired?

For example:

Day 1: a + b directories included in backups
Day 5: a removed from backup list (so is no longer included in new backups)
Day 75: Snapshots still exist with a + b as it’s still within the 90 day period
Day 100: Snapshots still exist with a + b, even though it’s more than 90 days since a was removed from the list

If that’s the case then I’ve completely misunderstood how restic manages forget policies (and it’s very confusing to retain backups indefinitely once I’m no longer backing up a directory - the whole point of removing something from the include list is so that it gets removed from the backups…)

That’s not really a great description of it. Restic, and I’d say most other backup software too, will keep your snapshots indefinitely unless you tell it to remove them one way or the other. Same thing here.

As @alexweiss mentioned, when restic grabs the list of snapshots to run your snapshot removal policies on, it takes the full list and sorts the snapshots into groups, one per combination of host and paths. These two fields are the ones you see in the output from restic snapshots (the latter being the one I was missing in your output earlier). Host and paths is the default grouping, but it can be set to any combination of host, paths and tags, see restic help forget:

-g, --group-by group                 group snapshots by host, paths and/or tags, separated by comma (disable grouping with '') (default "host,paths")

So if you have e.g. two different hostnames in the list, and each of those hostnames has two different sets of paths across its snapshots (e.g. hostA(pathA, pathB) and hostB(pathC, pathD)), then restic will put these snapshots into four different groups.
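If you want to see which groups your own repository falls into, the snapshots command accepts the same grouping option. A minimal sketch, assuming ${REPO} and ${PASSWORD_FILE} are set as in the original post (the list_snapshot_groups wrapper name is just something I made up):

```shell
# List snapshots grouped the same way forget groups them by default.
# Assumes REPO and PASSWORD_FILE are already set in the environment.
list_snapshot_groups() {
  restic snapshots -r "${REPO}" --password-file "${PASSWORD_FILE}" \
    --group-by host,paths
}
```

Each block in the output is one group, and the retention policy is evaluated separately inside each of them.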

Once restic has grouped the snapshots, it applies the policy you specified to the forget command to each of these groups individually. So to take a super simple example, if you told restic to keep the last three snapshots, you would end up with three snapshots per group in the example above, a total of twelve snapshots since there are four groups.
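The arithmetic can be checked with a small shell sketch. It does not call restic; the five snapshots per group and the list_snapshots and count_kept helpers are invented purely to model a "keep the last three" policy.

```shell
# Hypothetical repository: four host/paths combinations, five snapshots each.
list_snapshots() {
  for group in "hostA pathA" "hostA pathB" "hostB pathC" "hostB pathD"; do
    for n in 1 2 3 4 5; do
      printf '%s %s\n' "$group" "$n"
    done
  done
}

# count_kept KEY_FIELDS
#   KEY_FIELDS=2 groups by host and paths, KEY_FIELDS=1 groups by host only.
# Models "keep the last three": each group keeps at most three snapshots,
# so only the number of snapshots per group matters for the total.
count_kept() {
  list_snapshots | awk -v fields="$1" '
    { key = (fields == 2) ? $1 " " $2 : $1; seen[key]++ }
    END {
      total = 0
      for (key in seen) total += (seen[key] < 3) ? seen[key] : 3
      print total
    }'
}
```

count_kept 2 prints 12 (four groups of three), while count_kept 1 prints 6, since grouping by host alone leaves only two groups.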

If in the example above you feel that you don’t care about what paths you backed up for each host, and therefore only want to apply the policy for each host (again, regardless of what that host backed up in its snapshots), all you have to do is set the grouping to use only the host field, which is done by adding --group-by host to the forget command.

This is a feature in restic implemented for various reasons, e.g. to support multiple hosts in one and the same repository (without grouping, you could easily end up in a situation where a less frequently backed up host in the same repository is getting all of its snapshots forgotten in favor of a more frequently backed up host, which would obviously be very bad).

If you aren’t happy with grouping on hosts and/or paths and need something more complex you can add tags to your snapshots (see restic help backup and restic help tag) and then use --group-by tags. It really boils down to your scenario etc.
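A rough sketch of the tag-based variant, reusing the variables from the original post. The tag name passed in and the two wrapper function names are placeholders of mine, not anything restic defines.

```shell
# Back up with a tag so that later forget runs can group on it.
# Assumes REPO, PASSWORD_FILE, INCLUDE_FILE and EXCLUDE_FILE are set.
backup_with_tag() {
  restic backup -r "${REPO}" --password-file "${PASSWORD_FILE}" \
    --files-from "${INCLUDE_FILE}" --exclude-file "${EXCLUDE_FILE}" \
    --tag "$1"
}

# Preview a 90-day policy applied per set of tags instead of per host/paths.
forget_by_tags_preview() {
  restic forget -r "${REPO}" --password-file "${PASSWORD_FILE}" \
    --keep-within 90d --group-by tags --dry-run
}
```

For example, backup_with_tag desktop followed by forget_by_tags_preview. As far as I understand the grouping, snapshots that were created before you started tagging end up in a group of their own, so check the --dry-run output before relying on this.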

I hope this clarifies how it works, and that there is a good reason for this feature, even if it may have come as a surprise to you. As mentioned earlier, what you probably want to do is to set the grouping to just host by adding --group-by host. From what I’ve read about your case, this should be in line with what you want.
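Put together with the commands from the original post, that could look roughly like the sketch below. The forget_older_than_90d wrapper is only a convenience name I made up; the flags are the ones already discussed in this thread.

```shell
# forget_older_than_90d GROUP_BY [extra restic flags...]
#   GROUP_BY is passed straight to --group-by: "host", "host,paths",
#   "tags", or "" to disable grouping entirely.
# Assumes REPO and PASSWORD_FILE are already set in the environment.
forget_older_than_90d() {
  group_by="$1"
  shift
  restic forget -r "${REPO}" --password-file "${PASSWORD_FILE}" \
    --keep-within 90d --group-by "${group_by}" "$@"
}
```

Start with forget_older_than_90d host --dry-run to see what would be removed. Since you also want the old morbius snapshots gone, forget_older_than_90d "" --dry-run is probably closer to what you are after; once the listing looks right, replace --dry-run with --prune --verbose.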


Thanks for the explanation. I still don’t really understand how restic is making decisions about which snapshots to retain though so I think I’ll stop using it for now.

What part of it is unclear? It’s pretty simple generally speaking so it would be more fruitful to just resolve any uncertainties you have. Policies are just about telling restic which parts of the timeline of snapshots you want to keep.

By the way, in case you don’t know already there is a --dry-run option you can append to the forget command. This will simulate the forgetting of snapshots so that you can see what it would do, but won’t actually do it. So using that you can see what restic would remove and why.

Have you read "Removing backup snapshots" in the restic 0.16.3 documentation?