I think I need clarification on how the forget command operates. I want to keep the last 7 days of backups, the last 4 weeks, and the last month, so I put the following command in a cron job after the backup:
Restic kept 7 snapshots from a single day and 1 weekly snapshot. Snapshot 60c2e53d has a different path than the other snapshots and is thus treated differently.
You can configure how restic groups snapshots together with --group-by: for example, restic forget --dry-run --group-by host will group snapshots of different directories together and apply the policy to them as one group.
Make plenty of use of --dry-run while developing retention policies.
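To illustrate what the grouping key does, here is a small Python sketch of the key selection (a simplified model, not restic's actual source; the snapshot records and the "/home" path are illustrative):

```python
from collections import defaultdict

# Illustrative snapshot records; the fields mirror restic's host/paths/tags.
snapshots = [
    {"id": "60c2e53d", "host": "ct17102601", "paths": ("/home",), "tags": ()},
    {"id": "058b749f", "host": "ct17102601", "paths": ("/",), "tags": ()},
    {"id": "d2ac5901", "host": "ct17102601", "paths": ("/",), "tags": ()},
]

def group_snapshots(snaps, group_by=("host", "paths", "tags")):
    # Default grouping is host,paths,tags; passing ("host",) models
    # --group-by host, which merges snapshots of different directories.
    groups = defaultdict(list)
    for s in snaps:
        key = tuple(s[f] for f in group_by)
        groups[key].append(s)
    return groups

# Default grouping: the snapshot with a different path gets its own group.
print(len(group_snapshots(snapshots)))             # 2
# --group-by host: everything from the same host lands in one group.
print(len(group_snapshots(snapshots, ("host",))))  # 1
```

The policy counters are then applied inside each group independently, which is why a snapshot with a different path doesn't "steal" a daily/weekly slot from the others.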
In fact I misunderstood the --group-by option in the docs (and hadn't read the description of this option in the help: restic forget --help). Now everything is clear, sorry for the noise.
Ok, maybe we could replace the following paragraph:
The grouping options can be set with --group-by, to only group snapshots by paths and tags use --group-by paths,tags. The policy is then applied to each group of snapshots separately. This is a safety feature.
by
Grouping snapshots can be defined with the "--group-by" option. This option accepts the keywords "host", "paths" and "tags", which can be combined using a comma separator. For example, to group only snapshots with the same paths and tags: "--group-by paths,tags". The policy is then applied to each snapshot group separately. It is a safety feature.
I've reproduced it locally, and it behaves the same for me. Internally, restic uses time.ISOWeek() to get the week number, which in turn implements the ISO 8601 standard for week numbers. It defines that a week starts on a Monday.
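You can check the same week numbering with Python's datetime.date.isocalendar(), which also implements ISO 8601:

```python
from datetime import date

# ISO 8601 weeks start on Monday, so Sunday 2018-07-01 and
# Monday 2018-07-02 fall into different calendar weeks.
print(date(2018, 7, 1).isocalendar()[1])  # 26
print(date(2018, 7, 2).isocalendar()[1])  # 27
```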
The reasons for keeping the snapshots are as follows, starting with counters daily=7, weekly=4, and monthly=1:
2018-07-02 is kept because it's a daily, weekly and monthly snapshot (counters daily=6, weekly=3, monthly=0); this date starts a new week (calendar week 27) because it's a Monday
2018-07-01 is kept because it's a daily and a weekly snapshot (counters daily=5, weekly=2, monthly=0); it's a Sunday, so it's in a different week than 2018-07-02 (calendar week 26)
2018-06-30 to 2018-06-26 are kept because they are daily snapshots (counters daily=0, weekly=2, monthly=0), all within calendar week 26
2018-06-24 is kept because it's a weekly snapshot in a new week (calendar week 25) (counters daily=0, weekly=1, monthly=0)
2018-06-17 is kept because it's a weekly snapshot in a new week (calendar week 24; counters are all zero here)
At this point, restic decides that all other snapshots (including the one on 2018-06-10) should be forgotten.
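The walkthrough above can be sketched in Python (a simplified model of the bucket logic, not restic's actual source; the dates are the ones from this example):

```python
from datetime import date

def simulate_forget(dates, daily, weekly, monthly):
    # Simplified model: walk snapshots newest-first; each rule keeps a
    # snapshot when its counter is non-zero and the bucket value
    # (calendar day / ISO week / calendar month) differs from the value
    # recorded the last time that rule fired.
    counters = {"daily": daily, "weekly": weekly, "monthly": monthly}
    last = {"daily": None, "weekly": None, "monthly": None}
    bucket = {
        "daily": lambda d: (d.year, d.month, d.day),
        "weekly": lambda d: d.isocalendar()[:2],  # (ISO year, ISO week)
        "monthly": lambda d: (d.year, d.month),
    }
    kept = []
    for d in sorted(dates, reverse=True):
        keep = False
        for rule in counters:
            value = bucket[rule](d)
            if counters[rule] > 0 and value != last[rule]:
                counters[rule] -= 1
                last[rule] = value
                keep = True
        if keep:
            kept.append(d)
    return kept

snaps = [date(2018, 7, 2), date(2018, 7, 1), date(2018, 6, 30),
         date(2018, 6, 29), date(2018, 6, 28), date(2018, 6, 27),
         date(2018, 6, 26), date(2018, 6, 24), date(2018, 6, 17),
         date(2018, 6, 10)]
kept = simulate_forget(snaps, daily=7, weekly=4, monthly=1)
print(len(kept))                  # 9 snapshots kept
print(date(2018, 6, 10) in kept)  # False: it is forgotten
```

Running this keeps exactly the nine snapshots listed above and drops 2018-06-10, matching the counters in the walkthrough.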
For the calendar weeks, you can use cal to show them, e.g. for June and July:
$ cal --monday --week 6 2018
June 2018
Mo Tu We Th Fr Sa Su
22 1 2 3
23 4 5 6 7 8 9 10
24 11 12 13 14 15 16 17
25 18 19 20 21 22 23 24
26 25 26 27 28 29 30
$ cal --monday --week 7 2018
July 2018
Mo Tu We Th Fr Sa Su
26 1
27 2 3 4 5 6 7 8
28 9 10 11 12 13 14 15
29 16 17 18 19 20 21 22
30 23 24 25 26 27 28 29
31 30 31
I got this information by fiddling around with the source code and adding more debug output, but there's an issue about adding output which explains to the user why restic keeps/forgets snapshots: https://github.com/restic/restic/issues/1235
More details on the snapshots are always a good thing.
I also have a remark on how to count days, weeks and months. If I understand the logic correctly, if I want to have 4 weeks of backups (in addition to the current week) I must ask restic to keep 5.
However, my usual way of counting backups is to count the retention periods in addition to the current period.
That's what I thought I misunderstood.
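That counting behaviour can be sketched as follows (a simplified model of --keep-weekly, not restic's actual source; the Sunday dates are illustrative):

```python
from datetime import date, timedelta

def keep_weekly(dates, n):
    # Keep the newest snapshot of each of the last n distinct ISO weeks.
    kept, seen = [], set()
    for d in sorted(dates, reverse=True):
        week = d.isocalendar()[:2]
        if week not in seen and len(seen) < n:
            seen.add(week)
            kept.append(d)
    return kept

# Six Sunday snapshots, one per week, newest first (illustrative dates).
sundays = [date(2018, 7, 8) - timedelta(weeks=i) for i in range(6)]

# keep-weekly 4 keeps 4 weeks total, and the current week is one of
# them, so only 3 fully past weeks survive; keeping 4 past weeks plus
# the current one needs keep-weekly 5.
print(len(keep_weekly(sundays, 4)))  # 4
print(len(keep_weekly(sundays, 5)))  # 5
```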
Has the "--explain" feature been released? I'm having some trouble understanding restic's snapshot "forget" behavior, and the additional annotation from --explain would be a big help.
The release you identified doesn't understand --explain because displaying the reason became the default behavior. (I dug that out of the GitHub history.) So, with this beta, it unconditionally displays the reason:
Hi,
I have a comment about removing snapshots. In the example below I understand perfectly why the snapshot c74a4169 was removed, but since it was the oldest snapshot and the last monthly snapshot has not been made yet, I think it would have been better to keep it, don't you think?
Applying Policy: keep the last 7 daily, 5 weekly, 3 monthly snapshots
snapshots for (host [ct17102601], paths [/]):
keep 10 snapshots:
ID Date Host Tags Directory
----------------------------------------------------------------------
058b749f 2018-09-23 19:15:02 ct17102601 /
d2ac5901 2018-09-30 19:00:01 ct17102601 /
9911ee71 2018-10-07 19:29:01 ct17102601 /
89cfce01 2018-10-09 19:00:02 ct17102601 /
0c230b68 2018-10-10 19:02:01 ct17102601 /
06136e65 2018-10-11 19:09:01 ct17102601 /
0d5b3064 2018-10-12 19:02:01 ct17102601 /
3c4f425f 2018-10-13 19:21:02 ct17102601 /
a05fad00 2018-10-14 19:08:02 ct17102601 /
525c23be 2018-10-15 19:25:01 ct17102601 /
----------------------------------------------------------------------
10 snapshots
remove 2 snapshots:
ID Date Host Tags Directory
----------------------------------------------------------------------
c74a4169 2018-09-16 19:06:02 ct17102601 /
98aff015 2018-10-08 19:22:01 ct17102601 /
----------------------------------------------------------------------
2 snapshots
2 snapshots have been removed, running prune
counting files in repo
building new index for repo
[11:25] 100.00% 10895 / 10895 packs
repository contains 10895 packs (776932 blobs) with 52.474 GiB
processed 776932 blobs: 0 duplicate blobs, 0B duplicate
load all snapshots
find data that is still in use for 10 snapshots
[4:11] 100.00% 10 / 10 snapshots
found 769421 of 776932 data blobs still in use, removing 7511 blobs
will remove 0 invalid files
will delete 113 packs and rewrite 876 packs, this frees 1.650 GiB
[9:48] 100.00% 876 / 876 packs rewritten
counting files in repo
[1:39] 100.00% 10626 / 10626 packs
finding old index files
saved new indexes as [a11fdf20 0d46097f 4a405a59 590b0f8f]
remove 9 old index files
[0:27] 100.00% 989 / 989 packs deleted
done