I had a really hard time wrapping my head around the meaning of the --keep-within-* options, possibly because they basically do the same as --keep-*, but possibly also because the presentation didn’t quite fit with my brain.
Also the information that it works like a marking algorithm, marking all keepers and deleting the rest seemed to be missing, or a bit too implicit for me.
I tried formulating this in an overview section where the lines are drawn, before digging into the explicit syntax cases. I also explain the similarity and difference between the two formats, which I think will be helpful to others.
See commit changes
I don’t know how much effort is required just for a document update, but certainly the first thing to check is if I actually understood it correctly.
Please open a pull request that makes it easier to discuss proposed changes. I like the introductory paragraph, but the time frame based explanations were somewhat confusing for me. It might be more useful to expand the examples to properly show the difference between --keep-* and --keep-within-*.
While I agree that the forget options can be tricky to understand as the differences are subtle, I find the proposed changes hard to understand. What does it mean to do so “for n repetitions”, for instance?
I think the main idea to keep snapshots for a number of time intervals is quite easy to understand. The main difference between --keep-* and --keep-within-* is whether snapshots from older time intervals are kept instead if some of the recent time intervals have no snapshot.
If there is a snapshot every day except yesterday,
--keep-daily 7 will keep one snapshot from last week instead (to get 7 snapshots in total),
--keep-within-daily 7d will just end up with 6 snapshots because there is none for yesterday.
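To make the two outcomes above concrete, here is a minimal Python sketch of the counting. It is not restic's implementation: the function names, the dates, and the one-snapshot-per-day simplification are all made up for illustration, and the assumption that the window is measured back from the latest snapshot with a strict cutoff is my reading of the docs.

```python
from datetime import date, timedelta

def keep_daily(days, n):
    """Count-based: keep the newest n days that actually have a snapshot."""
    return sorted(set(days), reverse=True)[:n]

def keep_within_daily(days, window_days):
    """Window-based: keep one snapshot per day, but only for days that fall
    inside the window ending at the latest snapshot (assumed strict cutoff)."""
    cutoff = max(days) - timedelta(days=window_days)
    return sorted((d for d in set(days) if d > cutoff), reverse=True)

# Hypothetical history: one snapshot per day for two weeks, except yesterday.
today = date(2024, 1, 20)
yesterday = today - timedelta(days=1)
snapshots = [today - timedelta(days=i) for i in range(14)]
snapshots = [d for d in snapshots if d != yesterday]

kept_count = keep_daily(snapshots, 7)          # reaches one day further back
kept_window = keep_within_daily(snapshots, 7)  # does not make up for the gap

print(len(kept_count), len(kept_window))  # prints: 7 6
```

The count-based variant fills the gap with an older day (2024-01-13 in this made-up history), while the window-based variant simply ends up one snapshot short.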
I explained the difference with an easy example here:
Further comments on the proposed changes:
The cross references (like “notes further down”) are well-intended, but in my opinion, they make it harder to read because the reader needs to decide if he wants to jump there or continue reading the current paragraph. The text should have the best possible information flow.
There would be some minor corrections (“pr” instead of “per” etc.).
I’ve found it simplest to read “--keep-within-hourly 1d” as “keep all hourly snapshots within 1 day”.
IMO it would have been better named “keep-hourly-within”; then it reads much more clearly.
As for saying it is basically doing the same as keep-*, well that is true if you’re backing up very regularly – then both options look the same.
“keep-within” really comes into its own if you’re backing up irregularly or missing scheduled backups (typical for laptops). It reflects the idea that it’s not the number of past revisions, it’s the coverage over a given period of time that matters. (The docs also mention another use for this in terms of security and compromised clients).
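A rough sketch of that laptop case, again in Python and again only a model of the idea rather than restic's code. The timestamps and helper names are invented, and I am assuming that "hourly" means the newest snapshot in each clock hour and that the window is measured back from the latest snapshot.

```python
from datetime import datetime, timedelta

def newest_per_hour(snapshots):
    """Reduce the list to the newest snapshot in each clock hour."""
    newest = {}
    for t in sorted(snapshots):
        # Later snapshots overwrite earlier ones from the same hour.
        newest[t.replace(minute=0, second=0, microsecond=0)] = t
    return sorted(newest.values(), reverse=True)

def keep_hourly(snapshots, n):
    """Count-based: the newest n hours that have a snapshot, however old."""
    return newest_per_hour(snapshots)[:n]

def keep_within_hourly(snapshots, window):
    """Window-based: hourly snapshots newer than (latest - window)."""
    cutoff = max(snapshots) - window
    return [t for t in newest_per_hour(snapshots) if t > cutoff]

# Hypothetical laptop that is backed up only now and then.
latest = datetime(2024, 1, 20, 18, 30)
laptop_snapshots = [
    latest,
    latest - timedelta(hours=3),
    latest - timedelta(hours=3, minutes=10),  # same clock hour as the previous one
    latest - timedelta(days=2),
    latest - timedelta(days=5),
    latest - timedelta(days=9),
]

by_count = keep_hourly(laptop_snapshots, 4)
by_window = keep_within_hourly(laptop_snapshots, timedelta(days=1))

print(len(by_count), by_count[-1])  # 4 snapshots, the oldest is 5 days old
print(len(by_window))               # 2 snapshots, all from the last day
```

With regular hourly backups the two lists would look almost the same; with gaps like these, the count-based rule keeps reaching further into the past, while the window-based rule only covers the period you asked for.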
@noeck Did you open a PR yet? What’s the URL to it? That’s where discussions around your suggestions should take place, as it’s much easier to track and pinpoint parts of it as they’re being discussed there.
No. I have no intention of changing the documentation. There is a pull request by @arberg. I just commented that a) I find the current documentation understandable (@sc2maha sums it up nicely) and b) I once came up with an example in another thread that could be used if someone finds it useful.