I have folders where I don’t expect files to get updated/changed (PDFs etc.), but just new files added.
Is there any built-in method to detect (data) changes in those (local) files and, instead of overwriting their existing backup versions, alert the user?
I guess I could simply parse all snapshots retroactively before doing the weekly prune, but I wonder whether there is already something better in place.
Extended attributes with hashes are a poor man's solution, but they work perfectly well with a few extra manual steps. If you want something totally transparent, keep your data on ZFS or Btrfs storage with data redundancy and run scrubs periodically. Bit rot will then be not only detected but also fixed on the fly.
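To illustrate the extended-attribute approach, here is a minimal Python sketch. It assumes Linux and a filesystem with user xattrs enabled; the attribute name `user.sha256` and the helper names are made up for this example and are not any standard. The "manual steps" are exactly the calls shown: you store the hash once and re-verify it yourself later.

```python
import hashlib
import os

# Hypothetical attribute name chosen for this example; not a standard.
XATTR_NAME = "user.sha256"


def file_sha256(path, bufsize=1 << 20):
    """Return the SHA-256 hex digest of a file, read in blocks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(bufsize), b""):
            h.update(block)
    return h.hexdigest()


def store_hash(path):
    """Record the current content hash in an extended attribute (Linux only)."""
    os.setxattr(path, XATTR_NAME, file_sha256(path).encode("ascii"))


def verify_hash(path):
    """Return True if content matches the stored hash, False if it differs,
    or None if no stored hash could be read."""
    try:
        stored = os.getxattr(path, XATTR_NAME).decode("ascii")
    except OSError:
        return None
    return stored == file_sha256(path)
```

A result of `False` only tells you the content no longer matches the stored hash; it cannot tell bit rot apart from an intentional edit, so the hash has to be stored again after any legitimate change.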
How to ensure that files modified by bitrot do not replace “good” files in the backup?
About the second question:
Note that you would typically back up using a parent snapshot. With a parent snapshot present, a change detection algorithm ensures that files which “have not changed” are not re-read; their content is taken from the parent snapshot instead. This means that if a file was changed by bitrot (i.e. without a modification time or other metadata change) and you back up using a parent snapshot that holds the file in its pre-bitrot state, the newly generated snapshot does not see the bitrot and includes the file in its pre-bitrot state. Only if you back up without a parent (e.g. using --force) is all content re-read and the current state included in the snapshot.
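To make this concrete, here is a toy model of metadata-based change detection in Python. It is not restic's actual code: the `FileMeta` type and the `needs_reread` helper are invented for illustration, and the compared fields (size, mtime, ctime, inode) are an assumption based on my reading of the restic documentation for regular files.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical record of the metadata compared against the parent."""
    size: int
    mtime_ns: int
    ctime_ns: int
    inode: int


def needs_reread(parent: Optional[FileMeta], current: FileMeta,
                 force: bool = False) -> bool:
    """Decide whether file content must be read again.

    With force=True, or with no parent entry, content is always read.
    Otherwise content is read only if some metadata field differs.
    """
    if force or parent is None:
        return True
    return parent != current


# Bit rot flips bytes on disk but leaves the metadata untouched, so the
# parent's (pre-bitrot) content would be reused:
before = FileMeta(size=1024, mtime_ns=1, ctime_ns=1, inode=42)
after_bitrot = FileMeta(size=1024, mtime_ns=1, ctime_ns=1, inode=42)
print(needs_reread(before, after_bitrot))              # False
print(needs_reread(before, after_bitrot, force=True))  # True
```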
How to detect bitrot using the information in snapshots:
Note that restic already stores a SHA-256 hash for every chunk of a file and is therefore, in theory, perfectly able to detect bitrot.
What you can do is run restic backup --force and then restic diff to compare the new snapshot with the last one. But note that this diff also shows files which do have “normal” changes, i.e. changes in content and metadata.
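If you want to script this, one possible way to pick the two snapshots to compare is to parse the output of `restic snapshots --json`. The following Python sketch assumes that output is a JSON array whose entries carry `time` and `id` fields; the helper names `parse_time` and `diff_command` are made up here, and the function only builds the command line rather than running it.

```python
import json
import re
from datetime import datetime


def parse_time(ts):
    """Parse an RFC 3339 timestamp, trimming or padding the fraction to
    microseconds so that datetime.fromisoformat accepts it."""
    ts = ts.replace("Z", "+00:00")
    ts = re.sub(r"\.(\d+)", lambda m: "." + m.group(1)[:6].ljust(6, "0"), ts)
    return datetime.fromisoformat(ts)


def diff_command(snapshots_json):
    """Build a `restic diff` command for the two most recent snapshots."""
    snaps = sorted(json.loads(snapshots_json),
                   key=lambda s: parse_time(s["time"]))
    if len(snaps) < 2:
        raise ValueError("need at least two snapshots to diff")
    older, newer = snaps[-2], snaps[-1]
    return ["restic", "diff", older["id"], newer["id"]]
```

In a real script you would probably also filter the list by host and backup path first, so that the "previous" snapshot really covers the same folders as the new one.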
Note that restic currently does not support diffing a snapshot against a local path. If you want to do that, use rustic.
Actually, restic diff shows bit rot explicitly (marked as ‘?’), apparently by detecting a content change that has no corresponding metadata change. That’s quite neat, and I’m now using it to check the snapshots before running the weekly prune (I also run backup --force and check --read-data regularly). It would be very easy to add a ‘lock down’ functionality at exactly the same place in my script by simply checking the other reported changes against some configured paths.