This is a follow-up to my “corrupted files after restic restore” issue.
So, I woke up at 3 AM with the idea of tracking down this problem by looking for differences in those files across the ZFS snapshots I store here (I have all of them saved). What I found, perhaps predictably, is that the ZFS snapshot taken at the moment restic backup saved its first snapshot contained these two files with EXACTLY THE SAME CONTENT as restic restore restored them (from a much later restic snapshot).
So, what was obviously happening was that these files’ contents had changed while their mtimes had not (causing restic backup to miss the changes). I checked them in the snapshots with stat(1) and saw that, although mtime hadn’t changed, ctime had!
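For anyone who wants to run the same kind of check on their own snapshots, here is a minimal Python sketch. It is not the exact procedure I used above (I used stat(1) by hand); the function names are made up for illustration, and it assumes both copies of the file are reachable as ordinary paths (for example through two mounted or browsable snapshots).

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_copies(old_path, new_path):
    """Compare metadata and content of two copies of the same file.

    The suspicious combination is content_changed=True together with
    mtime_changed=False: a backup tool that trusts mtime alone would
    skip such a file.
    """
    old, new = os.stat(old_path), os.stat(new_path)
    return {
        "mtime_changed": old.st_mtime_ns != new.st_mtime_ns,
        "ctime_changed": old.st_ctime_ns != new.st_ctime_ns,
        "size_changed": old.st_size != new.st_size,
        "content_changed": sha256_of(old_path) != sha256_of(new_path),
    }
```

Call compare_copies() with the path of the file in an older snapshot and its path in a newer one. Note that st_ctime means "inode change time" on POSIX systems only; on Windows it has historically reported the creation time instead.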
I managed to reproduce the problem by opening a .xls file (like the ones involved in my problem) with MS Excel, moving around the file a bit without changing anything, and then exiting MS Excel without saving. Apparently Excel saves something inside the file (which doesn’t change its “visible” content) and rolls back mtime to “hide” it… Anyway, I bet this is a very common problem!
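The same effect can be imitated without Excel. The sketch below is only my guess at the general mechanism (rewrite the file, then put the old mtime back), not a description of what Excel actually does internally; modify_preserving_mtime is a hypothetical helper name. It assumes a POSIX system, where the kernel updates ctime on every write and on every timestamp change, and where ctime cannot be set from user space.

```python
import os


def modify_preserving_mtime(path, new_bytes):
    """Overwrite a file's content, then restore its previous atime/mtime.

    Returns the os.stat() results from before and after the change so
    the caller can compare mtime (unchanged) with ctime (updated).
    """
    before = os.stat(path)
    with open(path, "wb") as f:
        f.write(new_bytes)
    # Roll the timestamps back; this call itself bumps ctime on POSIX.
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
    after = os.stat(path)
    return before, after
```

After running this on a test file, after.st_mtime_ns equals before.st_mtime_ns even though the content is different, while after.st_ctime_ns is newer. That is exactly the situation a mtime-only comparison cannot see.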
I then googled for “restic ctime mtime” and found this and then this; to sum it up: it’s already a known problem with a fix in master… which was committed after v0.9.5 was released, so it is NOT present in any released restic version yet!
What I did was manually patch the respective commit into my own restic source tree (which already carries other modifications and workarounds, which is why I didn’t simply download master) and compile it to produce a new binary. I expect that, on its next backup run, this patched restic will detect the changed files and happily upload their contents, fixing the issue (apart from the previous snapshots, where they will still show up with the wrong content, of course).
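To show why also looking at ctime catches these files, here is a small Python sketch of a change-detection rule. This is NOT restic's code (restic is written in Go, and I am not reproducing the actual patch here); FileMeta and looks_unchanged are hypothetical names used only for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMeta:
    """Metadata recorded for a file at backup time (illustrative only)."""
    size: int
    mtime_ns: int
    ctime_ns: int

    @classmethod
    def from_path(cls, path):
        st = os.stat(path)
        return cls(st.st_size, st.st_mtime_ns, st.st_ctime_ns)


def looks_unchanged(prev, cur, check_ctime=True):
    """Decide whether a file can be skipped without re-reading it.

    With check_ctime=False this is the mtime/size-only rule that misses
    files whose mtime was rolled back after a modification.
    """
    if prev.size != cur.size or prev.mtime_ns != cur.mtime_ns:
        return False
    if check_ctime and prev.ctime_ns != cur.ctime_ns:
        return False
    return True
```

For a file whose size and mtime are identical but whose ctime moved, looks_unchanged(prev, cur, check_ctime=False) returns True (the file is skipped), whereas looks_unchanged(prev, cur) returns False (the file is read again). The trade-off is that anything touching only metadata, such as chmod or chown, also bumps ctime and forces a re-read.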
So, if you or your users run MS Excel (and possibly other programs playing stupid tricks with mtime), be very afraid… and apply the above fix ASAP, and STRONGLY suspect that all your backups made with restic <= 0.9.5 are ‘corrupted’ in the sense of not accurately containing changed files whose mtime was played with.
Hope this helps someone.
PS: I think the seriousness of this goes beyond just driving someone crazy who checks everything (i.e., SHA checksums of restored files) like me, because changes that should have been backed up are being missed. If anyone needs to recover one of these files from backup (e.g., after operator error, disaster recovery, etc.), they will, without any warning, just get an older, outdated file :-/ And in such a scenario, i.e., after the original file is lost, it is lost forever: there is obviously no way to recover it from a restic backup.