Possible backup bug? Or bad backend?

I do believe this was all due to #4523. I always use max compression, and this was a fairly substantial repo, so I’d wager my chances of hitting the bug were higher than the average user’s. Could be wrong, but I’m very happy this bug was found, because this issue was making me paranoid lol

Only thing is, I never did find any errors in my repo… but I’ll do more testing with v0.16.1 and hope for the best.
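For context, by “max compression” I mean restic’s compression option, used roughly like this (the repository path and data directory are just placeholders):

```sh
# Back up with restic's strongest compression level
# (the --compression flag accepts auto, off, or max)
restic -r /path/to/repo backup --compression max /data
```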

Keeping my fingers crossed!

The data corruption issue in #4523 is 100% reproducible. That is, as far as I remember, the problem would not randomly disappear the way it apparently did here.

check --read-data reports any damage caused by #4523. If check does not complain about blobs with an unexpected hash, then you’re not affected by this specific issue.
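For anyone following along, the full verification run looks roughly like this (repository path is a placeholder):

```sh
# Read and verify every pack file in the repository; this is the check
# that would flag blobs with an unexpected hash
restic -r /path/to/repo check --read-data
```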

I took a look at the pictures and their hex dumps. While I did not parse every line, the pictures seemed to be identical up to a certain byte; after that, something very peculiar is going on. The first two groups of 8 bits change from 0 to 8, and then there are four matching bits. Far more interesting, though, are the following 8 bytes, which in the original are simply 0. In the corrupted form they are the following:
“228a 228a a288 8a28”, which looks to me like either an error code that was not caught by restic, or a bit of randomness (e.g. an out-of-bounds memory read, or corruption by external factors such as bit flips induced by whatever).

So let us take a look at 0x8a28, which occurs 10 times in the correct file and 18 times in the corrupted one. Now here is the interesting part: at byte 0x2ac in the correct file, this 0x8a28 is preceded by 0xa288, while every other occurrence is not. In the corrupted file, however, these two bytes occur quite often, almost always preceded by at least 0xa288, and many times accompanied by the other 4 bytes mentioned above.

So, as my time is running out for now, I conclude that this is probably not true randomness, and I can imagine that the corruption happens somewhere while reading. To me, it looks like either an out-of-bounds read or an error code not being caught.

DISCLAIMER: My analysis is based on grouping the bytes into pairs of two and not, as I would prefer, on a bit level, which might catch more symmetries, for example a bit drop somewhere. But as there are approx. 4 KB missing, I don’t know. Also, the ending does not seem to match on a bit level, which would rule out a bit drop/insertion. I just wanted to note that my analysis may well contain errors.
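If anyone wants to repeat this kind of comparison, here is a rough sketch with standard tools; the file names and the 0x2ac offset are placeholders based on the description above:

```sh
# Print differing bytes (offset in decimal, values in octal);
# the first line is the first mismatch between the two pictures
cmp -l original.jpg corrupted.jpg | head -n 1

# Hex dump around offset 0x2ac in both files for a side-by-side look
xxd -s 0x2a0 -l 32 original.jpg
xxd -s 0x2a0 -l 32 corrupted.jpg

# Count the byte pair 0x8a 0x28 (two bytes per line, so this only
# counts pairs starting at even offsets)
xxd -p -c 2 original.jpg  | grep -c '8a28'
xxd -p -c 2 corrupted.jpg | grep -c '8a28'
```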

:rofl: Same here! But I’m using Restic and Duplicati just in case

Shoot. I was hoping that was it. I haven’t been able to reproduce this issue. The drive I was backing up to appears healthy. No reallocated sectors, no I/O errors, no UDMA errors.

It’s not my primary backup, so I’ve fully overwritten the backup several times (~15.5TB), using rsync with --checksum on a second pass for verification. Can’t get it to do anything out of the ordinary. So I don’t know what it was… :man_shrugging:t2:
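For the record, the verification pass was something along these lines (paths are placeholders):

```sh
# Re-read both sides and report any files whose content checksums differ,
# without actually transferring anything
rsync -a --checksum --dry-run --itemize-changes /source/ /backup/
```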

Ha! I bet this was the compression bug fixed in 0.16.4. I always use max compression for non-local backends. :+1:

EDIT: Oh wait, I just realized that’s probably the same thing as #4523 lol. Forgot all about that. Oh well. Knock on wood, haven’t had any other issues since upgrading. :man_shrugging:t2:

Only 0.16.0 and 0.16.3 are affected by compression bugs. The bug in 0.16.3 does not exist in 0.16.2.

Ah, I was on 0.16.0, but you said my issue probably wasn’t due to that compression bug. I haven’t had the issue since, either, knock on wood. Oh well, we may never know!