Minor functional issues

chastings · November 28, 2020, 9:23am

There isn’t really a bugs section of the forum, and these certainly aren’t features or ideas. The first is probably a bug, and the second… well, I’m not sure what to make of it.

Thoughts?

1. recursive repo backup

An error in a command I typed resulted in something like this:

backup --repo ./repo ./repo

Yes, it’s a bit crazy, but it worked… sort of.

The percentage progress indicator disappeared once it passed 100%. I was expecting an infinite loop, but it finally finished adding 1399 files and 6.7GiB, while the initial repo was 809 files and ~4GiB.

I guess if you want to support this sort of recursive backup, it doesn’t seem like it’s doing the correct thing. Maybe it would be better to detect and prevent this case, though?

2. copy with different chunker params

Not sure if this is a docs bug or a feature request, but I’m not sure most users will pick up on the fact that copying between repos with different chunker params can result in the data being transferred AND stored twice. I didn’t, and only a message pretty deep in a thread made me realize.

The online help states that this will break deduplication, but the consequences of that might not be clear.

The feature request part: add an error (and an override flag) to make sure people don’t do this unintentionally.
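To make the request concrete, here is a rough sketch of the kind of guard I have in mind, written in Python purely for illustration. Everything in it is an assumption on my part: the function name, the `allow_mismatch` override, and the idea that each repository's chunker parameters can be compared as a single value. It is not restic's actual API or code.

```python
class ChunkerMismatchError(Exception):
    """Raised when a copy would run between repos with different chunker params."""


def check_copy_allowed(src_chunker_params, dst_chunker_params, allow_mismatch=False):
    """Hypothetical pre-flight check for a copy between two repositories.

    src_chunker_params / dst_chunker_params: any comparable values standing in
    for each repository's chunker parameters (an assumption for this sketch).
    allow_mismatch: hypothetical override flag for users who accept the cost.
    """
    if src_chunker_params == dst_chunker_params:
        return  # same parameters: deduplication keeps working across the copy
    if allow_mismatch:
        return  # user explicitly accepted double transfer and storage
    raise ChunkerMismatchError(
        "source and destination use different chunker parameters; "
        "copied data may be transferred and stored twice "
        "(pass the override to copy anyway)"
    )


# Placeholder values, not real chunker parameters.
check_copy_allowed("params-A", "params-A")                       # passes silently
check_copy_allowed("params-A", "params-B", allow_mismatch=True)  # passes silently
```

Without the override, the mismatched call raises `ChunkerMismatchError`, so the user has to opt in before paying the storage penalty.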

rawtaz · November 28, 2020, 3:40pm

There has never been any intention to support someone backing up the very repository they are backing up into. That use case doesn’t make any sense.

This is one of probably a thousand small things that users could accidentally do wrong, and personally I don’t think we should add code to deal with things like this.

It happens almost never from what I’ve heard over the years, so it’s hardly a big problem. If someone makes that mistake, then they’ll realize it and it won’t be the end of the world. It’s probably extremely rare.

It might be true that the help section for the copy command isn’t clear enough to an average user. I’d think that “may break deduplication” is pretty clearly telling you that your data may not be deduplicated and that this means your data may be duplicated - how is that not obvious?

However, it may be true that we should be more specific or elaborate on the actual practical effects of what the current text is saying, explaining (in short terms) that the same backed up data may be stored twice. That’s indeed something to improve.

I don’t think we should add an error and option to override this, when we already have (or will have) it explained in the help for both the command and in the manual.

rawtaz · November 28, 2020, 6:56pm

@chastings I have opened PR https://github.com/restic/restic/pull/3136 to improve the docs, can you tell me what you think?

The changes in the manual are that the two main things (double transfer and possibly double storage) are separated into their own “important” box (instead of previously one common “note” box), and that I reworded the text in an attempt to make it much clearer that these things may happen.

chastings · November 28, 2020, 7:21pm

Yes, that description is much clearer! And emphasizes that you may incur a storage penalty.

Thanks.

chastings · November 28, 2020, 7:33pm

I agree that it shouldn’t be supported and that it’s a pretty nonsensical use case.

But I think it’s a best practice for software to throw an error when it ventures into unsupported behavior, even if it sort of works and doesn’t really harm anything. Even if the condition is rare and the result of a misconfiguration, user error, or random chance.

Robust software should detect these conditions and alert the user to the dragons – small or large – that lie ahead.

rawtaz · November 28, 2020, 11:48pm

It’s not unsupported really. If you want to back up your repository, then by all means do that.

The reason you saw more files and data backed up than you had initially is that restic backs up the files by recursively traversing the folder you asked it to back up. Since it continuously adds new files (blobs, packs, …) to the repository during the backup, it backs some of those up too along the way, hence you get higher numbers.

PS: I do see what you mean, and restic could indeed try to figure out if any of the paths that it’s asked to back up are part of the repository it’s storing the backups in. I think your post is the first time I have ever seen any mention of this situation happening though. And unless I’m mistaken you stumbled across this when doing some testing rather than regular backups?
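For what it’s worth, the check itself would be small. Below is a minimal sketch in Python, not restic code; the function name and the decision to resolve both paths before comparing them are my own assumptions, and it only makes sense for a repository on the local filesystem.

```python
from pathlib import Path


def targets_inside_repo(repo, targets):
    """Return the backup targets that are the repository itself or lie inside it.

    Hypothetical helper for illustration. Paths are resolved first so that
    "./repo" and "repo/../repo" compare as the same directory.
    """
    repo_path = Path(repo).resolve()
    flagged = []
    for target in targets:
        target_path = Path(target).resolve()
        # Path.parents compares whole path components, so "./repository"
        # is not mistaken for something inside "./repo".
        if target_path == repo_path or repo_path in target_path.parents:
            flagged.append(target)
    return flagged


# The command from the first post: repository and target are the same path.
flagged = targets_inside_repo("./repo", ["./repo"])
```

A target that contains the repository (for example, backing up a parent folder) would need the reverse comparison as well; this sketch deliberately leaves that case out.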

rawtaz · November 29, 2020, 1:41pm

The documentation PR I mentioned earlier has now been merged. Thanks for highlighting that part!