Unable to rewrite snapshot ID xxxx: cannot encode "/" without loosing information

MichaelEischer · August 26, 2024, 1:09pm

I’ve documented it in restic/restic pull request #5017 (rewrite: Document handling of "cannot encode tree" errors). That doesn’t mean that this limitation has to stay around forever, but at least right now I don’t see an easy way to lift it.

The specification currently focuses on the most important aspects, although I have to agree that it could be a bit more detailed regarding the snapshot/tree format. However, I still doubt that it’s worth documenting everything down to a level that guarantees perfectly identical results (that would require a level of detail that far exceeds what is necessary to implement full read/write support of the repository format).

Neither reason can justify a failed backup run. Yes, it’s not ideal to not deduplicate some tree blobs in that case, but that’s still much better than not having a backup.

My perspective on the rewrite command is rather different. If a user runs rewrite to remove some file path, then they should be able to trust restic to not randomly lose metadata in the process.

backup should just use option 2 and let the normal deduplication handling take care of duplicates. The other options just introduce special cases.

For rewrite, option 2 would result in data loss, which isn’t acceptable IMO. Option 1 doesn’t work when anything in the snapshot has to be changed, as this always results in a change of the root node. That node was the one that could not be encoded in the examples above. Thus only option 3 remains.

alexweiss · August 26, 2024, 2:38pm

But randomly losing metadata is exactly what rewrite still may do currently, see below…

Just note that you are using option 2 when rewrite does change a tree: if this removes or modifies just a single entry, additional metadata of all remaining entries will be removed without anyone even noticing.
I do understand that you are not performing a check to prevent this - there is simply no possibility to check this if your check is limited to just comparing the hashes of serialized trees…

IMO it just doesn’t fit together: on one side (rewriting completely unchanged trees) you are almost paranoid about losing existing additional metadata, on the other side (rewriting changed trees, writing new trees by backup with parents) you basically don’t care about the existing metadata.

MichaelEischer · August 26, 2024, 4:17pm

That is plain wrong and I’m growing tired of this discussion. The code checks that it can exactly reproduce the original tree before modifying any entry in the tree. Thus when removing a single entry, all other entries are left unchanged.
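
A minimal sketch of such a round-trip check, in Go. This is not restic's actual code: the `node` and `tree` types, the `removeNode` helper and the error value are hypothetical and heavily reduced, and the real tree format has many more fields. It only illustrates the idea of verifying that the original tree can be reproduced exactly before any entry is touched.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"errors"
)

// node and tree are hypothetical, reduced stand-ins for a serialized tree.
// Any JSON field not listed here is silently dropped when decoding.
type node struct {
	Name string `json:"name"`
	Type string `json:"type"`
}

type tree struct {
	Nodes []node `json:"nodes"`
}

// errLossy is an illustrative error, not restic's actual error value.
var errLossy = errors.New("cannot encode tree without losing information")

// removeNode deletes the entry called name from the serialized tree, but
// only after verifying that the unmodified tree survives a decode/encode
// round trip byte for byte (compared via its SHA-256 hash).
func removeNode(original []byte, name string) ([]byte, error) {
	var t tree
	if err := json.Unmarshal(original, &t); err != nil {
		return nil, err
	}

	// Re-encode the untouched tree and compare it with the input.
	reencoded, err := json.Marshal(t)
	if err != nil {
		return nil, err
	}
	if sha256.Sum256(reencoded) != sha256.Sum256(original) {
		// Some field or formatting detail is unknown to this code, so
		// writing a modified tree would drop it. Refuse to continue.
		return nil, errLossy
	}

	// The round trip is exact, so every remaining entry is fully
	// represented and can be written back unchanged.
	kept := make([]node, 0, len(t.Nodes))
	for _, n := range t.Nodes {
		if n.Name != name {
			kept = append(kept, n)
		}
	}
	return json.Marshal(tree{Nodes: kept})
}
```

In this sketch, a tree whose entries carry a field the code does not know about fails the hash comparison and is rejected with an error instead of being rewritten without that field.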

alexweiss · August 26, 2024, 6:45pm

Sorry if I got something wrong here and misunderstood what rewrite is in fact doing. As it is not really documented, I had to guess and tried to draw conclusions.

And I would even more like to apologize for trying to add another perspective to this topic, which seems not to be wanted.

As I agree that the tone of this discussion has turned strange, I’ll also put my energy into more fruitful things.
Cheers