Show per-file usage in snapshot

I tried restic diff and it is very useful; however, it only shows:

  1. Modified paths.
  2. Total data difference.

I was wondering if there is any way to show the data difference for each file. It is one thing to know that a 1GiB file changed and another thing to know if 1B of it is new or if the whole thing is new.

There are definitely complications here. For example, the same new data may be added to two files at the same time, so each file would have new data while the repo overall would store that new data only once. I still think it would be very useful.
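To illustrate the complication, here is a minimal sketch in Python. It does not use any restic API: the dictionaries of file paths, blob IDs, and blob sizes are made-up stand-ins for what a snapshot's metadata might contain, and "new" is simplified to mean "not referenced by the previous snapshot" rather than "not in the repo at all".

```python
from typing import Dict, List

# Hypothetical metadata: path -> ordered list of blob IDs (not real restic output).
Snapshot = Dict[str, List[str]]


def per_file_new_bytes(old: Snapshot, new: Snapshot, sizes: Dict[str, int]) -> Dict[str, int]:
    """For each file in `new`, sum the sizes of blobs the old snapshot never referenced."""
    existing = {blob for content in old.values() for blob in content}
    return {
        path: sum(sizes[blob] for blob in set(content) - existing)
        for path, content in new.items()
    }


def repo_new_bytes(old: Snapshot, new: Snapshot, sizes: Dict[str, int]) -> int:
    """Sum the sizes of all newly referenced blobs, counting each blob only once."""
    existing = {blob for content in old.values() for blob in content}
    added = {blob for content in new.values() for blob in content} - existing
    return sum(sizes[blob] for blob in added)


# Assumed example: the same new blob "n1" is appended to two files.
sizes = {"x1": 1_000_000, "y1": 1_000_000, "n1": 2_000_000}
old_snapshot = {"a.bin": ["x1"], "b.bin": ["y1"]}
new_snapshot = {"a.bin": ["x1", "n1"], "b.bin": ["y1", "n1"]}

per_file = per_file_new_bytes(old_snapshot, new_snapshot, sizes)
total = repo_new_bytes(old_snapshot, new_snapshot, sizes)
print(per_file, total)
```

In this example each file reports 2,000,000 new bytes, but the repo-wide figure is also 2,000,000, so the per-file numbers cannot simply be added up.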

There’s none. We’re trying to limit the scope of what restic should be able to do (to prevent feature creep), so restic is a backup program and not a binary diffing program :wink:

What you can do, though, is use the restic mount command and then use bindiff or some other tool to look at the differences. It won’t be as efficient as restic could be, though.

The difference here is that some other tool won’t know how restic chunked the files, so IIUC the result won’t be an accurate picture of the additional space usage.

I do appreciate the resistance to feature creep but this would be quite useful for understanding which parts of my backup are expensive.

What would be (relatively) easy to implement: reporting how many blobs of the file differ and what size these blobs have. But this could still mean that 1B differs and the report says that 1 blob with 3MB is different.

I would also strictly advise against doing a byte-comparison or anything which needs to actually read the data. restic diff should stay a command which only works on metadata!
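As a rough sketch of what such a metadata-only report could compute (this is not restic code, and the blob IDs and sizes below are invented for illustration), the idea is to compare the two versions' lists of blob IDs and add up the sizes of the blobs that only the new version references. No file data is read.

```python
from typing import Dict, List, Tuple


def changed_blobs(
    old_content: List[str], new_content: List[str], sizes: Dict[str, int]
) -> Tuple[int, int]:
    """Return (count, total_bytes) of blobs in the new version that the old version lacks.

    Only blob IDs and sizes are used; the file contents are never touched.
    """
    old = set(old_content)
    seen = set()  # avoid counting a repeated new blob twice
    count = 0
    total = 0
    for blob_id in new_content:
        if blob_id in old or blob_id in seen:
            continue
        seen.add(blob_id)
        count += 1
        total += sizes[blob_id]
    return count, total


# Hypothetical blob IDs and sizes for two versions of one file.
sizes = {"a1": 1_000_000, "b2": 3_000_000, "c3": 3_000_000, "d4": 512}
old_file = ["a1", "b2"]
new_file = ["a1", "c3", "d4"]

count, total = changed_blobs(old_file, new_file, sizes)
print(f"{count} new blobs, {total} bytes")
```

Here blob "b2" was replaced by "c3", so the report attributes a full 3,000,000 bytes to the file even if only a single byte inside that blob changed.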

But this could still mean that 1B differs and the report says that 1 blob with 3MB is different.

That is exactly what I am looking for. Basically, I want a way to identify files that are changing in a way that is bloating my backup size. Then I can consider other options for these files, as opposed to files that are, for example, simply getting a few bytes appended, which barely adds extra space.