Local hash verification when checking a remote repository


#1

I’m interested in the use case in which there’s a remote repository that’s accessible only over a relatively slow link (or one with metered data).

Running restic check --read-data to check for “bit rot” in the repository is slow and expensive in such circumstances. For backends such as S3 or B2 this probably isn’t necessary, as the files are verified regularly as part of the service; but for unmanaged storage, e.g. an off-site machine or a low-cost VPS, it would be good to be able to verify all the data from time to time.

You can do this by running another instance of restic local to the repository. However, this requires you to trust the remote location with an encryption password, allowing an attacker with access to that machine to decrypt the entire repository. It would be good to avoid this if possible.

Since you don’t need the key to verify that the pack filenames match their SHA-256 hashes, it struck me that the Rest Server could usefully do this itself: restic check --read-data could then outsource the reading of all the pack data to the Rest Server, greatly reducing the traffic and increasing the speed. I’m not sure which HTTP verb would be appropriate, though.
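To illustrate the idea, here is a minimal sketch of what such a server-side check could look like. This is purely hypothetical: the Rest Server has no such endpoint today, the `/verify` route, the `POST` verb, and the repository path are all invented for illustration, and a real implementation would live in the Rest Server's own codebase.

```python
# Hypothetical sketch only: the /verify endpoint and POST verb are invented
# for illustration; rest-server does not currently offer this feature.
import hashlib
import json
from http.server import BaseHTTPRequestHandler
from pathlib import Path

REPO = Path("/srv/restic-repo")  # assumed repository location


def mismatched_packs(repo: Path) -> list[str]:
    """Hash every pack file under data/ and return names that don't match.

    Pack files in a restic repository are named after the SHA-256 digest of
    their contents, so no encryption key is needed for this check.
    """
    bad = []
    for p in sorted((repo / "data").rglob("*")):
        if not p.is_file():
            continue
        h = hashlib.sha256()
        with p.open("rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        if h.hexdigest() != p.name:
            bad.append(p.name)
    return bad


class VerifyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # POST chosen arbitrarily; the thread leaves the verb open.
        if self.path != "/verify":
            self.send_error(404)
            return
        body = json.dumps(mismatched_packs(REPO)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

The client would then only need to transfer the (small) list of mismatched pack names rather than the pack data itself.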


#2

Hm, interesting idea. But if you run the REST server on a machine anyway, why not just run sha256sum on the files there? Here’s a very simple program to do that concurrently: https://github.com/fd0/psha


#3

Thank you. Yes, I was planning to do that – it’s great that the file naming convention makes it so easy! It just struck me that this might be a relatively common use case, and hence a sensible feature suggestion.