Does restic keep a hash of an entire file?

I would be very curious to learn if restic keeps a hash of an entire file it is backing up? Or a collection of hashes, each hash representing a specific chunk of that file? Thank you.

Restic hashes pretty much everything, including the different chunks that each file being backed up is split into. See the reference here: References — restic 0.18.0 documentation

The question is quite specific: is there a hash of the entire file that can confirm it has been correctly restored? I might be missing something, but relying on hashes of file chunks seems less reliable as proof, since it increases complexity and therefore the risk of software bugs.

Someone else will have to answer that, then. The restore command explicitly verifies what you ask it to restore, but exactly how that is done I cannot say. It’s in the source code of the restore command.


If you back up a file that is larger than the chunk size, the content array in the metadata does not contain a SHA-256 hash of the entire file, but rather one hash per chunk. So, no, restic does not explicitly keep a hash of the entire file.

user@vm:/tmp/lol$ restic cat blob 3192cc4413266a3bd0d50c174eb6697df80d62e5f596f356d37c03a059fc5df4 -r .
enter password for repository: 
repository 48f1f127 opened (version 2, compression level auto)
[0:00] 100.00%  2 / 2 index files loaded
{"nodes":[{"name":"abcdefg.zip","type":"file","mode":436,"mtime":"2025-01-19T19:39:15.417723715+01:00","atime":"2025-01-19T19:39:15.417723715+01:00","ctime":"2025-01-19T19:39:15.420723738+01:00","uid":1000,"gid":1000,"user":"user","group":"user","inode":45138072,"device_id":64513,"size":25491843,"links":1,"content":["23a713d00706ac7566d5167f15e6adf28728fae794e1efc9b0d3f42c43076e33","9975c511875c1d949456f2d9185e54c2bc69756a4adfc87b53f248565229a31b","7b76fbdc43fa34a44d75ce1756b5c4860019c5bdf3bb884aca9cdb71e1334950","3ae22f837c61f159d7ea7f5ea8d489828952ff2e8699cfe3d45e06dc3d37f42b","95832e4cabb117ebb0ed01c469023c235004f8a8854e50d5af1b1fa594770f74","224603c27307af6b7a4d0d7cf093fe3dc8f8b4d282d07239e345637c96238c2c","c9ab2f51c9cc29d02dfa74783323c2c642b5235340dfff1e4f64960856df4ac0","9207cf58a93207fe74c58bd1ba5cc3bc9ca71d74cf3b6ddf266597c4804f3fe2","39e7ddcff21642ea8106251553cce56f032e577aec6b50600ac63d7e49115ad0","aa1a362d481362cd9fd62647d9e891b8d85efae809987ff6376403fe0906091f","e66abaa112dcf24125950a8fdb5d16607bd2d6b18764a87a0d8bbf8709c86f52","32ad0837084ed39a7e8e256f76362bfe7b2bd3e55b32c629e30768e8e11b0f18","3d7f83c5be24db0b5c340700d7a940ffddbaa8d2cd06960f2d80b6b38585f86c","5333439f8438d362a533c92eccd5f9f608b47ac9d96fa9b05e230f92a44376d5","a546e371dac985cfcee413900649d8dc59781983e680e86747459b1e386324d8","dd8ea4638a2ec10958dd65a8e88a305b8a4c95cc53f2690be2a386cdd165c8e6","c26df9e4daf12b3ab3e3e12d245fa64ff2a0b27e2398598a74c81ed3cdc17f2a","658aed5b3cb569f7e14e91191e55022d1598d5df80564ac43791c9c6f25c78f3","6db0cb45b6108e713c18ae4560cde5e5160d950bffd600a722656463d8523dd5","ba1f82786a2df46bb15b464baa97f8e3cbc0210ea61d5b8f1277bb35b51bf43c","9aa71ad8c163a3c75e87562711760508d2e31200202ff349a7f3f1c7193e2298","f7382af93571bf9f4e4ed32b0301f6eebca15ef847e44f0bfaed8ca3512bbccc"]}]}
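
To make the idea concrete, here is a minimal Python sketch of how a list of per-chunk hashes can still confirm a whole file. It is an illustration only, not restic's code: restic splits files with content-defined chunking, while this sketch uses a fixed, hypothetical chunk size, and the helper names `chunk_hashes` and `verify_and_reassemble` are made up for the example.

```python
import hashlib

def chunk_hashes(data: bytes, chunk_size: int) -> list[str]:
    """Return the SHA-256 hex digest of each fixed-size chunk of data."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]

def verify_and_reassemble(chunks: list[bytes], expected: list[str]) -> bytes:
    """Check every chunk against its recorded hash, then join them in order."""
    if len(chunks) != len(expected):
        raise ValueError("chunk count does not match the content list")
    for index, (chunk, digest) in enumerate(zip(chunks, expected)):
        if hashlib.sha256(chunk).hexdigest() != digest:
            raise ValueError(f"chunk {index} does not match its recorded hash")
    return b"".join(chunks)

data = bytes(range(256)) * 40        # 10240 bytes of sample data
size = 4096                          # hypothetical chunk size, not restic's
content = chunk_hashes(data, size)   # plays the role of the "content" array
chunks = [data[i:i + size] for i in range(0, len(data), size)]
restored = verify_and_reassemble(chunks, content)
```

If every chunk matches its hash and the chunks are joined in the recorded order, the result is byte-for-byte the original file, so a whole-file hash computed afterwards would match as well.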


@cryptonaut - thank you very much for illustrating the point clearly.

@mbrijun: It’s true that no single hash of each whole file is stored, but the hash of the tree is, and the tree contains the hashes of all the chunks of each file. So, from the point of view of your question, integrity is still covered: if the hash of any chunk of any file, or anything else in the directory, differs from the original, the tree hash will no longer match.

$ restic cat snapshot 39f69e5d
repository 1d111acb opened (version 2, compression level auto)
{
  "time": "2025-04-20T22:25:42.8617771+02:00",
  "parent": "4bc34a8b450fbc3e0cb54aad3635f81ade64aba5445c0d0b2e312cfd2e55638c",
  "tree": "cac7bb274e338b0e74a9a923df8d02aa32e419339c01b64bfa0ba54a108be0e6",
  "paths": [
    "/store/carl"
  ],
  "hostname": "stargate",
  "username": "root",
  "program_version": "restic 0.17.3",
  "summary": {
    "backup_start": "2025-04-20T22:25:42.8617771+02:00",
    "backup_end": "2025-04-20T22:25:48.855342005+02:00",
    "files_new": 0,
    "files_changed": 0,
    "files_unmodified": 17015,
    "dirs_new": 0,
    "dirs_changed": 757,
    "dirs_unmodified": 0,
    "data_blobs": 0,
    "tree_blobs": 3,
    "data_added": 48048,
    "data_added_packed": 11802,
    "total_files_processed": 17015,
    "total_bytes_processed": 14772919017
  }
}

$ restic cat tree 39f69e5d | sha256sum
cac7bb274e338b0e74a9a923df8d02aa32e419339c01b64bfa0ba54a108be0e6  -

The docs say:

Trees and Data

A snapshot references a tree by the SHA-256 hash of the JSON string representation of its contents. Trees and data are saved in pack files in a subdirectory of the directory data.
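
A rough Python sketch of why this covers every chunk: the tree's ID is a hash over JSON that itself contains the chunk hashes, so changing any one of them changes the tree ID. The JSON encoding used here (`json.dumps` with sorted keys) and the `tree_id` helper are assumptions for illustration and will not reproduce restic's actual IDs.

```python
import copy
import hashlib
import json

def tree_id(tree: dict) -> str:
    """SHA-256 of a JSON serialization of the tree (not restic's exact encoding)."""
    encoded = json.dumps(tree, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(encoded).hexdigest()

# Two made-up chunks standing in for the pieces of one file.
chunk_a = hashlib.sha256(b"first chunk").hexdigest()
chunk_b = hashlib.sha256(b"second chunk").hexdigest()

tree = {
    "nodes": [
        {"name": "example.bin", "type": "file", "content": [chunk_a, chunk_b]}
    ]
}
original_id = tree_id(tree)

# Simulate a single chunk changing: only its hash in the content list differs.
tampered = copy.deepcopy(tree)
tampered["nodes"][0]["content"][1] = hashlib.sha256(b"second chunk, altered").hexdigest()
tampered_id = tree_id(tampered)

print(original_id != tampered_id)
```

The same reasoning applies one level up: the snapshot stores the ID of the top-level tree, so a mismatch anywhere below it shows up as a different tree ID.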
