Do incremental off-site backups require downloading the previous backup(s)?

By default the cache is stored in $HOME/.cache/restic. This can be changed with the --cache-dir flag, or the cache can be disabled entirely with --no-cache, which is probably preferable when the repository is on a locally-attached disk; in that scenario, the cache is just a waste of space (and of expensive write IOPS).

Yes, with one correction – files are split into chunks and each chunk is stored as a separate “data blob” in the repository. If you replace “files” with “blobs” then you are spot on.
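To make the file-to-blob relationship concrete, here is a minimal sketch of content-addressed chunk storage. It uses fixed-size chunks for brevity; restic actually uses content-defined chunking (a rolling hash) and SHA-256 blob ids, and the `repo`/`backup` names here are illustrative, not restic internals.

```python
import hashlib

# Hypothetical in-memory "repository": blob id (SHA-256 of content) -> chunk bytes.
repo = {}

def backup(data: bytes, chunk_size: int = 4) -> list:
    """Split data into chunks, store each chunk as a blob, return the blob ids."""
    ids = []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        blob_id = hashlib.sha256(chunk).hexdigest()
        repo.setdefault(blob_id, chunk)   # an identical chunk is stored only once
        ids.append(blob_id)
    return ids

first = backup(b"aaaabbbbcccc")
second = backup(b"aaaabbbbdddd")          # shares its first two chunks with `first`
print(len(repo))                          # 4 unique blobs stored, not 6
```

Because blobs are keyed by the hash of their content, a second file (or a later snapshot of the same file) that contains an already-known chunk adds nothing new to the repository.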

Yes. The basic prune operation is just mark-and-sweep garbage collection, but since packs are immutable, any packs containing garbage objects have to be rewritten as new packs without the garbage. Prune is basically:

  • Crawl all snapshots, marking used objects.
  • Create a set of objects to repack, which are all used objects that share a pack with an unused object.
  • Create new packs out of this set of objects and upload them to the repository.
  • Rebuild the repository index.
  • Delete the packs that were rewritten.
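The steps above can be sketched as a toy mark-and-sweep pass. Packs are modeled as immutable lists of (blob id, data) pairs and snapshots as sets of blob ids; the pack names and the `prune` function are illustrative, not restic's actual internals.

```python
def prune(packs: dict, snapshots: list) -> dict:
    """Return a new pack dict with garbage removed, mirroring the ordering above."""
    # 1. Crawl all snapshots, marking used blobs.
    used = set().union(*snapshots) if snapshots else set()

    # 2. Collect blobs to repack: used blobs that share a pack with an unused blob.
    keep, to_repack = {}, []
    for name, blobs in packs.items():
        ids = {bid for bid, _ in blobs}
        if ids <= used:
            keep[name] = blobs                  # fully used pack: keep as-is
        else:
            to_repack += [(bid, d) for bid, d in blobs if bid in used]

    # 3. Upload the repacked blobs as a new pack.
    if to_repack:
        keep["pack-new"] = to_repack

    # 4. Rebuild the index (implicit here: the returned dict *is* the index).
    # 5. Only now do the rewritten packs disappear; `keep` never contained them,
    #    so an interruption anywhere above leaves every old pack in place.
    return keep

packs = {"p1": [("a", b"A"), ("b", b"B")], "p2": [("c", b"C")]}
after = prune(packs, snapshots=[{"a"}, {"c"}])
# "p2" survives untouched; "p1" held garbage blob "b", so its used blob "a"
# was rewritten into "pack-new" before "p1" was dropped.
```

The key property is that the old packs are only removed in the final step, after their live contents already exist elsewhere, which is what makes an interrupted prune safe.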

Note in particular that nothing is deleted until after everything else is done, to ensure that the repository is always in a consistent state if the operation is interrupted for whatever reason (SIGINT, restic crash, power cut, etc.). Restic allows duplicate objects; any duplicates left over from an interrupted prune will just be considered garbage by a future prune invocation and will be removed then. (Indeed, restic must allow duplicate objects since concurrent backups are permitted, and there is no coordination between backup processes. It’s possible and likely that running multiple backup operations in parallel will introduce duplicate objects.)
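Why duplicates are harmless can be seen in a small sketch: if an index maps each blob id to every pack holding a copy, a restore can use any copy, so two uncoordinated backups writing the same blob never make the repository inconsistent. The pack names and index layout here are illustrative, not restic's actual format.

```python
import hashlib

# Two hypothetical backup processes, running with no coordination, each wrote
# the same new chunk into their own pack file.
chunk = b"shared data"
blob_id = hashlib.sha256(chunk).hexdigest()
packs = {
    "pack-from-backup-1": [(blob_id, chunk)],
    "pack-from-backup-2": [(blob_id, chunk)],   # duplicate copy of the same blob
}

# Index: blob id -> every pack that holds a copy.
index = {}
for pack, blobs in packs.items():
    for bid, _ in blobs:
        index.setdefault(bid, []).append(pack)

print(index[blob_id])  # both packs listed; a later prune may drop one copy
```

Any single entry in the list is enough to restore the blob, which is exactly why a future prune is free to treat the extra copies as garbage.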