How does Restic manage deleted files?

Hello there,

new user of Restic here. I just set up my new backup system with Restic. I am very satisfied with its possibilities, flexibility and ease of use. Most of all, I am impressed by its speed.

Using the documentation I already learned a lot and could answer all my questions that came up during my implementation phase.

There is only one topic left: How does Restic handle files deleted on purpose? I have used other backup systems that restored every file that had ever existed, even files that had been deliberately deleted because they were no longer wanted. After a restore, one had to manually delete all the unwanted files again.

I found out that Restic handles this much better by a simple experiment:

  • create a test file
  • run a first backup
  • delete the test file
  • run a second backup
  • check the two snapshots for the test file

As I expected, the test file was present in the first snapshot and missing in the second one.

Okay, so I have answered my own question. But… I wonder how Restic manages this!
Sure, if a new file gets created, its blocks are new and can be added into the deduplication engine. But if a file disappears - how does Restic notice that a file formerly present is gone?

Welcome here @nestolea !

Well… The short answer is that Restic does not notice this at all. Not when backing up anyway, to be more specific.

References — restic 0.18.1-dev documentation says:

A snapshot represents a directory with all files and sub-directories at a given point in time. For each backup that is made, a new snapshot is created.

So each snapshot only contains information about stuff present when a specific backup was made. There is NO information in the snapshot about deleted stuff.

(If the snapshot contained information about deletions, that would basically break the concept of “snapshot”. Reason: Information like “a file has been deleted” must by definition refer to some specific earlier snapshot in which the file was present. That would mean that the “snapshot” would in fact NOT be a snapshot, but a kind of incremental log instead.)
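To make the idea concrete, here is a minimal toy sketch in Python. It is not restic's actual data format, and the names `take_snapshot` and `deleted_between` are made up for illustration. Each snapshot is a complete listing of what existed at backup time, and a "deletion" only becomes visible when two snapshots are compared.

```python
# Toy model only: a "snapshot" here is a plain dict of path -> content.
# Hypothetical helper names; restic's real repository format is different.

def take_snapshot(files):
    """Record every file present right now; nothing about earlier states."""
    return dict(files)

def deleted_between(old, new):
    """Deletions are not stored anywhere; they fall out of comparing listings."""
    return sorted(set(old) - set(new))

files = {"notes.txt": b"hello", "test.txt": b"temp"}
snap1 = take_snapshot(files)   # first backup

del files["test.txt"]          # delete the test file on purpose
snap2 = take_snapshot(files)   # second backup

print(deleted_between(snap1, snap2))
```

The second snapshot simply lacks `test.txt`; it carries no marker saying the file was removed, which matches the experiment described earlier in the thread.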

So, does this mean that a Restic repository is going to grow indefinitely? Yes, unless you tell it to forget old snapshots and prune the data that is no longer referenced. Read more about that here: Removing backup snapshots — restic 0.18.1-dev documentation
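As a rough illustration of why both steps are needed, here is a toy Python sketch. It assumes a strongly simplified model in which each file is stored as one content-addressed blob (restic really splits files into smaller chunks), and the function names `backup`, `forget` and `prune` are hypothetical stand-ins, not restic's implementation.

```python
import hashlib

# Simplified model: blob store keyed by content hash, plus named snapshots
# that map each path to a blob ID. All names are illustrative assumptions.
blobs = {}
snapshots = {}

def backup(name, files):
    listing = {}
    for path, data in files.items():
        blob_id = hashlib.sha256(data).hexdigest()
        blobs.setdefault(blob_id, data)   # identical content is stored once
        listing[path] = blob_id
    snapshots[name] = listing

def forget(name):
    """Drop the snapshot record only; its data stays in the blob store."""
    del snapshots[name]

def prune():
    """Remove blobs that no remaining snapshot references."""
    referenced = {b for listing in snapshots.values() for b in listing.values()}
    for blob_id in list(blobs):
        if blob_id not in referenced:
            del blobs[blob_id]

backup("first", {"notes.txt": b"hello", "test.txt": b"temp"})
backup("second", {"notes.txt": b"hello"})
blobs_before = len(blobs)

forget("first")
blobs_after_forget = len(blobs)   # unchanged: forgetting alone frees nothing
prune()
blobs_after_prune = len(blobs)
```

In this model the data of the deleted test file stays in the repository as long as any snapshot still refers to it, and disappears only once that snapshot has been forgotten and a prune has run.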

Hope this gave you a better understanding of how Restic works. Don’t hesitate to ask for more clarification if you need it! :slight_smile:


Hello @martinleben ,
thanks for your precise answer, right to the point!

I somehow expected an answer like that, because otherwise the logic about snapshots would not make sense. Exactly like you explained.

What I found hard to believe was that a check through all files could be that fast. A Restic backup of my files takes only 10 seconds for 111,000 files (76 GiB). By contrast, a simple “find -mtime 1” takes 55 seconds just to search through my files for new ones.

This speed of Restic is unbelievable! I just could not imagine that it scans through all my files and compares what is there to the ones already backed up (encrypted and deduplicated!) in such a short time. Still hard to comprehend for me… :astonished_face:


No, that is not what restic does. It would be wonderful if it did, but no, it is too good to be true. :slight_smile: It compares metadata in the cache with metadata in the filesystem.

Backing up — restic 0.18.1-dev documentation says:

uses a change detection rule based on file metadata to determine whether a file is likely unchanged since a previous backup. If it is, the file is not scanned again.

So if the metadata is identical to what it was when the backup was last run, restic does NOT scan the files. If you have files that are modified without their metadata being updated, you need to use:

--force: turn off change detection and rescan all files.
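The rule quoted above can be sketched in a few lines of Python. This is only an illustration of metadata-based change detection, not restic's code: the choice of fields (modification time, size, inode) is an assumption for the example, and `fingerprint` and `files_to_rescan` are made-up names. The `force` parameter mimics the effect described for `--force`.

```python
import os
import tempfile

def fingerprint(path):
    """Cheap metadata fingerprint; file contents are never read."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size, st.st_ino)

def files_to_rescan(paths, previous, force=False):
    """Return files whose metadata differs from the previous run (or all, if forced)."""
    return [p for p in paths if force or previous.get(p) != fingerprint(p)]

with tempfile.TemporaryDirectory() as d:
    a = os.path.join(d, "a.txt")
    b = os.path.join(d, "b.txt")
    for path in (a, b):
        with open(path, "w") as f:
            f.write("data")

    # Metadata remembered from the "previous backup"
    previous = {p: fingerprint(p) for p in (a, b)}
    unchanged = files_to_rescan([a, b], previous)

    with open(b, "a") as f:       # appending changes the size of b.txt
        f.write(" more")
    changed = files_to_rescan([a, b], previous)
    forced = files_to_rescan([a, b], previous, force=True)
```

Only `b.txt` ends up in `changed`, because only its metadata differs from the remembered values. A file whose contents were altered while all of these fields stayed the same would be skipped, which is the case the forced rescan exists for.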


Got it. So it scans the metadata of the filesystem only.
But the speed is still impressive compared to my find example, which does the same.
