Restic reports files as new after adding directories to include in a snapshot

Hi there,

Firstly, thanks for providing some great software!

I have a question regarding snapshots. I am running restic 0.9.0.

If I init a new repository and then do a backup with the command:

restic backup --verbose=9 --files-from files-to-include-in-backup.txt

all looks ok and restic reports that all files added to the repository are “new”.

The file “files-to-include-in-backup.txt” looks like this:

/bigdata

If I run the restic command again, it now runs much faster and restic reports that:

  • all files under /bigdata are “unchanged”.

If I now add another line to the file “files-to-include-in-backup.txt” so it looks like this:

/bigdata
/smalldata

and run the restic command again, then it reports that:

  • all the files under /smalldata are “new” (which I was expecting)
  • all files under /bigdata are “new” (which is not what I was expecting)

So my questions are:

  1. Is this expected behaviour? In particular, after adding the /smalldata line to the file and running the backup, I was expecting the files under /bigdata to be reported as “unchanged”.

  2. Does the repository now have two copies of my /bigdata directory?

Thanks,

John Isles

It does look like the expected behavior, although it might not be the most reasonable one.

Restic groups snapshots by the set of backed-up paths. Your first backup has the path set “/bigdata”, while your second backup has the path set “/bigdata + /smalldata”. Since these two sets differ, restic does not find an earlier snapshot to compare against, and so it reports every file in your second backup as new.

However, your data is not duplicated in the repository. Restic deduplicates across snapshots, regardless of which paths they cover. You should be able to see this from the Added: <size> value reported at the end of the backup, or by running restic diff on the two snapshots.
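
As a rough sketch of that check: restic diff takes two snapshot IDs and lists what changed between them. The helper name compare_snapshots and the IDs in the usage comment are placeholders made up for illustration; take the real IDs from the output of restic snapshots.

```shell
# Hypothetical helper: compare two snapshots by ID.
# $1 = ID of the older snapshot, $2 = ID of the newer snapshot.
compare_snapshots() {
    restic diff "$1" "$2"
}

# Usage (IDs are placeholders, not real snapshots):
#   restic snapshots
#   compare_snapshots 1a2b3c4d 5e6f7a8b
```

If the data under /bigdata was not stored a second time, the amount of data added between the two snapshots should be roughly the size of /smalldata rather than the size of both directories.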

Related: once this PR is merged, you’ll be able to find out precisely how much deduplication is happening.

You can pass --parent <snapshot-id> to work around this behavior: restic then uses the given snapshot as the parent when deciding which files have changed.
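
A minimal sketch of how that could look with the file list from the question. The helper name backup_with_parent and the snapshot ID in the usage comment are assumptions for illustration only; the ID should be that of the earlier /bigdata snapshot, as shown by restic snapshots.

```shell
# Hypothetical helper: run a backup with an explicitly chosen parent snapshot.
# $1 = ID of the snapshot to use as parent, $2 = file listing the paths to back up.
backup_with_parent() {
    restic backup --verbose --parent "$1" --files-from "$2"
}

# Usage (the ID is a placeholder):
#   restic snapshots
#   backup_with_parent 1a2b3c4d files-to-include-in-backup.txt
```

With the parent set this way, I would expect the files under /bigdata to be reported as “unchanged” and only those under /smalldata as “new”, though I have not verified this on 0.9.0.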