New to restic, basic question: should it rescan all files every time?

cjke · October 30, 2024, 8:28am

I’m currently setting up restic to back up a reasonably large set of files. Size-wise it’s not huge, a few TB, but lots of files - think photos, work files, etc.

Everything appears to be working, however, restic rescans every single file on every backup. Even when I run the backup immediately after the previous has finished.

Before I dig much deeper, is this expected?

I’ve looked through the forums, and the closest I found was this: Randomly Needs to Rescan All Data - #33 by cdhowie. But for that user it happens randomly, and for me it happens every time.

First run (fresh repo)

On the first run, I see the following output:

No parent snapshot found, will read all files.
Load index files
Start scan /path
Start backup /path
Scan finished 9000s

Then a very long process - approx 12 hours.

Approx 1,000,000 files at 700gb

Second run (directly after the first)

Using parent snapshot abc123
Load index files
Start scan /path
Start backup /path

Command ran:

restic -r /media/chris/drive001/backups --verbose backup /mnt/FF67-F77E/files

The source drive is an internal HDD, and the destination is an external drive, if that is important. It’s not a remote location like R2/S3. The machine is running Linux Mint.

I’m happy to dig in further, but trying to understand if this is normal or not.

cjke · October 30, 2024, 12:01pm

I did some more digging:

If I stat the path with stat -c '%d %i %n' /mnt/FF67-F77E/files the device and inode number match the values when cat’ing the nested blob:

# get snapshot details, specifically tree value
restic cat snapshot $snapshot_id

# repeat 3 times, plucking the subtree each time
restic cat blob $tree1

Once I reach the “files” subtree, I compare the contents of the cat to the contents of the stat for the same path, and the device and inode number are the same.
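
For anyone repeating this comparison, here is a minimal sketch of a helper that prints the stat values for one path, with labels that roughly follow the field names in a restic tree node. The function name node_meta is made up for illustration, and it assumes GNU stat (as on Linux Mint). As far as the restic documentation describes it, change detection compares mtime, ctime, size and inode, so those are the values worth lining up against the cat output.

```shell
# Print the stat fields for one path, labelled roughly like a restic tree node.
# %d = device number, %i = inode, %s = size in bytes,
# %Y = mtime and %Z = ctime, both as whole seconds since the epoch.
# node_meta is a hypothetical helper name, not part of restic.
node_meta() {
    stat -c 'device_id=%d inode=%i size=%s mtime=%Y ctime=%Z name=%n' -- "$1"
}

# Example call, using the path from the post above:
# node_meta /mnt/FF67-F77E/files
```

Note that the times are printed as epoch seconds here, while restic stores full timestamps, so this only gives a coarse comparison.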

Running with --ignore-ctime and/or --ignore-inode does not fix the issue.

I’m segmenting out a reduced test case to see if I can narrow down what is happening

tjh · October 30, 2024, 5:52pm

Yes, this is normal.
Restic checks to see if any files have changed.
The key thing to look for is the “using parent snapshot” line.

The second scan should be much, much faster, right?

cjke · October 30, 2024, 11:22pm

Yeah, that is the weird thing: the second run takes just as long, approximately 12 hours.

Someone on Reddit suggested looking at the modification dates, so that will be my next step. inode and device id are the same though

tjh · October 30, 2024, 11:27pm

So wait, your first post was (Trying) to say it didn’t seem to be working properly? That wasn’t at all clear.

If you’re asking for help (not just how does a thing work) then please provide the information requested when you first clicked “Create new post” that you then deleted.

cjke · October 30, 2024, 11:54pm

Mate, that reply comes across unnecessarily rude. I understand that the forum may get some low effort posts, but surely I’m doing my due-diligence here and trying to debug what I can as much as I can?

My original question was reasonably straightforward: “Should it rescan all files every time?”
I even stated that I was new to the app, hopefully highlighting that I may have terminology incorrect (such as scan v list). I’m not asking to be hand-held, quite the opposite, I am simply asking what is expected so I can go away and figure it out myself.

Regarding the “deleted” information, I thought I provided everything requested, but I admit I missed the version number. Is it standard on these forums to leave the original templated message?

This is what the template asks for:

The output of restic version.

Apologies, the version is 0.16.4

The complete commands that you ran (leading up to the problem or to reproduce the problem).

I provided all the commands I ran, including subsequent debugging commands

Any environment variables relevant to those commands (including their values, of course).

There were no relevant env variables, so they weren’t listed

The complete output of those commands (except any repeated output when obvious it’s not needed for debugging).

I provided output of these commands, excluding the file listing itself as I consider that repeated.

tjh · October 31, 2024, 12:55am

My apologies, I didn’t mean to come across rude but yes, re-reading my reply I agree with you.
So consider that I have slapped myself on the head for being rude and again, my apologies.
In my (pathetic) defence, I’d missed the second post where you put in a lot of details.

The version is pretty key because there’s often many bugs/things fixed with newer versions obviously, yet people often pop up running very old versions. I would try with 0.17.2 just to see if the behaviour changes, though I don’t think looking at the changelog that there’s anything in there that would affect rescan.

A few more questions so that someone with more clue than me can reply:
Is the External Drive just standard ext4? It’s not mounted with any strange mount flags, is it?
Is the restic cache still available, you haven’t deleted it?

When restic runs it has to rescan all the files to check for modifications or not, but I can’t fathom why that would take the same amount of time. Have you left it to finish the full scan and it took the same amount of time, or are you just concerned it’s doing a full-rescan and are stopping it?

For example on my box that’s got 15G of data on it:

-!- ~ » restic backup -v -r s3:<remote> --exclude-file=/etc/restic.exclude --exclude-caches --exclude-if-present .noresticbackup /var /usr/local /etc /home /boot /root
open repository
repository eac59ab7 opened (version 2, compression level max)
using parent snapshot a355f7df
load index files
[0:03] 100.00%  267 / 267 index files loaded
start scan on [/var /usr/local /etc /home /boot /root]
start backup on [/var /usr/local /etc /home /boot /root]
scan finished in 12.692s: 223392 files, 15.271 GiB. <---- FULL SCAN BUT IN 12s

Files:          55 new,    71 changed, 223266 unmodified
Dirs:            0 new,    67 changed,  2557 unmodified
Data Blobs:    186 new
Tree Blobs:     52 new
Added to the repository: 101.057 MiB (21.747 MiB stored)

processed 223392 files, 15.271 GiB in 0:38
snapshot c6184166 saved

So yeah, it’s normal for all files to be “scanned”, but it’s just checking whether each file has been modified or not, and it does 15G in a very short amount of time. I expect an external USB would be orders of magnitude slower than that.

cjke · October 31, 2024, 10:11am

Thanks Tim, I appreciate it. And all good, I probably came back too hard too - no hard feelings.

I haven’t touched the restic cache.
The external drive is just a standard ext4, no crazy mount flags
I am letting it complete, each round takes ~ 12 hours

Thank you for the extra example output.

Over the weekend, my plan of attack is:

Upgrade restic to the latest
Create a reduced test case (with far fewer files / reduced size) to more quickly iterate on testing
Try different disk permutations - for example, both source + target on internal disks, to try isolate the cause
Double and triple check inode, modification, and ctime values across snapshots
Double and triple check device labels across snapshots
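
The metadata checks in the plan above could be scripted along these lines. This is only a sketch: record_meta is a made-up helper name, the output file names are placeholders, and it assumes GNU find, stat and sort. It records device, inode, size, mtime and ctime for every file so that two recordings (for example before and after a reboot) can be compared with diff.

```shell
# Record device, inode, size, mtime and ctime for every regular file under a
# directory. Output is sorted by path so two recordings can be diffed.
# record_meta is a hypothetical helper name, not part of restic.
record_meta() {
    find "$1" -type f -exec stat -c '%n|%d|%i|%s|%Y|%Z' {} + | sort
}

# Usage (output file names are placeholders):
#   record_meta /mnt/FF67-F77E/files > before.txt
#   ... reboot, or run the next backup ...
#   record_meta /mnt/FF67-F77E/files > after.txt
#   diff before.txt after.txt && echo "no metadata changes"
```

If every line differs in the same column, that points at the device or inode numbers rather than at the files themselves.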

Thanks again Tim, you’ve given me heaps to dig into

If I get any further, or actually crack it, I will make sure to report back!

fede · October 31, 2024, 9:43pm

Hi everyone. I came here to read to learn something more or try to help and I leave with a smile.

fede · October 31, 2024, 9:47pm

Or you can do a full test case but using Dry Runs
https://restic.readthedocs.io/en/latest/040_backup.html#dry-runs

cjke · November 3, 2024, 1:53am

Thanks all. I’m getting closer, not a solution, but the scope of the issue is smaller.

And I think perhaps tjh was on to something with mounting options, as it seems to actually happen after a reboot. A smaller testing folder is easier to iterate on, and I was missing the reboot condition previously because each run was taking 12 hours.

I’m posting an update, not a solution, and I am still digging through what is happening.

In short, I have the following scenario, with the reduced testing folder:

Initial backup - let’s call it snapshot 1
- This backup takes 30s (expected)
Reboot
Rerun same backup - let’s call it snapshot 2
- This backup takes 30s (unexpected)
No reboot
Rerun same backup - let’s call it snapshot 3
- This backup takes 1s (expected, and awesome)

Looking at the 3 snapshots in more detail, as well as their immediate tree:

Snapshot 1

cat snapshot

repository 523f75c6 opened (version 2, compression level auto)
{
  "time": "2024-11-03T12:22:16.129025383+11:00",
  "parent": "28b1fcf83a7416f5dd9dfd56c7f9b0e47be4710f1232ca45324f5b69b64cf8a1",
  "tree": "b2b90e2c63626ac41c5ece049356193ab0ab54a8ca12a0e9d322c7d5aea5f790",
  "paths": [
    "/mnt/FF67-F77E/files"
  ],
  "hostname": "gearbox",
  "username": "chris",
  "uid": 1000,
  "gid": 1000,
  "program_version": "restic 0.16.4"
}

cat blob

{
  "nodes": [
    {
      "name": "mnt",
      "type": "dir",
      "mode": 2147484141,
      "mtime": "2024-10-27T15:54:07.429999922+11:00",
      "atime": "2024-10-27T15:54:07.429999922+11:00",
      "ctime": "2024-10-27T15:54:07.429999922+11:00",
      "uid": 0,
      "gid": 0,
      "user": "root",
      "group": "root",
      "inode": 3670017,
      "device_id": 2115,
      "content": null,
      "subtree": "ad4c930daaef0652aa01ec9f25c5134bc205cdb62b54fe0efeea216532c2970a"
    }
  ]
}

Snapshot 2

After reboot, unexpectedly reruns full backup

cat snapshot

{
  "time": "2024-11-03T12:23:57.702699622+11:00",
  "parent": "e4209073606a6040b789fd85f6a9f2a236cb243050a99fff4fee17f1aafd5f30",
  "tree": "690ef13c98b1aede49a0e39fc2881d210bfc99e9637e1dcc1425714e2c357ec5",
  "paths": [
    "/mnt/FF67-F77E/files"
  ],
  "hostname": "gearbox",
  "username": "chris",
  "uid": 1000,
  "gid": 1000,
  "program_version": "restic 0.16.4"
}

cat blob

{
  "nodes": [
    {
      "name": "mnt",
      "type": "dir",
      "mode": 2147484141,
      "mtime": "2024-10-27T15:54:07.429999922+11:00",
      "atime": "2024-10-27T15:54:07.429999922+11:00",
      "ctime": "2024-10-27T15:54:07.429999922+11:00",
      "uid": 0,
      "gid": 0,
      "user": "root",
      "group": "root",
      "inode": 3670017,
      "device_id": 2115,
      "content": null,
      "subtree": "a124c19768d84ead187a50f622e47b04d07c3346f105096ef9638976ad2bf5f1"
    }
  ]
}

Snapshot 3

cat snapshot

{
  "time": "2024-11-03T12:27:10.472468147+11:00",
  "parent": "77187405a9a575aa8dc054d85fff871c24ca8b9e257886fd676b887fdf9c5b18",
  "tree": "690ef13c98b1aede49a0e39fc2881d210bfc99e9637e1dcc1425714e2c357ec5",
  "paths": [
    "/mnt/FF67-F77E/files"
  ],
  "hostname": "gearbox",
  "username": "chris",
  "uid": 1000,
  "gid": 1000,
  "program_version": "restic 0.16.4"
}

cat blob

(same as snapshot 2 - it’s the same tree)

When the full, unexpected backup runs again in snapshot 2, the tree has changed (b2b90e2c63626ac41c5ece049356193ab0ab54a8ca12a0e9d322c7d5aea5f790 to 690ef13c98b1aede49a0e39fc2881d210bfc99e9637e1dcc1425714e2c357ec5).

Whereas in the expected backup, where it completes quickly, the tree does not change (it remains 690ef13c98b1aede49a0e39fc2881d210bfc99e9637e1dcc1425714e2c357ec5 between snapshots 2 and 3).

So something in the reboot is causing the root tree in the snapshot to change, and I’m assuming this has a trickle-down effect, triggering a change in every subtree. Comparing the two blobs, they do appear identical though (same device id, user, group, times, etc.), so I’m still figuring out what exactly is changing.
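
Eyeballing two JSON blobs is error-prone, so one option is to save each one to a file and let diff point out the differing lines. A minimal sketch, using cut-down copies of the “mnt” node from snapshots 1 and 2 above; changed_fields and the file names are made up for illustration, and in practice the files would be the saved output of the cat commands.

```shell
# Print only the lines that differ between two saved tree blobs.
# Lines starting with "<" come from the first file, ">" from the second.
# changed_fields is a hypothetical helper name, not part of restic.
changed_fields() {
    # diff exits with status 1 when the files differ; that is expected here.
    diff "$1" "$2" | grep '^[<>]' || true
}

workdir=$(mktemp -d)

# Cut-down "mnt" node from snapshot 1.
cat > "$workdir/tree1.json" <<'EOF'
{
  "name": "mnt",
  "inode": 3670017,
  "device_id": 2115,
  "subtree": "ad4c930daaef0652aa01ec9f25c5134bc205cdb62b54fe0efeea216532c2970a"
}
EOF

# Cut-down "mnt" node from snapshot 2.
cat > "$workdir/tree2.json" <<'EOF'
{
  "name": "mnt",
  "inode": 3670017,
  "device_id": 2115,
  "subtree": "a124c19768d84ead187a50f622e47b04d07c3346f105096ef9638976ad2bf5f1"
}
EOF

changed_fields "$workdir/tree1.json" "$workdir/tree2.json"
```

For these two nodes only the subtree lines are reported, which suggests the real difference sits further down the tree.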

kapitainsky · November 3, 2024, 6:51am

I think that device ID of the mounted disk changes every time you mount it hence restic has to rescan everything + probably saves all “new” metadata.

AFAIK restic does not allow ignoring it (yet), but you can try rustic, which has an ignore-devid option. I use it exactly for this reason (+ a few more that are irrelevant for this case).

tjh · November 3, 2024, 8:35pm

Nice detective work. Obviously the solution here is to never reboot

fede · November 3, 2024, 8:49pm

Are you using a temporary file cleaner like BleachBit?
I remember that I once had a similar experience with another backup program, Cobian Backup, because BleachBit erased the backup attribute of the files.

MichaelEischer · November 6, 2024, 9:16pm

The subtree in both blobs is different. This means that something in there has changed. By design, if a tree blob for a folder changes, then the tree blobs of every parent folder will change too.
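
The propagation described here can be illustrated with a toy model. restic identifies blobs by the SHA-256 hash of their content, but the “tree” strings below are invented for the example and are not restic’s real tree format; hash_of and the file names are hypothetical too. The point is only that a parent that embeds its child’s hash must change whenever the child does.

```shell
# Hash a string and print only the hex digest.
# hash_of is a hypothetical helper name for this toy model.
hash_of() {
    printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

# Toy model: each "tree" is just text that lists its child by hash.
leaf_v1=$(hash_of "files: a.jpg b.jpg")
leaf_v2=$(hash_of "files: a.jpg b.jpg c.jpg")   # one entry changed in the leaf

mid_v1=$(hash_of "FF67-F77E -> $leaf_v1")
mid_v2=$(hash_of "FF67-F77E -> $leaf_v2")

root_v1=$(hash_of "mnt -> $mid_v1")
root_v2=$(hash_of "mnt -> $mid_v2")

echo "root v1: $root_v1"
echo "root v2: $root_v2"
```

The two root hashes differ even though the text of the root entry (“mnt”) is the same in both versions, which matches the pattern in the snapshots above: identical-looking “mnt” nodes with different subtree IDs.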

Please take a look at the subtree using restic cat tree <snapshotId>:/mnt (tree makes it easier to navigate through the tree blobs of a snapshot).

MichaelEischer · November 6, 2024, 9:20pm

@kapitainsky As I’ve already said several times, the device ID is not and was never used by the change detection of the backup command. A changing device ID only results in lots of new tree blobs, but no rescan of all files.

cjke · November 10, 2024, 12:21pm

Thanks Michael, between your comment and the notes from tjh I was able to track down that the device id was changing on every reboot. As for exactly why this was happening, I am less sure and I couldn’t narrow it down, but I suspect it might have been the format.
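
A small sketch of how a changing device id could be spotted across reboots: store the value from the previous run in a state file and compare it on the next run. check_device_id and the state file location are made-up names, and it assumes GNU stat. As noted earlier in the thread, the device id is not part of restic’s change detection, so this only helps confirm or rule out that particular symptom.

```shell
# Compare the current device ID of a path with the one recorded last time.
# check_device_id is a hypothetical helper name, not part of restic.
check_device_id() {
    path=$1
    state=$2
    current=$(stat -c '%d' -- "$path")
    if [ -f "$state" ]; then
        previous=$(cat "$state")
        if [ "$current" = "$previous" ]; then
            echo "unchanged: $current"
        else
            echo "changed: $previous -> $current"
        fi
    else
        echo "first run: $current"
    fi
    # Remember the current value for the next run.
    printf '%s\n' "$current" > "$state"
}

# Usage (the state file path is a placeholder):
#   check_device_id /mnt/FF67-F77E/files "$HOME/.cache/devid.state"
```

Running it once before and once after a reboot shows whether the number is stable for that mount.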

I started from scratch, completely wiped both disks, formatted as ext and now it works perfectly. The second run is a few seconds at most.

Thanks for all the help. As a bonus, I feel like I have a much deeper understanding of how restic structures its data.

wie-was · November 13, 2024, 11:37am

I was following this thread since it appears that I have the exact same problem. On most (but not all) backups restic rescans almost all files and states in cat snapshot that almost all files changed.
I did not dig in deep enough that it felt okay to open a thread of my own.
Concrete question: I’m not sure I understand whether formatting the disks as ext, as you describe it, has something to do with the solution?
The only slightly weird fact in my own setup is that the source disk (external) is formatted as fat32. I might do a test setup myself to see if that could be the problem.

kapitainsky · November 13, 2024, 1:44pm

Maybe fat32’s modification time resolution is not sufficient for restic to decide whether files are identical or not. IMO such an ancient filesystem should not be used for anything. It is extremely primitive and error-prone.

wie-was · November 13, 2024, 2:27pm

Thanks for your input. Experimentation will tell, I guess. I specifically formatted the disk to fat32 because Volumio would not detect an ext4 disk, while fat32 works.