Different file counts with stats and ls?

What might cause substantially different file counts between restic ls and restic stats?

In one of my snapshots, ls is reporting 559k files, while stats is reporting only 537k files:

# restic -r . ls 82de2fa1 | wc -l
559402
# restic -r . stats 82de2fa1 --mode restore-size
repository be51bcfb opened successfully, password is correct
scanning...
Stats for 82de2fa1 in restore-size mode:
Total File Count:   537170
Total Size:   13.968 GiB

restic ls includes directories in the output, while I suspect that restic stats only counts files. (I don’t know if restic stats counts other nodes like symlinks. That could be another possible reason the counts might differ.)

Try running restic ls but excluding directories (and tail -n +2 to exclude the initial summary line):

restic -r . ls -l 82de2fa1 | tail -n +2 | grep -v ^d | wc -l
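To see which node types make up the ls total, the listing can be tallied by the first character of the mode column rather than only filtering out d. This is a minimal sketch, assuming each line of restic ls -l begins with a Unix-style mode string (d for directories, - for regular files, other letters for other node types). count_by_type is a hypothetical helper name, not part of restic.

```shell
# Tally listing lines by the first character of the mode column.
# Reads lines on stdin and prints "<type-char> <count>", sorted by type.
count_by_type() {
  awk 'NF > 0 { c[substr($0, 1, 1)]++ }
       END    { for (k in c) print k, c[k] }' | LC_ALL=C sort
}

# Intended usage (needs a real repository, so it is left commented out):
#   restic -r . ls -l 82de2fa1 | tail -n +2 | count_by_type
```

If symlinks or other special nodes exist in the snapshot, they would show up as their own rows, which makes it easier to tell whether they explain any of the gap.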

Yeah, I considered that as well, but it doesn’t seem to be the case. I did some experiments on a very small repository (with 10 directories and 100 files), and I got exactly the same results from ls and stats: both files and directories were included in the results from each command, and their outputs matched.

I also tested symlinks and hard links. No change.

On this repository, here are the results for ls without directories.

# restic -r . ls -l 82de2fa1 | tail -n +2 | grep -v ^d | wc -l
477857
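Putting the three numbers from this thread side by side shows that the stats count sits between the two ls counts. The sketch below only does the arithmetic. It assumes the first plain ls count includes one summary line, as the tail -n +2 suggestion implies.

```shell
# Numbers reported earlier in this thread.
ls_lines=559402      # restic ls | wc -l (assumed to include one summary line)
non_dirs=477857      # restic ls -l without the summary line and without directories
stats_count=537170   # "Total File Count" from restic stats --mode restore-size

ls_entries=$((ls_lines - 1))              # 559401 entries listed by ls
dirs=$((ls_entries - non_dirs))           # 81544 directories listed by ls
extra=$((stats_count - non_dirs))         # 59313 nodes counted by stats beyond non-directories
gap=$((ls_entries - stats_count))         # 22231 entries that ls lists but stats does not count

echo "directories in ls:             $dirs"
echo "stats count above non-dirs:    $extra"
echo "entries missing from stats:    $gap"
```

So stats appears to count some directories, but roughly 22k entries fewer than ls.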

@David Sorry if this becomes me asking you to do some work, but I’m thinking “Can this be reproduced with a small repo?”. If one adds little data at a time, one should be able to spot when the two commands start producing different statistics.

I’m happy to try. Are you thinking this is a bug, and trying to pinpoint the root cause?

Just trying to isolate it, and I’m thinking 1) it should be easier with a small test repo, and 2) while making a small test repo one should be able to spot it happening.

OK, if we’re thinking this is likely a bug, I’ll try to isolate. Instead of working to build up a small repo, I think it might be easier to pinpoint by instrumenting the code to determine which files exactly are being included in stats and ls as it examines the big repo.
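If the instrumented code writes the paths each command visits to a text file, one path per line, the missing entries can be found with a plain set difference. This is only a sketch under that assumption: the file names ls-paths.txt and stats-paths.txt and the helper only_in_first are hypothetical, and restic itself does not produce these files.

```shell
# Print the lines that appear in the first file but not in the second.
# Both inputs are sorted into temporary files because comm needs sorted input.
only_in_first() {
  tmp_a=$(mktemp)
  tmp_b=$(mktemp)
  LC_ALL=C sort "$1" > "$tmp_a"
  LC_ALL=C sort "$2" > "$tmp_b"
  LC_ALL=C comm -23 "$tmp_a" "$tmp_b"
  rm -f "$tmp_a" "$tmp_b"
}

# Intended usage, with hypothetical dump files from the instrumented build:
#   only_in_first ls-paths.txt stats-paths.txt > missing-from-stats.txt
```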

I’ll give it a go and let you know what I come up with.

I’ve made enough progress to open Issue #2537. This is a bug.

There seem to be multiple problems that are causing stats to ignore some files that ls is including.

  • The first problem is that ls is counting empty directories, but stats is not.
  • The second problem is related to files that have hard links. I’m still looking into this.
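Both cases can be staged in a tiny test tree before backing it up. The sketch below is one possible setup, not the reproduction used for the issue. make_test_tree and count_empty_dirs are hypothetical helpers, the paths are placeholders, and find -empty is assumed to be available (it is in GNU and BSD find).

```shell
# Build a small tree with one regular file, one hard link to it,
# and one empty directory.
make_test_tree() {
  root=$1
  mkdir -p "$root/full" "$root/empty"
  echo "data" > "$root/full/file"
  ln -f "$root/full/file" "$root/full/hardlink"
}

# Count the empty directories under a path.
count_empty_dirs() {
  find "$1" -type d -empty | wc -l | tr -d ' '
}

# Intended usage against a throwaway repository (commented out; needs restic):
#   make_test_tree /tmp/tree
#   export RESTIC_PASSWORD=test
#   restic -r /tmp/repo init
#   restic -r /tmp/repo backup /tmp/tree
#   restic -r /tmp/repo ls latest | tail -n +2 | wc -l
#   restic -r /tmp/repo stats latest --mode restore-size
```

Comparing the two counts with and without the empty directory and the hard link should show which of the two contributes to the difference.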