While debugging slow backups I noticed that restic calls lstat on files excluded by extension.
I have about a million protobuf (.pb) files which I exclude using *.pb. Even so, restic still calls lstat on every one of those .pb files, which slows down my backups tremendously. Is this the intended behaviour?
Here is an example where it scanned about a million files in 54 minutes (the scan time varies a bit with the load):
scan finished in 3243.814s: 1055943 files, 139.734 GiB
I’ve since added a bunch more entries to my exclude file, which cuts the scan time down quite a bit. I think the main culprit is the large number of protobuf files, which I excluded using *.pb. With that wildcard exclude in place, scan times are around 30 minutes (including the other excludes I’ve added):
scan finished in 1789.684s: 756149 files, 93.377 GiB
If I instead exclude the path that contains the .pb-files I get a scan time of around 5 minutes:
scan finished in 286.896s: 757815 files, 93.381 GiB
So avoiding filename wildcard excludes (and extending my exclude file in general) seems to have fixed the issue for now.
This is an architectural limitation restic currently has: some of the exclude functions (e.g. --one-file-system) need to have the information lstat returns, and so we’re currently running lstat on the files before checking the excludes. The solution would be to have two types of excludes:
The ones which can decide by name/path if a file is to be excluded
The ones which need the lstat information (e.g. --one-file-system)
Until this is implemented, restic runs lstat() on all files, including the ones that are excluded afterwards. Sorry about that. It would help if you added a bug report to the GitHub issue tracker, so we can track improving the exclude function.