Single filesystem + local repo versus --no-cache

$ restic version
restic 0.17.0 compiled with go1.22.5 on linux/arm64

I must be missing something obvious here, but with the recent questions on this forum about the local cache, I wonder if it is preferable to use --no-cache when applicable if I’m working with a local repository that shares the filesystem with the cache folder.

Are there any benefits of using the local cache anyway in this situation?

Thank you.

You’re absolutely right. If the cache is on the same disk as your repo, it has no benefit. The whole idea of caching is to keep stuff close when you’re dealing with remote repos, to cope with latency etc…

Do use the --no-cache flag; you should actually expect things to go a little faster (with no unneeded copying between repo and cache on the same disk).


An old thread - I know. Using restic 0.18.0.

When using --no-cache for access to a local repo, it would be nice if there was a way to suppress

warning: running prune without a cache, this may be very slow!

How can restic be told to not output that warning? I found no option for this. --quiet does not help.

How could restic determine whether a cache helps or not? There may be fast and slow local storage.

restic could do some short statistics reading cache against reading repo. It could then recommend enabling or disabling the cache - but not with --quiet

Or it might just not use cache if it does not help with speed. If that works reliably, --no-cache would only be needed when running out of disk space.

Too many variables. The only way to determine it is by testing :)

Overall, I’m not sure what your suggestions would improve. In 99% of cases it is advisable to use the cache. The only situation where it won’t improve much, IMO, is a backup stored on SSD.

What is the advantage if cache and repo have the same speed?

Exactly this is what caches are for :) Now give me an example of when “cache and repo have the same speed” for restic. IMO it is only when both disks (system and repo data) are SSDs, which realistically is a very rare case. Even an HDD/HDD scenario would benefit, unless the cache HDD is busy with other things. And this is why implementing what you suggested might be trying to solve a problem that does not exist in real life.

restic has the --no-cache flag for people who, for whatever reason, do not want to use the cache, and I think that is enough development time spent on this subject. Anything else (like automated testing of cache vs. no-cache speeds) is interesting but totally unimportant compared to other areas where work is still in progress. This is my pennyworth on the subject :)

Let’s just assume both use the same technology - whatever: slow hdd or fast ssd. And both are not too busy. Why is reading the copy in cache faster than reading the original?

Above, zcalusic wrote for exactly this situation that there is no benefit. Assuming he is right, I want --no-cache for saving disk space. But I cannot get rid of that warning unless I filter restic output.

For this warning I agree with you. --no-cache is a rather advanced flag and there is really no need to babysit users here. Still, it is only a cosmetic thing.

How can restic be told to not output that warning? I found no option for this. --quiet does not help.

If quiet doesn’t do it, it may not be possible to suppress the warning. That said, it is a warning, not an error. Think of it as restic alerting you that “the options you’ve supplied may cause performance problems”, which you can then ignore as you are a power user and are aware of the consequences of passing --no-cache given your specific setup (cache and repository are on the same physical device).
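If the warning really cannot be switched off, filtering it out in a small wrapper is one workaround. Below is a minimal sketch in Python, assuming the warning is written to stderr and contains the text quoted in this thread (check this for your version). The name `run_filtered` and the repository path in the usage comment are made up for illustration.

```python
import subprocess
import sys

# Assumption: the warning contains this text, as quoted in this thread.
WARNING = "running prune without a cache"


def run_filtered(cmd, needle=WARNING):
    """Run cmd, let stdout pass through, and drop stderr lines containing needle.

    Returns (exit_code, kept_stderr_lines). stderr is buffered until the
    command finishes, so the remaining messages appear at the end.
    """
    proc = subprocess.run(cmd, stderr=subprocess.PIPE, text=True)
    kept = [line for line in proc.stderr.splitlines() if needle not in line]
    for line in kept:
        print(line, file=sys.stderr)
    return proc.returncode, kept


# Hypothetical usage (the repository path is a placeholder):
# code, _ = run_filtered(["restic", "-r", "/path/to/repo", "prune", "--no-cache"])
```

Everything else on stderr, including real errors, is still printed, and the exit code is returned so scripts can react to failures.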

How could restic determine whether a cache helps or not? There may be fast and slow local storage.

restic could do some short statistics reading cache against reading repo. It could then recommend enabling or disabling the cache - but not with --quiet

Or it might just not use cache if it does not help with speed. If that works reliably, --no-cache would only be needed when running out of disk space.

As you said, the only way restic could “know” whether the cache is useful would be by testing the speed of the cache. Restic would either need to do this before every command (which would introduce a delay, and a bunch of unnecessary reads/writes to the cache device); or run this once and store the results locally, while also reliably detecting when the cache device has changed so that the speed test can be re-run.

This sounds like a lot of additional code to handle one specific repository type: local, and both of these solutions have drawbacks.

  • I’d argue that re-testing the cache before each operation is completely unworkable: the performance trade-off would be so large as to outweigh the benefit.
  • Testing and recording the speed of the cache sounds more workable, but it adds code complexity, and again the only benefit is a slight reduction in disk space used with the local back-end.
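To make the idea concrete, here is a rough sketch of what such a read-speed comparison could look like in Python. It is not anything restic does. The function names, the 64 MiB sample limit and the 1.2× margin are all assumptions chosen for illustration. The numbers it produces are also skewed by the operating system's page cache, which is one more reason a reliable check is hard to build.

```python
import os
import time


def read_throughput(directory, limit_bytes=64 * 1024 * 1024):
    """Read up to limit_bytes from files under directory; return bytes per second."""
    total = 0
    start = time.perf_counter()
    for root, _dirs, files in os.walk(directory):
        for name in files:
            if total >= limit_bytes:
                break
            with open(os.path.join(root, name), "rb") as fh:
                while total < limit_bytes:
                    chunk = fh.read(1024 * 1024)
                    if not chunk:
                        break
                    total += len(chunk)
    elapsed = time.perf_counter() - start
    return total / elapsed if elapsed > 0 else 0.0


def cache_seems_useful(cache_dir, repo_dir, margin=1.2):
    """Hypothetical heuristic: the cache 'helps' only if it reads at least
    margin times faster than the repository directory."""
    return read_throughput(cache_dir) >= margin * read_throughput(repo_dir)
```

Even this toy version has to read tens of megabytes from both locations, which illustrates the per-command overhead described above.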

Why do I say “slightly”? I have a ~300 GB repository, stored on REST Server. The local restic cache directory for this repository is ~215 MB.
At least based on that sample size of one, if you’re storing both the cache and the repository on the same device, the cache directory seems like it is basically a rounding error.

Edit: after some forum searching, other topics suggest cache sizes may vary wildly depending on the workload, so YMMV when it comes to cache size. I would hazard a guess that the topics opened regarding large cache sizes are outliers though (because people don’t open topics to report that the cache is small and working as intended!).


As you’ve already mentioned, it can vary a lot. I’ve got a 40 GB repo and the corresponding local cache is 5.5 GB.

I would be careful about drawing any general conclusions based on one example.

From my observations, cache size depends strongly on the tree size of the given repository. To simplify: a backup of many small files will require a larger cache than a backup of the same total size made up of a few large files.


Indeed, it might be that I’m actually the outlier here in having such a small cache! Some of the cache sizes listed in the “help my cache is too large” topics I came across while searching were as much as half the dataset in size.

Such large caches are the outliers. The expected cache size is a few percent of the overall repository size. Anything above 10%-20% is likely already in the unusual range.
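To see where a given setup falls relative to that range, the ratio is easy to compute. This is a small Python sketch. The helper names are made up for illustration, and the sample figures are the two cases reported in this thread, read as decimal units.

```python
import os


def dir_size(path):
    """Total size in bytes of all regular files under path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def cache_percent(cache_bytes, repo_bytes):
    """Cache size as a percentage of repository size."""
    return 100.0 * cache_bytes / repo_bytes


# The two cases from this thread:
small = cache_percent(215e6, 300e9)  # ~215 MB cache, ~300 GB repo: about 0.07%
large = cache_percent(5.5e9, 40e9)   # 5.5 GB cache, 40 GB repo: 13.75%
```

For a local repository, `cache_percent(dir_size(cache_dir), dir_size(repo_dir))` gives the same figure from the directories themselves, where `cache_dir` and `repo_dir` are whatever paths apply on your system.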
