Mounting a large (60TB, 4M+ files) B2 repo immediately runs out of memory and is killed

Hello,

We have a 60TB+ Backblaze B2 repo with over 4 million files.
The repository is backed up nightly (so a new snapshot every day).

It’s still possible to restore individual files or specific directories using something like

restic restore latest --target /tmp/restore --include /data/one/two/three/file.ext

There are situations, however, where we need to restore files which have been deleted from the backup source disk, so they are not in the latest snapshot but should be in one of the past snapshots. There doesn’t seem to be a simple way to find the “latest snapshot where /data/one/two/three/file.ext exists” without using something like restic find and then sorting and filtering its output (and restic find already takes a very long time on this repository to begin with).
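One way to sketch that sort-and-filter step: join the output of `restic find --json <path>` (which reports the snapshot ids containing matches) with `restic snapshots --json` (which carries the snapshot timestamps) and take the newest. This is only a sketch, not restic tooling; the field names ("snapshot", "hits", "id", "time") are assumptions based on current restic JSON output, so check them against your restic release.

```python
import json

def latest_snapshot_with_path(find_json: str, snapshots_json: str):
    """Return the id of the newest snapshot in which the path was found.

    find_json:      output of `restic find --json <path>`
    snapshots_json: output of `restic snapshots --json`
    """
    # Snapshot ids that `restic find` reported as containing matches.
    hits = {e["snapshot"] for e in json.loads(find_json) if e.get("hits", 0) > 0}
    candidates = [s for s in json.loads(snapshots_json) if s["id"] in hits]
    if not candidates:
        return None
    # restic emits RFC 3339 timestamps; with a uniform UTC offset they
    # sort chronologically as plain strings.
    candidates.sort(key=lambda s: s["time"])
    return candidates[-1]["id"]
```

Feeding it the two JSON dumps (e.g. via subprocess) would give the snapshot id to pass to restic restore. It still pays the cost of one restic find run, so it mainly saves the manual sorting, not the scan itself.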

A recommendation I saw was to mount the repo and search through the file system (though I imagine restic find is already doing something effectively identical with the index).

However, when trying to mount the repo (on a 12-core machine with 60GB of RAM), all CPU cores shoot to 100% usage, RAM is maxed out, and the restic process is eventually killed, as seen in dmesg:

[1919588.321077] Out of memory: Killed process 4110640 (restic_0.15.1_l) total-vm:54253724kB, anon-rss:53339504kB, file-rss:0kB, shmem-rss:0kB, UID:1001 pgtables:104860kB oom_score_adj:0
[1919591.927752] oom_reaper: reaped process 4110640 (restic_0.15.1_l), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

Is our repository too big (either in size or in number of files) to safely mount, even on a (relatively) powerful machine? Is there a better pattern to use than what I’ve mentioned in this post?

Thanks

I don’t have experience with that repo size, but running restic with the GOGC=20 environment variable often helps with memory usage in other cases. Whether it helps enough in yours I can’t say, but why not try it.
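For reference, a minimal invocation (the repository URL and mount point are placeholders):

```shell
# GOGC=20 tells the Go runtime to start a garbage-collection cycle once the
# heap has grown 20% over the live data (the default is 100), trading extra
# CPU time spent in GC for a lower peak memory footprint.
GOGC=20 restic -r b2:bucketname:path mount /mnt/restic
```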


That does seem to allow mount to (just barely) finish without crashing, though it does (as expected) take quite a while.

I suppose I’ll have to benchmark this method against restic find and see which is the safest/quickest to use.

Thanks for the tip

How large is the index folder of the repository? I’m somewhat surprised by the memory usage, I’d expect restic to require about 20GB of RAM, not 50GB…
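Two possible ways to gauge that, sketched with placeholder bucket/remote names (the rclone remote is an assumption; any tool that can list the bucket works):

```shell
# Count the index files from restic itself...
restic -r b2:bucketname:path list index | wc -l

# ...or sum the size of the index/ folder directly in the bucket.
rclone size b2remote:bucketname/path/index
```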

That said, I’d strongly recommend keeping a repository within 100TB / 100 million files (whichever limit is reached first). restic so far isn’t optimized for anything larger than that.