Multiple runs for source vs source list

g33kphr33k · December 14, 2023, 9:12am

I’ve been using Restic for a few years now, both personally and professionally. I’ve hit an interesting one where I’m changing a mirrored NAS at two sites into an actual backup using Restic. The NAS server contains TBs of video. I’ve opted for SFTP and, although it is slow during the “Load index files” phase, the backup seems to be running fine.

Because there are multiple folders with TBs of video in each, I’ve so far opted to run Restic on one top-level folder at a time. Think of it as VIDEOS/A-C, then VIDEOS/D-F, etc. I am planning on using this in a script once it completes, but I’m still trundling through.
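A minimal sketch of what such a per-folder script could look like, in Python. The function names, the `dry_run` switch, and the repository URL in the usage note are hypothetical; only `restic -r <repo> backup <path>` is taken from restic itself. It builds one backup command per immediate subfolder and prints them unless told to run them.

```python
import subprocess
from pathlib import Path


def per_folder_backup_commands(root, repo):
    """Return one `restic backup` argument list per immediate subfolder of root."""
    folders = sorted(p for p in Path(root).iterdir() if p.is_dir())
    # Each subfolder (e.g. VIDEOS/A-C) becomes its own restic run and snapshot.
    return [["restic", "-r", repo, "backup", str(folder)] for folder in folders]


def run_all(commands, dry_run=True):
    """Print the commands, or run them one after another when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the loop at the first failing backup.
            subprocess.run(cmd, check=True)
```

Something like `run_all(per_folder_backup_commands("VIDEOS", "sftp:user@host:/srv/restic-repo"))` would list the planned runs first; the repository password still has to be supplied in the usual way, for example through the `RESTIC_PASSWORD_FILE` environment variable.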

Are there any benefits to doing multiple runs of Restic at the alphabetical breakdown level vs just letting it start at VIDEOS/ and doing it all? We’re talking 258TB of video in total, whereas the lower folders run anywhere from 5TB to 40TB each.

I’m on day 5 of doing this, so I really hope I don’t have to restart.

nicnab · December 14, 2023, 9:26am

I don’t see a technical advantage in either approach, only differences in organisation. If you have multiple snapshots, you’ll have to plan your retention accordingly (forgetting and pruning). And if you ever change your folder structure, you’ll have to remember to adapt your scripts.

That said, there are a bunch of threads about using restic with very large datasets that might be interesting for you.

MichaelEischer · December 29, 2023, 4:41pm

Make sure to keep an eye on the memory usage of restic. At 250TB of video you’ll likely need 30-50GB of RAM if all videos are stored in the same repository. It might be a good idea to use a separate repository for each top-level directory. That will somewhat complicate the repository management, but is much faster.
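As a rough illustration of the one-repository-per-directory layout, the sketch below maps each top-level folder to its own repository under a common SFTP base and builds the matching `restic init` and `restic backup` commands. The helper names and the base URL used in the usage note are assumptions for illustration, not part of restic.

```python
from pathlib import PurePosixPath


def repo_url(base, folder):
    """Derive a per-folder repository URL, e.g. <base>/A-C for VIDEOS/A-C."""
    name = PurePosixPath(folder).name
    return f"{base.rstrip('/')}/{name}"


def init_commands(base, folders):
    """One `restic init` per folder; only needed once, before the first backup."""
    return [["restic", "-r", repo_url(base, f), "init"] for f in folders]


def backup_commands(base, folders):
    """One `restic backup` per folder, each writing to that folder's own repository."""
    return [["restic", "-r", repo_url(base, f), "backup", f] for f in folders]
```

With a hypothetical base of `sftp:user@host:/srv/restic`, the folder `VIDEOS/A-C` would be backed up to `sftp:user@host:/srv/restic/A-C`. Forgetting and pruning would then have to be run against each repository separately, which is the extra management mentioned above.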