Strategies for one-off back-filling with large amounts of data

Yes, this is correct. Any operation that requires an exclusive lock on the repository will be blocked while the long backup is ongoing. You might find this list of operations mapped to lock types helpful: List of lock rules in documentation - #2 by cdhowie

Doing this would break the data ingest down into smaller chunks, but how useful that is depends on how long it takes your system to do the full ingest.
The biggest benefit I can think of would be using restic rewrite to give each new snapshot the correct “created time” (Working with repositories — restic 0.18.0 documentation). restic forget could then expire the older rsnapshot snapshots on your regular schedule. However, this would be a fair amount of manual work, so I’m not sure it’s worth the hassle.
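As a rough sketch of that workflow (the paths, snapshot ID, and timestamp below are hypothetical, and `--new-time` requires restic 0.17 or later):

```shell
# Back up one rsnapshot directory as its own restic snapshot
# (path is a placeholder for your rsnapshot interval directory)
restic backup /mnt/rsnapshot/daily.3/

# Rewrite that snapshot's timestamp to match when rsnapshot originally took it
# (abcdef12 is a placeholder snapshot ID; check the exact time format in the docs)
restic rewrite --new-time "2024-01-15 03:00:00" abcdef12

# Later, your normal retention policy can expire the backfilled snapshots on schedule
restic forget --keep-daily 7 --keep-weekly 4 --prune
```

You’d repeat the backup+rewrite pair once per rsnapshot interval directory, which is where the manual work comes in.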

I think this is possibly overcomplicating things. Personally, I’d simply skip forget/prune operations until the ingest from rsnapshot is finished. Anecdotally, I run forget+prune about once a month for my “on-site” repo, so it wouldn’t be a great hardship for me to suspend the scheduled forget job while a long-running backup was going. You also probably don’t have much to forget yet, as it sounds like you just started taking restic backups :slight_smile:
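If the scheduled forget job runs as a systemd timer, suspending it is a one-liner (the unit name here is hypothetical; with cron you’d comment out the entry instead):

```shell
# Pause the scheduled forget/prune while the long backfill runs
systemctl stop restic-forget.timer

# ... run the long ingest from rsnapshot ...

# Resume the normal schedule afterwards
systemctl start restic-forget.timer
```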
