Copy repo with rsync: should I lock before? Use specific order?

Reading this great forum, I notice several people create their second backup (3-2-1 rule) with rsync.

I would like to do that as well but wonder:

  1. Should I stop all restic write operations (backup and prune) while rsync is running?

  2. Should I run rsync in this order:
    2a) config file
    2b) keys dir
    2c) snapshots dir
    2d) index dir
    2e) data dir
    2f) ignore locks dir

(If I understood the docs correctly.) Or does it not matter, and I can run rsync “blindly” on the whole repo?

Thanks!

I guess it’s hard to say it never matters but I don’t see a scenario where the rsynced repo is more broken than the original. If you add data while syncing, you might end up with unreferenced chunks. Same with pruning during sync. The only thing I have no idea about: what if things like index, snapshots etc. are inconsistent?

I usually have a central script that will ssh into the clients, back them up one after the other, do the pruning and then rsync the result away when everything else is done. Additionally, you can make sure the synced repo is good by checking it, ideally with some form of --read-data. That’s always a good idea anyway.
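
For illustration, that sequence could be sketched in Python roughly like this. The host names, paths and the `build_pipeline` helper are made up, and I am assuming the clients reach the repo through one location (`client_repo`) while the central host sees it as a local path (`local_repo`). Password handling is left out. It only relies on `restic backup`, `restic prune`, `rsync -a --delete` and `restic check --read-data`.

```python
import subprocess


def build_pipeline(clients, client_repo, local_repo, mirror, backup_path="/home"):
    """Return the commands in the order they should run (nothing is executed here).

    All arguments are hypothetical placeholders for this sketch.
    """
    steps = []
    # 1. Back up the clients one after the other over ssh.
    for host in clients:
        steps.append(["ssh", host, "restic", "-r", client_repo, "backup", backup_path])
    # 2. Prune only once every backup has finished.
    steps.append(["restic", "-r", local_repo, "prune"])
    # 3. Mirror the now-idle repo; --delete removes pruned files from the mirror too.
    steps.append(["rsync", "-a", "--delete",
                  local_repo.rstrip("/") + "/", mirror.rstrip("/") + "/"])
    # 4. Verify the copy, including the pack contents.
    steps.append(["restic", "-r", mirror, "check", "--read-data"])
    return steps


def run_pipeline(steps):
    """Run the steps in order and stop at the first failure."""
    for cmd in steps:
        subprocess.run(cmd, check=True)
```

Because rsync only starts after the backups and the prune have finished, nothing writes to the repo while it is being copied.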

Out of curiosity, why choose rsync? I personally implement the 3-2-1 strategy using restic copy. It’s very efficient: I perform daily local backups and run a restic copy sync every three days, including cleanup and verification steps.
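
A rough sketch of what such a copy run could look like, written in Python for illustration. It assumes restic 0.14 or later, where `copy` takes the source via `--from-repo`. The `copy_cleanup_check` helper, the paths and the `--keep-daily 7` retention policy are invented for the example, not my actual setup, and password options are omitted.

```python
import subprocess


def copy_cleanup_check(source_repo, dest_repo, keep_daily=7):
    """Return the commands for one sync run (hypothetical helper, nothing is executed)."""
    return [
        # Copy snapshots that the destination does not have yet.
        ["restic", "-r", dest_repo, "copy", "--from-repo", source_repo],
        # Cleanup: the retention policy here is an assumption for the example.
        ["restic", "-r", dest_repo, "forget", "--keep-daily", str(keep_daily), "--prune"],
        # Verification of the destination repo.
        ["restic", "-r", dest_repo, "check"],
    ]


def run(commands):
    """Run the commands in order and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```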

Okay I admit it: I’m deeply in love with rsync. Only half kidding… I have to admit I haven’t tried restic copy so far and generally use rsync a lot so it’s a natural choice for me. One advantage might be that, using rsync, you can choose whether you want to push or pull the files and it’s fairly easy to handle a 1:1 sync. Also, I use bandwidth throttling when going over a DSL line (Germany here… slow lines everywhere) which is very easy with rsync as well.
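
To make the push/pull and throttling points concrete, here is a small Python sketch that builds the rsync call. The `throttled_rsync` helper, the host name and the paths are hypothetical. `--bwlimit` is a real rsync option, and without a suffix its value is read as KiB per second.

```python
import subprocess


def throttled_rsync(src, dest, kib_per_s):
    """Build an rsync command limited to `kib_per_s` KiB/s (hypothetical helper)."""
    # The trailing slash on the source copies the directory's contents, giving a 1:1 sync.
    return ["rsync", "-a", f"--bwlimit={kib_per_s}", src.rstrip("/") + "/", dest]


# Push: run on the machine that holds the repo.
push = throttled_rsync("/srv/restic", "offsite.example:/backup/restic", 500)

# Pull: run on the machine that receives the copy.
pull = throttled_rsync("central.example:/srv/restic", "/backup/restic", 500)

# To actually transfer: subprocess.run(push, check=True)
```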

You will like this thread, especially this answer from cdhowie :)

Ah yes, I didn’t find this thread in my research before my question, but this answers it perfectly!

Thanks for pointing it out to me. It’s actually the write order that matters, not the read order. Since a copy involves both reading and writing, that is hard to work out from the documentation alone.

The documentation explains the order of operations for reads and writes very well. It might be useful to add a short paragraph on the order of operations for a copy.

Very interesting, thanks for pointing this out! Just to be sure I understand correctly: this holds true if you want to continue to add snapshots to or prune the cloned repo. If you “only” clone the repo so you have something to restore from, I don’t see how these problems would affect such a cloned repo. Correct?

I think the problem is this: if you rsync the whole repo without enforcing this specific order and the rsync does not complete, you may end up in a corrupted state. For example, you could have a snapshot or index that refers to nonexistent data. Restic will not like that, and running an operation on such a repo might corrupt it further.

Whereas if you respect the prescribed write order, you are guaranteed to have a repo in a viable state at all times.

Said another way, it’s OK to have data that no snapshot or index references, but not the other way round.

I’m completely new to restic though, so my understanding should be taken with a large pinch of salt.
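
As a sketch of what I mean by respecting the order, here is one rsync pass per part of the repo, with the data before anything that refers to it. The order below is only my reading of the reasoning above (config and keys, then data, then index, then snapshots), not something I have confirmed in the restic docs. The `ordered_rsync_commands` helper and the paths are made up, and I assume nothing writes to the source repo while the passes run.

```python
import subprocess

# Assumed order: pack data is copied before the index and snapshot files that refer to it.
# The locks directory is skipped on purpose.
COPY_ORDER = ["config", "keys/", "data/", "index/", "snapshots/"]


def ordered_rsync_commands(src, dest):
    """Return one rsync command per repo part, in COPY_ORDER (nothing is executed)."""
    src = src.rstrip("/")
    dest = dest.rstrip("/")
    return [["rsync", "-a", f"{src}/{item}", f"{dest}/{item}"] for item in COPY_ORDER]


def run_in_order(src, dest):
    """Run the passes one by one; the destination directory must already exist."""
    for cmd in ordered_rsync_commands(src, dest):
        # check=True stops at the first failure, so later parts are never copied
        # ahead of the data they reference.
        subprocess.run(cmd, check=True)
```

If a pass is interrupted, the copy should at worst contain data that nothing references yet, and running the passes again picks up where it stopped.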