Rclone repository

Hi all, I’m new to restic and I’m trying to work out how to design a proper home backup solution.
I have around 3 GB of data that I would like to back up with restic to a Google Drive / rclone repository. I don’t know whether I can just create a single rclone-based repository and point backups directly at it, or whether it would be better to back up to a local repository and then replicate that to the remote rclone repository.
What would you suggest?

Thanks in advance.

1 Like

Less is better here. Avoid unnecessary steps.

IMO backing up directly to the rclone repository is the best approach.
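For a 3 GB dataset, going direct is straightforward. A minimal sketch, assuming an rclone remote named gdrive is already configured via rclone config (the repository path is just a placeholder):

```
# initialise a restic repository on the rclone remote, then back up to it directly
restic -r rclone:gdrive:restic-repo init
restic -r rclone:gdrive:restic-repo backup ~/data
```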

2 Likes

I use the two-step method; my data is about the same size as yours, so a local copy is not a “space” problem.

Here are the advantages, for me. YMMV : )

  1. I can take backups even when I don’t have good connectivity (or any at all – e.g., while traveling etc.)
  2. I get to do a check --read-data far more often than I would dare to with a non-local repo.
    • Ditto for restore tests (I use restic dump to produce a tar file and diff that against what’s on disk. Since tar is a tried-and-trusted format, I consider that to be sufficient to assure that I can actually restore data when needed!)
    • Ditto for forget/prune. rclone sync-ing the end result is definitely much faster than having restic try to prune+forget a remote repo
  3. rclone is pretty stable and I’m very confident there will be no transmission errors, so if restic check passes locally, the copy on the cloud is effectively guaranteed to be correct too
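A rough sketch of this two-step routine, with placeholder repo paths and remote names:

```
#!/bin/sh
set -e                                            # stop on any failure, so a bad state never gets synced
restic -r /srv/restic-local backup ~/data         # back up to the local repository
restic -r /srv/restic-local check --read-data     # verify every pack before touching the remote
rclone sync /srv/restic-local gdrive:restic-repo  # replicate the verified repo to the cloud
```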

Finally, one big reason I moved from borg was to be able to do this. Borg warns against this as a security issue, due to the way it is designed I suppose, but I was never very happy with that. I jumped as soon as restic had compression, just over a year ago : )

2 Likes

I’d usually agree with kapitainsky, but backing up to Google Drive means your backup is only as secure (from others) as your Google account, and only as available to you as your account is. Having a local copy would help if you lost the credentials to your Google account, or if Google lost your data or closed your account. Since it’s a relatively small amount of data, I presume it shouldn’t be a problem to have a local backup as well.

I’d run the local backup and remote backup as separate restic runs and have them as separate (although similar) repositories. Once you’ve automated one restic run, automating another isn’t too cumbersome :)
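For example (paths and remote names are just placeholders), the two runs can be as simple as:

```
# two independent repositories, each initialised separately with restic init
restic -r /mnt/external/restic-repo backup ~/data
restic -r rclone:gdrive:restic-repo backup ~/data
```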

1 Like

this is a different story :) And I agree. Two backups (not replicas) to two different locations is always better.

By “less is better” I only meant that, in order to create the gdrive repo, it is not a good idea to use intermediary steps (a local repo and then sync).

2 Likes

Can you be more specific about what the disadvantages are? (You may or may not have seen my reply in this thread; I do exactly that and it has some advantages for me.)

IOW, what am I missing?

The biggest problem is that if your “local” repo becomes corrupted you will sync this corruption to your online one.

Then it is simple math :)

Very simplified example: let’s say the chance of data corruption that prevents a restore is 1/10 for both the local and the remote repo.

When you run two independent backups, the chance that you can’t restore at all (both fail) is 1/10 × 1/10 = 1/100.

When you follow your setup, the chance is still more than 1/10, because the remote is only a replica of the local repo; that is roughly ten times more likely to end in a total-failure situation.

Then there are moments when only one repo is valid: during your rclone sync, the remote repo is in a messy state until rclone finishes :) If your local repo dies at that moment, you are left without any valid backup.

Of course, the real chances of events like this are small. Many people make it through life without any backup and never lose data. But when you decide to do something, I think it makes sense to do it well :)

2 Likes

Currently I do exactly that: a backup to an external drive and another to the cloud. Thanks to restic and this forum.

Thanks. I was worried there was something restic-specific that I was not even aware of, like borg’s security warnings when you put the same repo in two places (considering that was a big reason I moved to restic!)

My scripts always run check --read-data before rclone sync. And rclone is pretty fastidious about making sure things go across correctly.

Disk corruption between one check and the next (e.g., half a day to a day later) can happen, but it’s hard to imagine the repo getting corrupted between a successful check and the rclone sync that runs a few seconds later.

The difference in bandwidth and latency doesn’t matter much for backup, but for forget+prune it can matter quite a lot.
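For example (the retention values here are made up), keeping the expensive work local can look like:

```
# prune the local repository, then replicate the already-pruned result to the remote
restic -r /srv/restic-local forget --keep-daily 7 --keep-weekly 4 --keep-monthly 6 --prune
rclone sync /srv/restic-local gdrive:restic-repo
```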

2 Likes

I can see that you took all possible measures to minimise any risk.

For such things I am always a pessimist and trust in Murphy’s law :)

100% correct.

Overall I would not worry too much. Not everything can be perfect. You already do much more than most :)

1 Like

Me too!

I don’t want to offend, but software causing problems is as likely as (or more likely than) a hard disk error. My choice of running check --read-data before every rclone sync, and a restore test using dump | tar -d once in a while, is prompted by the same pessimism and fear of Murphy on the software side. A restic repository is, after all, an immensely complex data structure, unlike, say, a simple tar file.
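For anyone curious, that restore test is roughly the following (paths are placeholders, and the -C target may need adjusting depending on how paths appear in the dumped tar):

```
# dump a folder from the latest snapshot as tar and compare it against the live files
restic -r /srv/restic-local dump latest /home/me | tar -d -f - -C /home/me
```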

Given that I want to do those tests at regular intervals (the check --read-data runs at least once a day), the thought of doing that with the copy on the cloud is… not pleasant : (

I should also mention that while the local-disk backup is automated and the rclone-to-cloud step is semi-automated, there is also a fully manual backup to two external HDDs (manual because they are not permanently connected; oh, and they’re different brands too… just in case).

1 Like