Restic+rclone vs rclone+rclone crypt+rclone chunker

clio · January 24, 2024, 4:25pm

Hi all!
I have a NAS at home and want to set up an off-site cold-storage backup in the cloud.
The data I want to back up is fairly static. Files will rarely be edited, added, or removed.

Now I was wondering whether I should use restic+rclone, or rclone alone with its crypt and chunker features to encrypt and chunk the data (so that file sizes and folder structure do not leak).

If I go with restic+rclone, as I understand it, each time I run restic backup I create a new snapshot in the repo. How does that affect storage size? That is, if a file has small changes between snapshots, is the entire file stored again and again, or are only deltas stored?

Thanks!

tjh · January 24, 2024, 5:18pm

Do you want to be able to salvage a file you deleted 2 months ago, but only realised today is gone?
Then restic with its snapshots is great, whereas rclone is just going to mirror the current state all the time.

If the data being backed up hasn’t changed at all, the repo will grow only by a tiny amount when a new snapshot (backup) is taken, to record the metadata.

As to which to choose - only you can answer that.

shd2h · January 24, 2024, 11:10pm

Hey, welcome to the forum!

restic breaks each file into individual chunks of data, and only the new or changed chunks are added to the repository with each new snapshot.

So it doesn’t work quite like deltas or incremental backups; the closest “traditional backup type” to what restic does would be a “synthetic full” backup in my opinion.

If you’re interested in the technical details of what exactly restic is doing, there’s a detailed restic blog post that goes over content-defined chunking and how restic uses it for de-duplication between snapshots as well as between individual files.
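
To make the idea concrete, here is a toy sketch of content-defined chunking with de-duplication. It is not restic's actual algorithm or chunk sizes; the window size, boundary mask, minimum chunk size, and the names `chunks` and `backup` are all assumptions chosen for illustration. The point it tries to show is that a small edit in the middle of a file only causes a few new chunks to be stored.

```python
import hashlib
import random

# Toy parameters (assumptions for illustration, far smaller than a real tool would use).
WINDOW = 16           # bytes looked at when testing for a chunk boundary
MASK = (1 << 6) - 1   # a boundary is expected roughly every 64 bytes
MIN_CHUNK = 16        # never cut a chunk shorter than this


def chunks(data: bytes):
    """Split data at boundaries chosen from the content itself."""
    start = 0
    for i in range(len(data)):
        if i + 1 - start < MIN_CHUNK:
            continue
        window = data[i + 1 - WINDOW:i + 1]
        h = int.from_bytes(hashlib.blake2b(window, digest_size=4).digest(), "big")
        if h & MASK == 0:
            yield data[start:i + 1]
            start = i + 1
    if start < len(data):
        yield data[start:]


def backup(repo: dict, data: bytes) -> int:
    """Store chunks not already in repo; return the number of bytes newly stored."""
    added = 0
    for chunk in chunks(data):
        key = hashlib.sha256(chunk).hexdigest()
        if key not in repo:
            repo[key] = chunk
            added += len(chunk)
    return added


rng = random.Random(0)
original = bytes(rng.randrange(256) for _ in range(20000))
edited = original[:10000] + b"a small edit" + original[10000:]

repo = {}
first = backup(repo, original)   # first "snapshot": everything is new
second = backup(repo, edited)    # second "snapshot": only changed chunks are new
print("bytes stored by first backup: ", first)
print("bytes stored by second backup:", second)
```

Because the boundaries depend on the bytes rather than on fixed offsets, the chunks after the inserted text line up with the old ones again, so they are found in the repository and not stored a second time.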

clio · January 25, 2024, 4:47am

Thanks!

One of my concerns is that I have limited cloud storage. How much overhead does restic add?

kapitainsky · January 25, 2024, 7:22am

Thanks to the very effective and fast compression algorithm used by restic (zstd), your repo will usually be smaller than the original data.

Of course, in the end you have to decide what you want to use, but if you want a backup, use restic. If you want to mirror your local data to some cloud, use rclone.

shd2h · January 25, 2024, 6:49pm

Regarding clio’s question about how much overhead restic adds:

It is difficult to estimate the complete size of a single snapshot (or backup) in advance, because it depends on how compressible/deduplicateable (is that second one a word?) the data in it is.

For example, text docs compress and deduplicate against each other really well, videos/pictures not as well. In my experience though, I’ve never seen a dataset that when backed up by restic is larger than it is at the source.

While there is some metadata overhead, it is small in relation to the data and is more than offset by the space savings from compression and deduplication.

The only way to really know how much space you’ll need for your dataset is to do some testing yourself.
Fortunately you don’t have to commit to storing anything in the cloud to do this; restic backup has a --dry-run switch that will go through the backup process, but won’t actually write anything to the repository.
The restic manual includes details about this switch.
Once you know the size of your initial snapshot, you also need to estimate how much changed data you’ll be adding to the repository with each new snapshot, and for how long you’ll keep individual snapshots before expiring them. With these numbers, you should be able to roughly predict how much space you will require long-term.
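
As a rough sketch of that arithmetic (the function name and all numbers below are made up for illustration, and the model assumes every snapshot adds about the same amount of new data and that expired snapshots are pruned):

```python
def projected_repo_size(initial_gib: float,
                        change_per_snapshot_gib: float,
                        snapshots_kept: int) -> float:
    """Rough estimate: the first snapshot plus the new data added by each
    later snapshot that is still retained. Ignores metadata overhead."""
    later_snapshots = max(snapshots_kept - 1, 0)
    return initial_gib + change_per_snapshot_gib * later_snapshots


# Hypothetical example: 500 GiB initial snapshot, about 0.5 GiB of new data
# per weekly snapshot, and one year of weekly snapshots kept.
print(projected_repo_size(500.0, 0.5, 52), "GiB")
```

Treat the result as a ballpark figure only; the real numbers depend on how well your data compresses and deduplicates.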

kapitainsky hit the nail on the head though - restic is a backup solution. If you want versioned backups, it’s a good fit for that use case. If you want an off-site mirror of your local data, and don’t care about versioning, rclone is a better fit.