Rsync.net - how do you run repo maintenance?

MorgothSauron · June 18, 2020, 8:37pm

I’ve been using AWS as one of the destinations for my restic backups. When I started using borg in parallel, I used rsync.net for it instead.

I like the simplicity of rsync.net and I’d like to move my restic repository there. However, unlike borg, the servers don’t have the restic binary.

This raises the question: how can I check the repository efficiently? Without restic on the server I cannot run tasks like ‘check --read-data’ directly on their server to save bandwidth.

How do you handle this situation? Do you run check on a subset of the data?

alexweiss · June 19, 2020, 7:27pm

Maybe they are willing to make sha256sum available as an SSH command? Then checking the sha256sum against the file name for each file would be a good alternative to check --read-data (of course combined with a local run of check without --read-data).

MorgothSauron · June 20, 2020, 9:45am

Well, I cannot really run check --read-data locally. I do have local repositories for the same backup, but obviously they are different repositories. The local one could be fine while the remote one isn’t. I do not sync repositories: I run a separate backup for each destination I have (sftp, smb, ssd and s3).

I did try to contact the support to make restic available on their servers for the purpose of running a check --read-data locally on the servers, but I never got any feedback. I can try to ask for sha256sum.

alexweiss · June 20, 2020, 10:10am

If you can check the sha256 checksum of each file (config excluded, of course) in your repository, IMHO a local run of check without --read-data suffices. This check run should be pretty fast and does not need to download much data from the repository.

I would even go further and claim that a local run of check --with-cache would be enough if you already checked the checksums of the files in /data/, /index/ and /snapshots/.
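
The file name comparison described above could be sketched like this. It assumes you can somehow get `sha256sum`-style output (`<hash>  <path>` per line) for the repository files, for example by running `sha256sum` over SSH if the provider allows it; that availability is exactly the open question in this thread. The function name `find_mismatches` is made up for illustration.

```python
import posixpath


def find_mismatches(sha256sum_output: str) -> list[str]:
    """Return the paths whose file name differs from the reported SHA-256.

    Expects lines in the usual sha256sum format: "<hex digest>  <path>"
    (the second separator character may be "*" in binary mode).
    """
    mismatches = []
    for line in sha256sum_output.splitlines():
        if not line.strip():
            continue
        digest, _, path = line.partition(" ")
        # Drop the second separator character (" " or "*").
        if path[:1] in (" ", "*"):
            path = path[1:]
        name = posixpath.basename(path)
        # The config file is not named after its hash, so skip it.
        if name == "config":
            continue
        if digest.lower() != name.lower():
            mismatches.append(path)
    return mismatches
```

An empty result means every listed file still matches its name; anything returned is a candidate for a closer look with restic check.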

MorgothSauron · June 20, 2020, 10:24am

Ok, I’ll give it a try.

alexweiss · June 20, 2020, 10:39am

Having told you that, besides local restic check runs (which check repository consistency), it is enough to check that the repository files did not accidentally change, you of course have another option:

check if the process to transfer/store the data guarantees data consistency (should be usually the case)
make sure you read and understand how your storage provider handles your data and what guarantees they give (if you are unsure or not satisfied, you should change your storage provider anyway)
Then it is basically the job of your provider to regularly check your contents for data corruption and to initiate counter-measures
That is, instead of testing all files, testing random samples should give you enough confidence
EDIT:

To summarize:
If you do not trust your storage provider, choose another one. If you do trust your storage provider and want to implement control mechanisms, I would suggest that you:

run restic snapshots regularly and make sure the wanted snapshots are in your repo!
run restic check regularly (will already check many files)
test the sha256 of some random samples in /data/
regularly run restore tests (these test the data in your repository, and also catch a lot of other gotchas)
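
A minimal sketch of the random sample test from the list above, assuming the repository's data directory is reachable as a local path (for example through a mount, which is an assumption and not something discussed here). The helper names `sha256_of` and `check_random_sample` are hypothetical.

```python
import hashlib
import random
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large packs do not fill memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def check_random_sample(data_dir: str, sample_size: int) -> list[Path]:
    """Hash a random sample of files under data_dir.

    Returns the sampled files whose content hash differs from their name.
    """
    files = sorted(p for p in Path(data_dir).rglob("*") if p.is_file())
    sample = random.sample(files, min(sample_size, len(files)))
    return [p for p in sample if sha256_of(p) != p.name]
```

Only the sampled files are read, so the bandwidth cost scales with the sample size rather than with the size of the repository.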

Actually I think testing only random samples could be a feature to integrate into restic check - I’ll open an issue.

alexweiss · June 20, 2020, 5:15pm

I made a PR to add this feature:

cdhowie · June 21, 2020, 7:06pm

What’s the benefit of this over the existing --read-data-subset directive?

MorgothSauron · June 22, 2020, 8:52am

This is why I was asking. I always get some kind of timeout when I back up to S3. I want to switch to a provider that offers a standard access method, like SSH with rsync.net. I’m really happy with rsync.net for my Borg repo. But unlike with restic, I can run borg on the server to check the repository, which speeds up the process.

alexweiss · June 22, 2020, 6:16pm

--read-data-subset will always read the same files for a given value n/m (to be more precise, it selects packs by the first byte of their sha256 hash, which corresponds to the first two characters of the file name).
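
To illustrate why such a selection is deterministic, here is a small sketch. It assumes the bucket is the first byte of the pack ID modulo m, which is an assumption made for illustration; check restic's source for the exact rule. The function `in_subset` is hypothetical.

```python
def in_subset(pack_id_hex: str, n: int, m: int) -> bool:
    """Return True if a pack falls into subset n of m (n counts from 1).

    ASSUMPTION: the bucket is the first byte of the pack ID modulo m.
    The first byte is the first two hex characters of the file name.
    """
    first_byte = int(pack_id_hex[:2], 16)
    return first_byte % m == n - 1
```

Because the result depends only on the file name, the same n/m always picks the same packs, and a pack that never falls into the subsets you run is never read.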

To get a probabilistic answer to “how probable is it that I have a corrupted pack even though I checked n packs”, you should test a random subset of your packs. It then also makes much more sense to define the sample size instead of testing a fixed percentage of all packs.
The statistics are basically the same as predicting election results by polling just 1,000 or so voters.
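
A rough sketch of that calculation, assuming packs are sampled uniformly at random without replacement and that corruption is independent of which packs get picked. The function name `miss_probability` and the numbers in the usage note are illustrative only.

```python
from math import comb


def miss_probability(total_packs: int, corrupted_packs: int, sample_size: int) -> float:
    """Probability that a random sample contains no corrupted pack.

    Hypergeometric case of zero hits: C(N - k, n) / C(N, n), where N is the
    number of packs, k the number of corrupted packs and n the sample size.
    """
    healthy = total_packs - corrupted_packs
    if sample_size > healthy:
        # The sample is larger than the number of healthy packs,
        # so it must include at least one corrupted pack.
        return 0.0
    return comb(healthy, sample_size) / comb(total_packs, sample_size)
```

For example, `miss_probability(100, 1, 10)` is 0.9: checking 10 of 100 packs misses a single corrupted pack 90% of the time. What drives the result for large repositories is mostly the absolute sample size and the corrupted fraction, not the percentage of packs tested, which is the point of the polling comparison.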

Heap1731 · January 17, 2024, 12:29pm

I’m also looking for restic binary at rsync.net.
My use case is to back up from rsync.net to B2. Borg doesn’t support remote repositories directly the way restic does, which makes this non-trivial.

ProactiveServices · January 17, 2024, 1:08pm

If you contact them you’ll get a prompt and considered reply, they’re that sort of outfit. They do offer a restic pricing tier which gives you access to the restic binary on their servers.

Heap1731 · January 17, 2024, 1:42pm

I am a paying customer actually, but I don’t see the restic binary on their server. Perhaps if I ask them they will install it for me.
I’ll send them an email and see how they respond

ProactiveServices · January 17, 2024, 2:44pm

I’d be interested to hear the reply.

Heap1731 · January 20, 2024, 12:12am

Their reply:

Sorry, we cannot install individual binaries for customers.

stevesbrain · February 2, 2024, 5:34am

You won’t get the restic pricing, but if you sign up for a ZFS account, you get given your own VM. You could then install restic in that.