We are considering Restic as a Debian Linux file server backup solution.
The server hosts the users’ home directory and a few TB of raw AVCHD footage. Both require external encrypted backup, ideally on Amazon S3, Amazon Glacier, Google Cloud, OVH Cloud, or any other reliable and cost-effective data storage provider.
The users’ home directory contains the usual file types (office documents, git repositories, source code files, photos, videos). Those files are updated regularly, and we need to keep a snapshot history.
The footage directory tree is different: it is mostly a write-once, read-many directory. Barring exceptional cases, footage stored there is never updated and rarely deleted. In fact, because of possible bit rot on the local disk, we most likely want the snapshots to keep only each file’s initial content.
My question is simple: would Restic be a suitable solution given our use case?
In my opinion: absolutely. restic doesn’t care how often you change your files, or whether you change them at all. But when you do change only a small part of a file, deduplication kicks in and saves you a lot of traffic and backup storage space.
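To illustrate the deduplication point, here is a minimal sketch in Python. It is not restic’s implementation: restic splits files with content-defined chunking, whereas this toy uses fixed-size chunks, and `CHUNK_SIZE`, `store_chunks`, and the in-memory `store` dict are names made up for the demo. It only shows why a small in-place change adds very little new data to a store keyed by chunk hash.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose; hypothetical value chosen for the demo


def store_chunks(data: bytes, store: dict) -> int:
    """Add the chunks of `data` to `store`; return how many were new."""
    new_chunks = 0
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:
            # Only chunks we have never seen before cost storage and traffic.
            store[digest] = chunk
            new_chunks += 1
    return new_chunks


store = {}
original = b"AAAABBBBCCCCDDDD"
modified = b"AAAABBBBCxCCDDDD"  # one byte changed inside the third chunk

first_run = store_chunks(original, store)
second_run = store_chunks(modified, store)
print("new chunks on first backup:", first_run)
print("new chunks on second backup:", second_run)
```

The second run stores a single new chunk, because the other three hash to values that are already in the store. Fixed-size chunks would handle an inserted byte badly, since everything after it shifts; as far as I understand, that is the reason content-defined chunking is used in practice.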
Regarding bit rot: because restic uses checksums, you might consider monitoring detected file changes in your footage directory. If you didn’t change a file but it got included in a new snapshot, you basically know that your local storage has bit rot. Search this forum to find quite a few cases where restic “detected” that on users’ hardware.
Hi Nico. Thanks for your reply, it is quite encouraging.
Regarding file checksums and metadata (size, mtime, etc.): are they cached locally, or does this verification imply (potentially costly) traffic from the remote storage service?
They are stored locally. That is what makes restic very fast under normal circumstances. I back up to network storage via rest-server, and backing up my local machine usually takes only a couple of seconds.
The claim that bit rot shows up as a file getting included in a new snapshot is actually not true. First, “got included in a new snapshot” is not a very precise formulation, as all present files get included (i.e. referenced) in a newly created snapshot. I assume you mean “got reported as changed”, right?
However, file change detection is done by using the metadata information (mtime, size, ...) of a file. When you have bit rot, the file contents change without a change in the metadata. In this case, restic backup will report the file as unchanged, and the new snapshot will still reference the version without bit rot.
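A small Python sketch may make this concrete. It is only an illustration of a metadata-based check, not restic’s code: the helper names (`metadata_of`, `content_hash`, `simulate_bit_rot`) and the file name `clip.mts` are hypothetical, and the check here compares only size and mtime. The sketch flips one bit in a file and restores the timestamps, which is roughly what silent corruption looks like from the file system’s point of view.

```python
import hashlib
import os
import tempfile


def metadata_of(path):
    """Return the (size, mtime) pair that a metadata-based check compares."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def content_hash(path):
    """Return the SHA-256 of the file's full contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def simulate_bit_rot(path):
    """Flip one bit in place, then put the old timestamps back.

    Real bit rot never touches the timestamps; writing through the OS does,
    so the sketch has to restore them to imitate it.
    """
    st = os.stat(path)
    with open(path, "r+b") as f:
        first = f.read(1)
        f.seek(0)
        f.write(bytes([first[0] ^ 0x01]))
    os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns))


with tempfile.TemporaryDirectory() as tmp:
    clip = os.path.join(tmp, "clip.mts")
    with open(clip, "wb") as f:
        f.write(b"raw AVCHD footage stand-in")

    meta_before, hash_before = metadata_of(clip), content_hash(clip)
    simulate_bit_rot(clip)
    meta_after, hash_after = metadata_of(clip), content_hash(clip)

    metadata_unchanged = meta_before == meta_after
    content_unchanged = hash_before == hash_after
    print("metadata unchanged:", metadata_unchanged)
    print("content unchanged:", content_unchanged)
```

The metadata comparison sees no change, so a backup that relies on it would skip the file, while the full content hash does differ. Catching this kind of damage on the source disk therefore requires reading the files in full, which a normal backup run avoids.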
(BTW: There are stories about users finding bit rot on the hardware they use to store the repository. There, restic is able to detect it, because it reads the content as well as the additionally stored checksums and can perform the needed checks.)
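For the repository side, the relevant command is `restic check` with `--read-data` (or `--read-data-subset` to verify only part of the data per run). Below is a hedged Python sketch of a wrapper one might call from a scheduled job. The function names and the repository path `/srv/restic-repo` are assumptions for illustration, and the sketch expects the repository password to come from the environment (for example `RESTIC_PASSWORD_FILE`).

```python
import shutil
import subprocess


def build_check_command(repo, subset=None):
    """Build a `restic check` command line that also reads the pack data."""
    cmd = ["restic", "-r", repo, "check"]
    if subset is None:
        cmd.append("--read-data")  # read and verify all pack files
    else:
        cmd.append(f"--read-data-subset={subset}")  # e.g. "1/10"
    return cmd


def run_check(repo, subset=None):
    """Run the check and return True on exit code 0. Needs restic on PATH."""
    if shutil.which("restic") is None:
        raise RuntimeError("restic executable not found on PATH")
    return subprocess.run(build_check_command(repo, subset)).returncode == 0


# Hypothetical repository location, printed rather than executed here.
print(build_check_command("/srv/restic-repo"))
print(build_check_command("/srv/restic-repo", subset="1/10"))
```

Keep in mind that reading the data means fetching it from wherever the repository lives, so with a remote storage provider a full `--read-data` run can generate significant traffic. Checking a subset per run is one way to spread that cost.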
Thanks, @alexweiss, for the clarification. I admit I wasn’t aware that it was this way around. But now that you mention it, it really only makes sense this way. Clearly, restic doesn’t read all files in their entirety on every backup run.