I’m in search of the elusive perfect backup solution. Did I find it? Can you all tell me if this is possible?
- Can restic do file level dedup or is it block level only?
- Does restic store the file/block hash locally?
- If the hash is local, how do backups from other machines benefit by deduplication? Does it have to read every file/block in the backup to determine hashes and then dedup?
- If the dedup is file level, can I go to the backup destination and see the actual files and restore a file without using the restic client?
For me, a “perfect backup solution” would be cross-platform (Win/Linux), with file-level deduplication and the hashes stored on the destination (or maybe in a Google/Amazon database store of some sort). The client would support storing backups in different sub-folders and support date/time-based differentials (such that yesterday’s backup would be a sub-folder named with its datetime and today’s would have today’s datetime). The destination could be Google Storage Spaces or Amazon S3 or Glacier. And you could do a restore by going to the backup and grabbing the files you need. What this also implies is that there are pointers for deduplicated files, so that today’s backup folder would appear to hold the full backup but, from a storage perspective, would only take up space for new, non-duplicated files and hold pointers for the rest. And to make it even MORE challenging, I’d like a process that could run every month or so that would prune old daily backups while leaving monthly backups intact. And I’d like a way to move old monthly backups (> 1 year) to a cheaper storage tier (like Google Coldline). The latter would actually be possible via script so long as all the other requirements are met. Pruning would need to be in the app.
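To make the "pointers" idea concrete, here is a rough sketch of what I have in mind. It is not how restic works, just an illustration. It assumes a local destination filesystem that supports hard links (object stores like S3 do not have hard links, so a real tool would need some other pointer mechanism there), and the `objects` folder name and the `backup` function are made up for this example.

```python
import hashlib
import os
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def backup(source: Path, dest: Path, stamp: str) -> Path:
    """Back up `source` into dest/<stamp>/, storing each unique file once."""
    store = dest / "objects"   # hypothetical hash-named store on the destination
    snapshot = dest / stamp    # dated sub-folder, e.g. "2024-01-02"
    store.mkdir(parents=True, exist_ok=True)
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        blob = store / file_digest(src)
        if not blob.exists():  # new content: copy it into the store once
            shutil.copy2(src, blob)
        target = snapshot / src.relative_to(source)
        target.parent.mkdir(parents=True, exist_ok=True)
        os.link(blob, target)  # the hard link plays the role of the pointer
    return snapshot
```

Calling `backup(Path("/var/www"), Path("/mnt/backups"), "2024-01-02")` (example paths only) would produce a dated folder that looks like a full copy and can be browsed or copied from directly, while identical files across sites and across days share a single stored object.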
What scenario would drive me to want these features? I host websites for the clients I do dev work for. I have multiple web and database servers, and I know there is a ton of duplication (especially with all the WordPress sites). I want to be able to maintain point-in-time backups using the least amount of space (without a huge overhead) and be able to easily restore or copy a file or directory without having to go through a client. And due to cost and reliability, I want to use Google/AWS file services. I know, this is probably a pipe dream.
So if restic can’t do this, does anyone know of a solution that might? Or is there a way to use something else alongside restic to do what I’m looking for?