Parent snapshot usage questions

AimoE · October 20, 2021, 6:45am

I am considering using restic and have questions about the concept of “parent snapshots”, which to me seems like a method to identify separate backup sets within a repository.

Scenario 1: same host and username but multiple OS installations

When I have multiple operating systems on the same host and use the same username in each, nearly all of the paths and user-specific files are identical. Is there any way to help restic distinguish between the OS installations when it considers the “parent snapshot”? Having a separate repository for each OS would be a waste of space.

In practice, though, I only run backups from parallel OSes on the same host while I am still using an older version of Ubuntu and preparing to deploy a new Ubuntu version that I have installed on the same disk. In this scenario, the period during which I need backups of both is fairly short, so in principle I can refrain from making any backups in the new OS until the final switch. Still, I would like to know if there is a practical way to set up backups in the new OS before making the switch.

Scenario 2: how to organize initial backups

Let’s say I need to regularly back up locations such as /home/$USER/ and /data/$USER/, but for the initial deployment of restic I don’t want to back up all data at once; I want to split it up into smaller chunks. Based on what I have learned from the documentation and forum discussions, it seems I should not begin by making initial backups of lots of sub-directories. Instead, I should make initial backups of the root directories and reduce the amount of data in the initial backup with lots of exclusions, then run subsequent backups with fewer and fewer exclusions until I have trimmed them down to the final set I need permanently. Is this what you would recommend?

alexweiss · October 20, 2021, 9:12am

Parent snapshots are only used to speed up backups by reducing the number of files that need to be read, chunked and hashed.

Generally, in order to identify separate backup sets, you should use tags.

Your scenario 1: Either use separate host names for your separate OS instances running on the same PC (--host) or use suitable tags (--tag).
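As a rough illustration of the --host variant, here is a minimal sketch. The repository path, the host labels and the backup path are hypothetical placeholders, and the `run` helper is only there so the commands are printed rather than executed unless RUN=1 is set.

```shell
# Print a command instead of executing it unless RUN=1 is set,
# so the sketch can be inspected before touching a real repository.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# backup_os REPO HOST_LABEL PATH...
# Back up PATH(s) under an explicit host label instead of the real hostname.
backup_os() {
    repo="$1"; host_label="$2"; shift 2
    run restic -r "$repo" backup --host "$host_label" "$@"
}

# list_os REPO HOST_LABEL
# Show only the snapshots recorded under that host label.
list_os() {
    run restic -r "$1" snapshots --host "$2"
}

# Hypothetical usage: run the first line in the old installation,
# the second line in the new one (all names are placeholders).
backup_os /srv/restic-repo mypc-ubuntu-old /home/alice
backup_os /srv/restic-repo mypc-ubuntu-new /home/alice
```

The same pattern should work with --tag in place of --host if you prefer to keep the real hostname and label the OS installation with a tag.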

Your scenario 2: There are several possibilities for running a large initial backup in multiple smaller steps:

- Just run the initial backup and cancel/restart it when appropriate (restic will re-use most already-saved blobs).
- Run backups on subpaths and finally a backup on all paths, then remove the first snapshots.
- Use a large exclude list and shorten it step by step.

All of these work and can be recommended, IMO. Because of the way parent snapshots work, the latter choices can be faster.
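The exclude-list option could look roughly like the sketch below. The repository path, target directory and excluded sub-directories are hypothetical placeholders, and the `run` helper only prints the commands unless RUN=1 is set. Since the target path stays the same between runs, each run should be able to use the previous snapshot as its parent.

```shell
# Print a command instead of executing it unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# staged_backup REPO TARGET [EXCLUDE_PATTERN...]
# Back up TARGET while skipping every listed pattern.
staged_backup() {
    repo="$1"; target="$2"; shift 2
    # Rewrite each remaining argument PATTERN into --exclude=PATTERN,
    # keeping the original order.
    n=$#
    while [ "$n" -gt 0 ]; do
        set -- "$@" "--exclude=$1"
        shift
        n=$((n - 1))
    done
    run restic -r "$repo" backup "$@" "$target"
}

# Hypothetical schedule: drop one exclusion per run until none
# (or only the permanent ones) remain.
staged_backup /srv/restic-repo /home/alice /home/alice/Videos /home/alice/Music
staged_backup /srv/restic-repo /home/alice /home/alice/Videos
staged_backup /srv/restic-repo /home/alice
```

Any exclusions you want to keep permanently would simply stay in the list on every run.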