Backing up (macOS) Photos Library

Hello.

I’ve started backing up my Photos.app library with restic to Backblaze. I have a separate restic setup that backs up my computer. However, my Photos.app library is stored separately on an external disk (formatted as macOS Extended, journaled) and has its own restic setup with, again, its own Backblaze repository.

The Photos.app library is an iCloud library, and in its preferences I enabled downloading the originals, so the entire library is on disk and takes up roughly 150 GB.

I successfully backed it up with the command restic backup ../Photos\ Library.photoslibrary/ --exclude-caches. However, when I tried to back up again afterwards to capture the changes made in the meantime, it started backing up from the beginning of the roughly 150 GB instead of accounting only for the changes in between.

Why is that so? After all, the Photos.app library is just a file/folder structure. Does restic fall short in dealing with it, or am I missing something here?

Thanks.

What does “backing up from the beginning” mean precisely? Restic has to list all directories to find changed files. If the metadata of a file (like modification time, inode, file size, …) changes, then restic will read the whole file again but still only upload the changed parts.
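As a rough illustration of the metadata check described above, the following minimal sketch compares a previously recorded size, modification time and inode with the current values. The names `FileMeta`, `read_meta` and `needs_reread` are hypothetical, and the choice of fields is an assumption based on the ones named in this post; it is not restic's actual implementation.

```python
import os
import tempfile
from typing import NamedTuple, Optional


class FileMeta(NamedTuple):
    """Metadata fields compared by this sketch (an assumed selection)."""
    size: int
    mtime_ns: int
    inode: int


def read_meta(path: str) -> FileMeta:
    """Record the current size, modification time and inode of a file."""
    st = os.stat(path)
    return FileMeta(st.st_size, st.st_mtime_ns, st.st_ino)


def needs_reread(previous: Optional[FileMeta], current: FileMeta) -> bool:
    """True if the file is new or any recorded field differs."""
    return previous is None or previous != current


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"raw image bytes")
        path = f.name
    before = read_meta(path)
    print(needs_reread(before, read_meta(path)))  # file left untouched
    # Bump only the timestamps; the content stays identical.
    later = before.mtime_ns + 10**9
    os.utime(path, ns=(later, later))
    print(needs_reread(before, read_meta(path)))  # same content, new mtime
    os.remove(path)
```

The second check flags the file even though its bytes are unchanged, which is the situation where a whole file gets read again without anything new being uploaded.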

Does backup -vv ... report the photos as changed? With the verbose flags restic should also print a line like using parent snapshot c9057ddb shortly after starting the backup.

My apologies. I had forgotten about this issue with all the other stuff on my plate, as this backup isn’t something I have to do every other day.

Anyways, first off, what does backing up from the beginning mean?

As I stated in the original question, my photo library sums up to roughly 150 GB. With every backup, the progress indicator in restic always counted the amount to be backed up as that roughly 150 GB.

I was baffled by this because, with my other restic setups, restic went through everything in the local file system (as mentioned by @MichaelEischer) but showed only the changed amount as the amount to be backed up.

So, for instance, my MBP (which has a separate restic setup) has about 30 GB of data on it, and every time I back it up, only the changed amount is reflected in restic’s progress bar, e.g. if only 4 GB changed since the last backup, I see only those 4 GB in restic’s progress indicator.

After performing the backup on my photos library with the -vv option, I got the following statement.

I guess, just as @MichaelEischer stated, restic is still backing up only the changed portion. I should’ve realised it without the -vv option, since the backups after the initial one finished much faster.

However, as a technical question, even if restic only backs up the changed portion, why do I still see the total amount of my library in the progress bar? I mean, is it just a metadata change or something else? Just curious.

Thanks.

The current progress and the totals in the progress bar count all files, no matter whether they have changed or not. That is, the expected outcome is that the progress bar shows roughly the same values for the current progress (left part) and the totals, so for your photos backup it should show 148GB twice. Unmodified files are just added to the backup progress without any further processing. Files that are regarded as modified but whose content is not actually modified are just reread by restic without uploading any new data to the backup repository. The file names of “modified” files are shown below the status bar, but that should be the only difference in the output for modified vs. unmodified files (the backup summary is also different).
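To make "reread without uploading any new data" concrete, here is a toy content-addressed store. It is only a sketch: it splits data into fixed-size chunks for simplicity, whereas restic uses content-defined chunking, and the `ChunkStore` class and its `backup` method are hypothetical names.

```python
import hashlib
from typing import Dict


class ChunkStore:
    """Toy content-addressed store: each distinct chunk is kept once."""

    def __init__(self, chunk_size: int = 4) -> None:
        self.chunk_size = chunk_size
        self.chunks: Dict[str, bytes] = {}

    def backup(self, data: bytes) -> int:
        """Read all of `data`; return how many bytes were newly stored."""
        added = 0
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            key = hashlib.sha256(chunk).hexdigest()
            if key not in self.chunks:  # only unseen chunks are stored
                self.chunks[key] = chunk
                added += len(chunk)
        return added


if __name__ == "__main__":
    store = ChunkStore()
    photo = b"AAAABBBBCCCC"
    print(store.backup(photo))            # 12: first run stores every chunk
    print(store.backup(photo))            # 0: reread, nothing new to store
    print(store.backup(b"AAAABBBBDDDD"))  # 4: only the changed chunk is added
```

Every call reads the full input, so a progress counter based on bytes read would show the full size each time, while the amount actually stored can be zero.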

Could you run stat <path to one of the jpeg reported as modified> twice for the same file? It looks like the iCloud library somehow lets restic think that the picture files were modified. With the right flag for the backup command restic should be able to skip reading unmodified photos.
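The "run stat twice" comparison can also be scripted. The sketch below records a few stat fields and reports which ones differ between two snapshots, treating a file that can no longer be found as its own case. `snapshot`, `changed_fields` and the `FIELDS` selection are hypothetical, chosen for illustration only.

```python
import os
import tempfile
from typing import Dict, List, Optional

# Assumed selection of fields worth comparing between two runs.
FIELDS = ("st_ino", "st_size", "st_mtime_ns", "st_ctime_ns", "st_mode")


def snapshot(path: str) -> Optional[Dict[str, int]]:
    """Return selected stat fields, or None if the path cannot be found."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return {name: getattr(st, name) for name in FIELDS}


def changed_fields(before: Optional[Dict[str, int]],
                   after: Optional[Dict[str, int]]) -> List[str]:
    """List the fields that differ; '<missing>' if the file (dis)appeared."""
    if before is None or after is None:
        return ["<missing>"] if before != after else []
    return [name for name in FIELDS if before[name] != after[name]]


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as f:
        path = f.name
    first = snapshot(path)
    os.remove(path)  # stands in for a file vanishing between the two runs
    print(changed_fields(first, snapshot(path)))
```

In practice one would call `snapshot` on a photo before a backup run and again afterwards, then pass both results to `changed_fields`.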

Upon checking again my backup mechanism for my machine itself, yeah, that is the expected behaviour. I must have missed it.

Based on what you said, I ran the stat command on one of the image files from the screenshot I shared before, which gave the following output:

16777231 1971776 -rw------- 1 can staff 0 26369019 "May  2 21:59:35 2020" "Apr  1 13:25:08 2020" "Apr 29 10:02:22 2020" "Apr  1 13:25:08 2020" 4096 51504 0 Photos Library.photoslibrary/originals/6/65D2225F-73E7-4839-AE2D-756CEC96FD8D.cr2

Then I did yet another backup. Afterwards, I ran stat on the same file again, and this time the output was that the file could not be found.

I guess, like you said, the iCloud library makes changes that make restic think the files are modified even if they’re not. That was my suspicion from the beginning as well. Now it’s clearer.

What matters is not how much restic counts up when scanning files, but the amount of “Added to repo” that it reports in the end, e.g.:

Files:           5 new,     6 changed, 595119 unmodified
Dirs:            0 new,     3 changed,     0 unmodified
Added to the repo: 105.346 KiB

processed 595130 files, 113.866 GiB in 4:25
snapshot 4c785985 saved

In the above example, restic scanned 113 GiB of data (which is the same type of number you see when you look at the progress bar counting up GiBs), but it only actually backed up (due to changes in the files) 105 KiB. That last number is what you should look at.
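For anyone who wants to pull those two numbers out of a saved log, here is a small sketch that parses the summary text quoted above. It assumes the wording and binary units shown in this thread; other restic versions may format these lines differently, and `parse_summary` and `to_bytes` are hypothetical helper names.

```python
import re

UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}

# The summary quoted in the post above, copied verbatim.
SUMMARY = """\
Files:           5 new,     6 changed, 595119 unmodified
Dirs:            0 new,     3 changed,     0 unmodified
Added to the repo: 105.346 KiB

processed 595130 files, 113.866 GiB in 4:25
snapshot 4c785985 saved
"""


def to_bytes(number: str, unit: str) -> float:
    """Convert a value such as ('105.346', 'KiB') to bytes."""
    return float(number) * UNITS[unit]


def parse_summary(text: str):
    """Return (bytes added to the repository, bytes scanned)."""
    added = re.search(r"Added to the repo\w*:\s+([\d.]+) (\w+)", text)
    scanned = re.search(r"processed \d+ files, ([\d.]+) (\w+)", text)
    if added is None or scanned is None:
        raise ValueError("summary lines not found")
    return to_bytes(*added.groups()), to_bytes(*scanned.groups())


if __name__ == "__main__":
    added, scanned = parse_summary(SUMMARY)
    print(f"scanned {scanned:.0f} bytes, added {added:.0f} bytes")
    print(f"uploaded {added / scanned:.6%} of the scanned data")
```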

Hmm that’s strange. I’d expect that the files usually still exist, as otherwise the screenshot wouldn’t have shown most files as modified. Can you try the stat command for a few other files?

Just did on two other files from the screenshot above:

  • ran the stat command on two separate files,
  • it returned output with information about the files,
  • ran restic to back up again,
  • ran the stat command on those two files again,
  • this time, no output; the files can’t be found.

I assume it’s something we can’t manage and, as we’ve discussed, iCloud does some mumbo jumbo behind the scenes.

Maybe it works if you try to access the parent directories step by step? But yes, I agree that this is not the expected behavior for a “filesystem”.
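One way to try the step-by-step access is to check every ancestor of the file in turn, from the root down. This is only a sketch of that idea: `probe_ancestors` is a hypothetical helper, and whether touching the parents makes the Photos library files reappear is untested here.

```python
import os
import sys
from typing import List, Tuple


def probe_ancestors(path: str) -> List[Tuple[str, bool]]:
    """Check each ancestor of `path`, outermost first, then `path` itself."""
    current = os.path.abspath(path)
    parts = []
    while True:
        parts.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return [(p, os.path.exists(p)) for p in reversed(parts)]


if __name__ == "__main__":
    # Pass the path of a photo that stat could not find; defaults to ".".
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for component, exists in probe_ancestors(target):
        print("ok     " if exists else "MISSING", component)
```

The first component reported as MISSING shows where in the directory tree the lookup starts to fail.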