Backup family photos and videos

Hi all,

I’m new to restic.

I would like to use it to back up ~500 GB of family photos and videos on Windows 10.

My photos are on an HDD.

My storage is an SSD on USB.

I ran a test and found that the second backup took nearly as long as building the repo the first time.

As my data grows every year, I would like to speed up the backup if possible. For example, my photos are organized by year, and I know that the year 2000 is stable, unlike the current year. So is there a way to tell restic to focus only on directories known to be unstable?

I used a very simple command:

restic backup --files-from "%BACKUP_LIST%" --repo "%RESTIC_REPOSITORY%"

Hoping restic is the tool I need.

Hi!

Can you show us the complete output from the first and the second backup run?

Unfortunately not, I have no log and did not keep the execution traces.

Okay, not much to look at then :confused: We need more information.

Can you reproduce the problem?

Does the content of %BACKUP_LIST% change between the runs?

Please try setting the --parent option to the latest snapshot’s ID and see if that makes a difference.

Also, please provide the information you were asked for when you wrote your initial post :slight_smile:

Hi @PhaustSceptic, restic can definitely handle your use case and is very fast if used properly. A few things to check:

1. Parent snapshot detection

Restic uses a “parent” snapshot to determine what’s changed. From the restic docs:

By default restic groups snapshots by hostname and backup paths, and then selects the latest snapshot in the group that matches the current backup.

If something is preventing parent detection (different hostname, changed paths), restic rescans everything. Check your snapshots:

restic -r "%RESTIC_REPOSITORY%" snapshots

If you see the previous backup listed, try explicitly setting the parent:

restic -r "%RESTIC_REPOSITORY%" backup --parent SNAPSHOT_ID --files-from "%BACKUP_LIST%"
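
As a rough illustration of the grouping rule quoted above, here is a minimal Python sketch. It is not restic's actual code: the `Snapshot` class, `pick_parent` function and the sample history are hypothetical names made up for this example, and it assumes a parent must match both the hostname and the exact set of backup paths.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Snapshot:
    id: str
    time: str        # ISO 8601 timestamp, so string order equals time order
    hostname: str
    paths: tuple     # backup paths recorded in the snapshot


def pick_parent(snapshots, hostname, paths) -> Optional[Snapshot]:
    """Return the newest snapshot with the same hostname and path set, or None."""
    wanted = tuple(sorted(paths))
    group = [
        s for s in snapshots
        if s.hostname == hostname and tuple(sorted(s.paths)) == wanted
    ]
    return max(group, key=lambda s: s.time, default=None)


# Hypothetical history: one backup of the root folder, one of a subfolder only.
history = [
    Snapshot("a1", "2024-01-01T10:00:00", "pc", ("D:\\Images",)),
    Snapshot("b2", "2024-02-01T10:00:00", "pc", ("D:\\Images\\2024",)),
]

same_paths = pick_parent(history, "pc", ["D:\\Images"])
new_paths = pick_parent(history, "pc", ["D:\\Images\\2023"])
print(same_paths.id if same_paths else None)  # a1
print(new_paths)                              # None
```

Under these assumptions, a path list that differs from every earlier run never finds a parent, which is why a changing %BACKUP_LIST% can make every backup behave like a first backup.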

2. Windows change detection

On Windows, change detection is limited:

On Windows, a file is considered unchanged when its path, size and modification time match.

If your files have changing modification times (some sync tools do this), restic will rescan them even though content is identical.
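
To make this concrete, here is a small Python sketch of a check based on path, size and modification time. It only illustrates the idea and is not how restic is implemented; `file_signature` and `needs_reread` are hypothetical helper names, and the file is a throwaway created for the demo.

```python
import os
import tempfile


def file_signature(path):
    """Return the metadata compared between runs: (size, mtime in ns)."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def needs_reread(path, previous):
    """previous maps path -> signature recorded at the last backup."""
    return previous.get(path) != file_signature(path)


with tempfile.TemporaryDirectory() as tmp:
    photo = os.path.join(tmp, "IMG_0001.jpg")
    with open(photo, "wb") as f:
        f.write(b"same bytes")

    previous = {photo: file_signature(photo)}
    reread_untouched = needs_reread(photo, previous)

    # Simulate a tool that rewrites the timestamp without changing content.
    os.utime(photo, ns=(0, 1_000_000_000))
    reread_after_touch = needs_reread(photo, previous)

print(reread_untouched)    # False: metadata matches, content can be skipped
print(reread_after_touch)  # True: only mtime changed, file is read again
```

In this sketch the touched file has to be read again even though its bytes are identical, which matches the slowdown described above.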

3. Cache location

Restic uses a local cache to improve performance. On Windows this is typically %LOCALAPPDATA%\restic. If the cache is being cleared between runs, you lose the performance benefit. You can set a persistent location if you like:

set RESTIC_CACHE_DIR=C:\restic-cache
restic -r "%RESTIC_REPOSITORY%" backup --files-from "%BACKUP_LIST%"

4. Run with verbose output

To see where time is being spent:

restic -r "%RESTIC_REPOSITORY%" backup -v --files-from "%BACKUP_LIST%"

Look for the “using parent snapshot” line. If it’s missing, that’s most likely your problem.

5. Excluding stable directories

For stable year folders, you can use --exclude:

restic -r "%RESTIC_REPOSITORY%" backup --exclude "photos\2000" --exclude "photos\2001" --files-from "%BACKUP_LIST%"

That said, if parent detection is working correctly, restic should skip unchanged files quickly based on metadata alone without needing excludes.

Hope this helps.

Actually, one common mistake is to use --files-from to supply restic with a list of possibly changed files (determined by some external tool, scripting, etc.) instead of just giving a path (and maybe excludes) and letting restic do the change detection. This has basically two consequences or misunderstandings:

  • In this case restic gets a different path list for each backup, which prevents it from using a parent snapshot. If the list really contains only changed files, having no parent is OK, but if it does contain unchanged files, this is typically even slower than just using a single path and letting restic do the change detection.
  • Some people think that providing such a list is necessary to do an incremental backup and thus saves space in the repository. This is not true; check out how deduplication works in restic. Moreover, providing such a list in fact removes information and makes restores much more difficult: you can no longer distinguish between files dropped from the list because they have not changed and files dropped from the list because they were removed from the filesystem. That is, you no longer have a snapshot of the disk’s state and therefore have trouble restoring such a state, and this is exactly what a backup tool should be for…

TL;DR: Using --files-from is in most cases not necessary, but can be a source of mistakes or misunderstandings!
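
To illustrate the deduplication point, here is a toy Python sketch of a content-addressed store. It is a simplification under stated assumptions: restic uses content-defined chunking, while this demo uses a tiny fixed chunk size, and `backup`, `restore` and `CHUNK` are hypothetical names chosen for the example.

```python
import hashlib

CHUNK = 4  # tiny fixed chunk size for the demo; an assumption, not restic's chunker


def backup(data: bytes, store: dict):
    """Store unseen chunks under their SHA-256; return (chunk id list, new chunk count)."""
    ids, added = [], 0
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        cid = hashlib.sha256(chunk).hexdigest()
        if cid not in store:
            store[cid] = chunk
            added += 1
        ids.append(cid)
    return ids, added


def restore(ids, store: dict) -> bytes:
    """Rebuild the full data of one snapshot from its chunk id list."""
    return b"".join(store[cid] for cid in ids)


store = {}
snap1, new1 = backup(b"AAAABBBBCCCC", store)      # first full backup
snap2, new2 = backup(b"AAAABBBBCCCCDDDD", store)  # full backup again, one chunk added

print(new1, new2)             # 3 1
print(restore(snap2, store))  # b'AAAABBBBCCCCDDDD'
```

Both runs hand over the complete data set, yet the second one stores only the new chunk, and each snapshot can still be restored in full. That is the sense in which a full path backup is already incremental in terms of repository space.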

Thanks!

I will test that!

But are there any best practices?

For example, is it more efficient to back up one huge folder into a single repository, or to split folders across several repositories, given that changes affect only some folders (adding new pictures or retouching some not-so-old pictures)?

So would my backup strategy consist of targeting only the folders/repositories where I know there are changes?

/2000-2009 → one restic repository (stable, one snapshot)
/2010-2019 → another restic repository (stable, one snapshot)
/2020-2029 → yet another restic repository (unstable, one snapshot per media import or photo edit)

I was using --files-from for modularity. And yes, with the idea of changing its content if listing only root folders with known changes could improve efficiency.

If you have huge data sets you might want to split, but otherwise no. If you abandon the (IMO) bad habit of using --files-from and instead use a consistent root path, restic has very little work to do to determine whether a file needs to be backed up: it just compares the file’s metadata with the metadata in the cache or repo, which is a very cheap operation.

To sum it up: Don’t overthink and don’t optimise prematurely.

This forces you to determine, manually or in some other way, how to treat each subdirectory. I think that is a bad approach because it introduces risks. When it comes to backups, adding risks is a no-no. :slight_smile: Further, as I wrote above, restic can very quickly determine if stuff has changed or not. There is no need to duplicate the effort of determining that pre-restic.

I agree. Just tell restic to back up the folder under which you keep all your photos, and let restic do what it’s designed to do (which includes only backing up files that have changed).

Thanks @all!

So, I’ve tested and now it’s very fast, even on the whole 500 GB. It’s very impressive. I think that my first attempt was without --parent latest.

  • First backup done in 6:12:15 (about 6 hours)
  • Backup with no change done in 0:23 (23 s)
  • Backup with one new file done in 0:22 (22 s)

First backup command:

restic backup -r e:\restic-photos --compression off --verbose D:\Images

Next backups command:

restic backup -r e:\restic-photos --parent latest --compression off --verbose D:\Images

I noticed the “empty” second backup created a snapshot. Could you explain why?

Search the docs and read about the --skip-if-unchanged flag.

Because you asked restic to create a restorable snapshot of the backup set at that point in time, so that’s what it did. The design is centered around creating snapshots of your data along the timeline, regardless of what changed.

If you ask it to create such a snapshot, it does so, in the most effective way possible. That snapshot costs a few kilobytes on disk in the repository, so it’s not a problem.

You can use what @kapitainsky wrote to skip the creation of snapshots when nothing changed in your backup set.