Checkpoints for large snapshots

covici · July 13, 2019, 3:18am

Hi. I want to back up over a terabyte, but the manual says that the snapshot is not finalized until the end. What happens if the backup fails in the middle due to loss of internet, a power hit, etc.? Does it have to start all over again, or are there checkpoints of some sort so it can resume?

Thanks in advance for any suggestions.

cdhowie · July 13, 2019, 8:23am

Packs are typically in the 4MB-8MB range and are uploaded as the backup progresses. Periodically, and if restic is gracefully terminated (SIGINT, SIGTERM), it will write out a new index file for all of the content added so far.

Initiating a new backup after one has been interrupted should therefore deduplicate against the already-uploaded data. However, note that if this is the first snapshot, the backup client must re-hash all files locally. It will then skip uploading blobs that were previously uploaded.

Additionally, note that pruning the repository will discard all data that isn’t referenced by a snapshot, so pruning will effectively destroy the “resume data.”

(If your network connection is interrupted, restic will retry any failed operations for quite a while. If this is your only concern, you should be fine – restic is quite stubborn about trying to recover from failures.)
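
To illustrate the resume behaviour described above, here is a minimal sketch of content-addressed deduplication. It is not restic's code: the fixed list of chunks, the in-memory `index` set, and the `backup` function are all hypothetical stand-ins. The only detail taken from restic itself is that a blob is identified by the SHA-256 hash of its content.

```python
import hashlib

def blob_id(data: bytes) -> str:
    # A blob is identified by the SHA-256 hash of its content.
    return hashlib.sha256(data).hexdigest()

def backup(chunks, index):
    """Hash every chunk, upload only those missing from the index.

    `index` is a plain set standing in for the repository's index files
    (an assumption made for illustration). Returns the number uploaded.
    """
    uploaded = 0
    for chunk in chunks:
        cid = blob_id(chunk)      # every chunk is re-hashed locally
        if cid in index:
            continue              # already in the repository: skip upload
        index.add(cid)            # stands in for "upload and record in index"
        uploaded += 1
    return uploaded

chunks = [b"alpha", b"beta", b"gamma", b"delta"]
index = set()
first = backup(chunks[:2], index)   # interrupted run: only 2 chunks processed
second = backup(chunks, index)      # re-run: hashes all 4, uploads the missing 2
print(first, second)                # prints: 2 2
```

The second run still reads and hashes everything, which is the local re-hashing cost mentioned above, but only the missing chunks are transferred.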

covici · July 13, 2019, 2:05pm

Thanks for your quick response.

What happens if the local system crashes, so that restic is not terminated normally? Will it skip the already uploaded blobs or what will happen?

cdhowie · July 13, 2019, 4:36pm

I believe that restic writes out indexes for newly-uploaded packs every 5 or 15 minutes (I can’t remember which). So, any blobs that were uploaded before the latest index upload will not be re-uploaded. If restic is terminated abnormally (SIGKILL, power cut, crash, etc.) then data that was uploaded after the last index upload will likely be re-uploaded.

A future prune operation will remove the duplicate blobs from the repository.

If you know that a crash happened you can run restic rebuild-index first to forcibly update the indexes. Note that this could very well take longer than re-uploading a little bit of data!
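
The effect of an abnormal termination can be modelled roughly as follows. This is a simplified sketch, not a description of restic internals: it assumes an index is written at exact multiples of a fixed interval (15 minutes here, which is a guess at this point in the thread), and the function name and the pack upload times are made up for the example.

```python
INDEX_INTERVAL_MIN = 15  # assumed interval between intermediate index writes

def packs_reuploaded(pack_upload_minutes, crash_minute, interval=INDEX_INTERVAL_MIN):
    """Return upload times of packs that no index covers at crash time.

    A pack uploaded at minute t is covered if an index was written at or
    after t and before the crash. Uncovered packs are unknown to the next
    backup run, so their blobs would be uploaded again.
    """
    last_index_minute = (crash_minute // interval) * interval
    return [t for t in pack_upload_minutes
            if last_index_minute < t <= crash_minute]

# Packs uploaded at these minutes; the machine loses power at minute 28.
print(packs_reuploaded([1, 8, 14, 16, 22, 29], crash_minute=28))  # prints: [16, 22]
```

Under this model the loss is bounded by one interval's worth of uploads, however large the backup is.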

covici · July 14, 2019, 5:41am

OK, thanks very much. Sounds like I should take this for a spin.

764287 · July 14, 2019, 2:18pm

Questions like the above seem to come up quite often. Wouldn’t it make sense to add a section with your answer to the documentation?

Ataraxy · August 5, 2019, 10:22am

I’m not seeing this behaviour in the output:

repository 86b7fabe opened successfully, password is correct
  signal interrupt received, cleaning up

I waited until about 10 files had been uploaded before pressing ^C.

cdhowie · August 5, 2019, 4:35pm

restic backup does not output anything when it writes an index file. Check if one was created in the repository storage.
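
For a repository on a local disk, one way to check is to look at the `index` directory inside the repository and see whether a file appeared around the time of the interruption. The helper below is a small sketch under that assumption; the function name is hypothetical and it only works for the local backend. For other backends, `restic list index` prints the index IDs instead.

```python
from datetime import datetime
from pathlib import Path

def list_index_files(repo_path):
    """Return (name, modification time) for each file in <repo>/index, oldest first."""
    index_dir = Path(repo_path) / "index"
    files = [p for p in index_dir.iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime)
    return [(p.name, datetime.fromtimestamp(p.stat().st_mtime)) for p in files]

# Usage (the path is a placeholder):
#   for name, mtime in list_index_files("/path/to/repo"):
#       print(mtime, name)
```

Running it before and after pressing ^C shows whether a new index file was written on interrupt.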

Ataraxy · August 8, 2019, 6:08am

Hmm, which version are you running?

In 0.9.5, I see restic write a message when uploading a partial index (every 15 minutes).

However, I don’t see anything about a partial index upon SIGINT.

cdhowie · August 8, 2019, 3:02pm

I am also using this version. I regularly perform very long backups and I’ve never seen any messages about uploading indexes. What flags are you invoking restic with?

doscott · August 8, 2019, 10:26pm

This is an example of what I believe is being referred to. You only see intermediate indexes written when larger amounts of data are being backed up. This example has one, but I have encountered cases where several were written.

open repository
lock repository
load index files
using parent snapshot 457d2c24
start scan on [/etc /home /srv /media/lts /media/music /media/tv /media/movies]
start backup on [/etc /home /srv /media/lts /media/music /media/tv /media/movies]
scan finished in 13.019s: 125049 files, 2.221 TiB
uploaded intermediate index 25cadd90

Files:          85 new,    60 changed, 124904 unmodified
Dirs:            0 new,     1 changed,     0 unmodified
Data Blobs:   2169 new
Tree Blobs:      2 new
Added to the repo: 2.876 GiB

processed 125049 files, 2.221 TiB in 5:18
snapshot 1dcfaa3a saved

cdhowie · August 9, 2019, 12:47am

@doscott I have not seen “uploaded intermediate index” before, nor have I seen “load index files”. Is this by chance running with -v?

doscott · August 9, 2019, 10:33am

Yes, it is running with --verbose.

Ataraxy · August 9, 2019, 3:38pm

Yes, that happens every 15 minutes with -v, but nothing is printed when ^C is pressed.

A work-around I tested is to run restic rebuild-index after the interruption. This makes the blobs from the partial backup visible in the index, so they are not uploaded again.

Of course, restic rebuild-index may itself take more than 15 minutes, in which case it ends up being a waste of time.
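
As a rough way to judge the trade-off, the amount of data at risk is at most what the uplink can carry in one index interval. The sketch below does that arithmetic; the function name and the 10 Mbit/s figure are illustrative assumptions, and it ignores protocol overhead.

```python
def worst_case_reupload_mb(upload_mbit_per_s, interval_min=15):
    """Upper bound, in megabytes, on data uploaded since the last index write."""
    seconds = interval_min * 60
    return upload_mbit_per_s * seconds / 8  # 8 bits per byte

print(worst_case_reupload_mb(10))  # prints: 1125.0
```

On a 10 Mbit/s uplink that is about 1.1 GB and at most 15 minutes of re-uploading, so rebuilding the index only pays off if it finishes faster than that.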

I raised a feature request here.

cdhowie · August 9, 2019, 4:03pm

Thanks for creating the feature request. I was under the impression that this already happened.