What happens if a backup is stopped?

Hello all.

I started using restic just yesterday and I love it; it is exactly the backup software I had been looking for for months.

Here’s my situation: I have slow internet. My upload speed is 10 Mbps (a bit more than 1 MB/s), and I need to back up over 10 TB of data. To help with this, I created an exclude file that excludes most of the folders in the root of my hard drive, so restic only backs up one “big” directory at a time.

Because of my slow internet and the fact that there are constant blackouts where I live, it’s possible that my backup will be interrupted for reasons beyond my control. What happens if a backup gets interrupted, whether because I accidentally press Ctrl+C, a blackout occurs, or I suddenly lose my internet connection? Will the backup resume, or will the current snapshot be corrupted?

To be more specific, I’m backing up to Google Drive, but I will eventually diversify my backups to B2 and Wasabi through rclone.


Hi, welcome to the forum!

No matter how restic is interrupted, it will resume uploading roughly where it stopped. At most, the last 5–10 minutes of data will be re-uploaded, and the next run of restic prune will take care of the duplicated data. It will, however, re-read all data locally to make sure it hasn’t changed in between.
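In practice that means you can simply re-run the exact same backup command after the interruption and prune later. A minimal sketch (the repository URL and path are placeholders for your own setup):

```shell
# Re-run the same backup command after the interruption;
# restic deduplicates and skips data already in the repository.
restic -r rclone:gdrive:backups backup /data/big-directory

# Later, once a snapshot exists, reclaim the few minutes of
# duplicated uploads left behind by the interrupted run.
restic -r rclone:gdrive:backups prune
```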

Restic only writes the so-called “snapshot” at the very end of the backup run. If you need a file that has already been uploaded but is not yet included in a snapshot, the recover command creates a new snapshot from all the files and directories it can find in the repository that are not referenced by any snapshot.
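For example (repository URL is a placeholder):

```shell
# Build a snapshot from data that was uploaded but never
# referenced by a finished snapshot, then list it as usual.
restic -r rclone:gdrive:backups recover
restic -r rclone:gdrive:backups snapshots
```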

I hope this answers your questions!


Is this even the case when restic leaves the repo in a locked state?
Due to a faulty exclude filter I had to Ctrl-C a backup, and the repo stayed locked until I ran restic unlock.

Is this a situation restic can handle without leaving bogus data behind?

(And on a sidenote --dry-run on backup would be really great, +1’ing https://github.com/restic/restic/issues/1542)

Yes, even then.

Hm, I’m not sure I understand the question:

  • restic always worked this way, we haven’t changed that.
  • it’s not really “bogus” data: the first process uploads something and is interrupted, and the second instance isn’t aware of the last few minutes of uploaded data, so it errs on the side of caution and re-uploads it. You’ll end up with a bit of duplicate data in the repo, which isn’t so bad in general.
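On the lock question: a stale lock left behind by an interrupted run can be inspected and then removed by hand (repository URL is a placeholder):

```shell
# List current locks, then remove locks that no longer
# belong to a running restic process.
restic -r rclone:gdrive:backups list locks
restic -r rclone:gdrive:backups unlock
```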

I can assure you that this wasn’t the case for me. I accidentally interrupted the first backup twice: once by a BSOD, and a second time by putting the PC into hibernation. When I used the same command to continue the backup, it started from the beginning. At first I thought it was just reading the already backed-up data, but no, the estimated time was the same as on the first try.

I’m not sure if it’s only my issue, but it simply redid the whole backup. I assume I will free up hundreds of gigs by running prune, lol.

Since there is no existing snapshot in the repository at this point, restic cannot use a parent snapshot’s record of the files’ metadata to determine which files have changed since their contents were last backed up. This means that it has to scan all the files again, which is probably what you are seeing and what makes you think that it “went from the beginning”. However, this does not mean that it uploads all the file data again: restic only rescans the files and uploads the chunks it did not upload during the previous two backup runs you started.

So, it should all be fine. It’s just scanning files, but only uploading what hasn’t already been uploaded.
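If you want to convince yourself of this, you can compare the repository’s raw data size before and after re-running the interrupted backup (repository URL and path are placeholders; the exact output format varies between restic versions):

```shell
# Raw size of data stored in the repository before the re-run …
restic -r rclone:gdrive:backups stats --mode raw-data

# … re-run the backup, then compare: the raw data size grows far
# less than the estimated time suggests, because already-uploaded
# chunks are deduplicated rather than sent again.
restic -r rclone:gdrive:backups backup /data/big-directory
restic -r rclone:gdrive:backups stats --mode raw-data
```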

And yes, a prune would be good to run on the repository once you’ve finished the first backup run and have a snapshot :)

Just want to add that if you have very frequent interruptions and reading the 10 TB takes a long time, you can divide the initial 10 TB into smaller tasks for restic, for example ten jobs of roughly 1 TB each, which helps avoid lengthy rescans.
After the initial backup completes, things will usually run much faster, since not all files change between backups. Then restic should be able to handle the 10 TB in one run (and less data to send means less chance of hitting an interruption).
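One way to sketch that split is a loop over the top-level directories, each becoming its own backup run into the same repository (directory names and repository URL are placeholders; deduplication still works across the separate runs):

```shell
# Back up one top-level directory per run; an interruption then
# only costs a rescan of that directory, not of all 10 TB.
for dir in photos videos documents archives; do
    restic -r rclone:gdrive:backups backup "/data/$dir"
done
```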

From my experience restic is robust when interrupted, which was one of the main reasons I switched from “duplicati” to “restic” some years ago.