Restic backup is slow

Hi,

Can anyone explain what restic backup does? I have a collection of large files (~2GB each) and the first backup completed fine. But when I run the backup again, it takes quite a bit of time to process the files, even though they have not changed at all. There doesn’t appear to be any network traffic, so it’s definitely not resending the files, but it appears to take some time to chunk them. Is this how restic backup is supposed to work? Surely if a file has not changed, restic should just skip it and not do any work at all?

Thanks,

-a

Hi @ala and welcome to the restic community :)

First of all it would be good to know where your repository is located.
From your post it seems to be a repository which is offsite - NAS, S3, B2 or another cloud vendor.

It would also help to see the output of two backup runs made directly after each other.
This would give us plenty of information to see what restic is doing.

The exact restic command you’re using would also help with troubleshooting.

Another idea is to check the forum for other posts about slow backups: https://forum.restic.net/search?q=backup%20slow
There is a ton of information in there that could lead you to a fix for your issue or answers to some of your questions.

Hi,

I looked at some of the other posts, and they mention a few things, but nothing that solved the problem.

My data is on Linux, backing up to a remote Linux server using sftp. I’m using restic 0.9.5 compiled from source: v0.9.5-46-g604b18aa-dirty.

I synchronised the clocks between the machines (some posts mentioned this), but it still appears that restic will scan every file in my repo. I expected restic to be able to check the last modified time of the file, and avoid reading it if nothing has changed.

The problem is that my data set is almost 7TB, so it takes hours to check. It also blows up the iowait and makes the machine unresponsive.

This is the output of a backup run with the following command:

$ restic -v backup -f --one-file-system --exclude-file=/root/excludes.txt /raid/home

open repository
repository 3f87a6f5 opened successfully, password is correct
lock repository
load index files
start scan on [/raid/home]
start backup on [/raid/home]
scan finished in 222.579s: 4367733 files, 6.766 TiB
uploaded intermediate index e3dbec0e
uploaded intermediate index 7c34632f

Files:       4367782 new,     0 changed,     0 unmodified
Dirs:            2 new,     0 changed,     0 unmodified
Data Blobs:    583 new
Tree Blobs:      3 new
Added to the repo: 186.598 MiB

processed 4367782 files, 6.766 TiB in 4:55:40
snapshot da49f157 saved

You can see that only a small number of files changed and only a tiny amount of data was transferred, but restic still had to read every file.

Is there a way to tell restic to compare timestamps before scanning the file?

thanks,

Can you show us the fstab entry for that /raid/home filesystem, and also the output for it from mount?

Sure, it’s a RAID array and the filesystem is ext4:

$ mount | grep raid
/dev/md0p3 on /raid type ext4 (rw,relatime,stripe=384,data=ordered)

$ grep raid /etc/fstab
UUID=… /raid ext4 rw,relatime,stripe=384,data=ordered 0 1

I have no idea why your restic is apparently reading all those files and treating them as new (rather than unmodified) if they haven’t been changed.

I’m thinking you need to isolate things here. A couple of suggestions to try:

  • Can you reproduce it with a small test set of folders and files? That is, create or copy some dummy files to another part of the disk, preferably outside the RAID (e.g. /tmp), and see if you can reproduce the issue there.

  • Try setting the noatime option on the mount of the /raid/home filesystem. I can’t say I’m expecting it to help, but just in case. See if the problem still manifests itself.

Also, silly as it may sound, if you run two backups, one right after the other, does the problem happen in both of them or just the first? How much time passed between the start of the first one and the start of the second?

Apparently, restic is unable to find a previous snapshot (called “parent snapshot”) for /raid/home, otherwise it would have printed something along the lines of:

using parent snapshot 4ad58bd9

That seems to be missing here.

We’ve had this issue several times already:

  • Sometimes it was caused by the path (/raid/home here) not being constant. Is the path exactly the same in between backups?
  • Is the host name always the same?

You can check both by looking at restic snapshots.
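To make the two checks above concrete, here is a minimal Python sketch of how a parent snapshot could be chosen: the newest snapshot whose host name and path set match the current run. This is a simplified model, not restic's actual code. The `Snapshot` class and `find_parent` function are hypothetical names, and the timestamp on the da49f157 entry is made up for illustration (only the 6912da37 row mirrors a `restic snapshots` line quoted later in this thread).

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical, simplified view of one row of `restic snapshots`."""
    id: str
    time: datetime
    hostname: str
    paths: tuple


def find_parent(snapshots, hostname, paths) -> Optional[Snapshot]:
    """Return the newest snapshot with the same host name and path set, or None."""
    wanted = tuple(sorted(paths))
    candidates = [
        s for s in snapshots
        if s.hostname == hostname and tuple(sorted(s.paths)) == wanted
    ]
    return max(candidates, key=lambda s: s.time, default=None)


snapshots = [
    # The timestamp of this first entry is invented for the example.
    Snapshot("da49f157", datetime(2019, 11, 18, 9, 0, 0), "factorise", ("/raid/home",)),
    Snapshot("6912da37", datetime(2019, 11, 19, 15, 52, 45), "factorise", ("/raid/home",)),
]

parent = find_parent(snapshots, "factorise", ["/raid/home"])
print(parent.id if parent else "no parent snapshot")  # 6912da37

# A different host name means no snapshot matches, so there is no parent.
print(find_parent(snapshots, "other-host", ["/raid/home"]))  # None
```

If either the host name or the path differs between runs, a lookup like this comes back empty and every file has to be read again.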

The path is always exactly the same and the hostname never changes either. I ran it again yesterday and it took more than 5 hours. It finished with:

processed 4371290 files, 6.766 TiB in 5:16:14
snapshot 6912da37 saved

When I run “restic snapshots”, the last entry is this:

6912da37 2019-11-19 15:52:45 factorise /raid/home

So it seems to have created the snapshot and it is aware of it.

Is there something else I can do to track this down?

thanks,

You can try forcing restic to use a specific parent snapshot with the flag --parent (and provide the id of the latest snapshot) to see if something changes.

As suggested by @rawtaz, I would try to reproduce it on a smaller scale so that you can debug it quickly.

By using the -f flag you are forcing restic to ignore any parent snapshots and re-read everything.
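As a rough illustration of why this flag is so expensive, here is a minimal Python sketch of a metadata-based skip decision. It assumes the comparison uses size, modification time and inode number, which is a simplification, and `FileMeta` and `needs_reread` are hypothetical names rather than anything from restic itself.

```python
from typing import NamedTuple, Optional


class FileMeta(NamedTuple):
    """Metadata assumed to be recorded for a file in the parent snapshot."""
    size: int
    mtime_ns: int
    inode: int


def needs_reread(current: FileMeta, previous: Optional[FileMeta], force: bool = False) -> bool:
    """Decide whether the file content must be read and chunked again."""
    if force or previous is None:
        # Forced run, or no entry in the parent snapshot: always read the file.
        return True
    # Otherwise read it only if the metadata differs.
    return current != previous


old = FileMeta(size=2_000_000_000, mtime_ns=1_574_000_000_000_000_000, inode=42)
same = old
touched = old._replace(mtime_ns=old.mtime_ns + 1)

print(needs_reread(same, old))              # False
print(needs_reread(touched, old))           # True
print(needs_reread(same, old, force=True))  # True
print(needs_reread(same, None))             # True
```

With a force flag set, the cheap metadata comparison is never reached, so every one of the ~4.3 million files is read from disk even though almost nothing is uploaded.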

The “-f” flag forces a complete rescan, which is what restic was doing.

I had copied the command from somewhere without checking all the options.

repository 3f87a6f5 opened successfully, password is correct
lock repository
load index files
using parent snapshot 6912da37
start scan on [/raid/home]
start backup on [/raid/home]
scan finished in 294.869s: 4371314 files, 6.766 TiB

Files:          25 new,    95 changed, 4371194 unmodified
Dirs:            0 new,     2 changed,     0 unmodified
Data Blobs:    126 new
Tree Blobs:      3 new
Added to the repo: 37.993 MiB

processed 4371314 files, 6.766 TiB in 12:23
snapshot bf4bdfef saved

With the “-f” removed, the backup takes 12 minutes, which is much more what I expected. That completely fixes the problem.

Thanks for your help!

(you might add an option in verbose mode to indicate that a full rescan is being performed…)

I feel one should have checked those args better before replying :(

However, why is restic reporting that the files are new instead of unmodified?

Even if it re-reads them all, doesn’t it know that they are already backed up, and if they are indeed unchanged, should report them as unmodified?

I think this is reported relative to the parent snapshot. With “-f”, restic operates without a parent snapshot, so every file counts as “new” more or less by definition. Maybe the output could be changed to make this clearer (for example “based on parent snapshot XY”).

As I understand it, this part of the output reflects only the scan, not the whole process.
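The explanation above can be sketched in a few lines of Python. This is only a model of the idea that the summary counters are computed relative to the parent snapshot, not restic's real implementation. The `classify` function, the paths and the `(size, mtime)` tuples are all hypothetical.

```python
from collections import Counter
from typing import Optional


def classify(current: dict, parent: Optional[dict]) -> Counter:
    """Count files as new, changed or unmodified relative to a parent snapshot."""
    counts = Counter(new=0, changed=0, unmodified=0)
    for path, meta in current.items():
        if parent is None or path not in parent:
            counts["new"] += 1          # nothing to compare against
        elif parent[path] != meta:
            counts["changed"] += 1      # known path, different metadata
        else:
            counts["unmodified"] += 1   # known path, same metadata
    return counts


# Made-up (size, mtime) pairs for three files.
current = {
    "/raid/home/a": (10, 100),
    "/raid/home/b": (20, 200),
    "/raid/home/c": (30, 300),
}
parent = {
    "/raid/home/a": (10, 100),
    "/raid/home/b": (20, 150),
}

print(dict(classify(current, parent)))  # {'new': 1, 'changed': 1, 'unmodified': 1}
print(dict(classify(current, None)))    # {'new': 3, 'changed': 0, 'unmodified': 0}
```

Without a parent to compare against, every file lands in the “new” bucket regardless of whether its data already exists in the repository, which matches the “4367782 new, 0 changed, 0 unmodified” line from the forced run.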
