Really slow restore with around 4k very small files

I have been backing up to an Azure blob store located in the same city as my servers. The backup seems quite fast, and I have about 5 GB of backups for 2.5 GB of data due to history. The issue is the restore. For the larger files the restore flies, and I can quickly restore about 1 GB of data. For the next 500 MB it gets slower, and about 2 minutes after starting I hit the 1.5 GB mark and the crawl starts. I left it for over an hour and only got to about 2 GB restored before I gave up and terminated the instance. The instance I was restoring onto had a local SSD and a 10 Gbps internet connection.

For me the restore speed is a major issue, as I planned to use restic in my automation process: I run a single instance of my software in 1 of 2 locations, and the instance can get stopped, backed up, restored in the other location and resumed. I also back up every hour or so while things are running. The issue is that I expected the restore to take minutes, not hours.

Are there any good ways to speed this up? I noticed that Azure holds about 900 files for restic. I am wondering if there is a command to tell restic to download all the files related to the snapshot, even if it doesn't need all of the data in them, and restore from that. It seems like restic is making a hell of a lot of calls to Azure for the data. I set the cache folder thinking that could be the issue, but restic did not seem to put anything there during a restore. If I had to choose between a little extra data usage and a 2+ hour restore for under 3 GB of data, I would choose downloading the extra data.

One idea I thought might help is to download the whole blob container from Azure and then use that as a local repository for restic to restore from, but I am not sure whether that would help or what the best way to do it is.
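A minimal sketch of that idea, assuming the container has already been downloaded to a local directory that preserves the repository layout (`config`, `data/`, `index/`, `keys/`, `snapshots/`). The paths are hypothetical, and the download step itself is left out because it depends on the tool used. The sketch only builds and runs a plain `restic restore` against the local copy; whether this is actually faster is untested here.

```python
import subprocess
from typing import List


def build_restore_command(local_repo: str, target: str,
                          snapshot: str = "latest") -> List[str]:
    """Build the restic command line for restoring from a local repository copy.

    local_repo: directory holding the downloaded copy of the container (hypothetical path).
    target:     directory to restore the files into (hypothetical path).
    snapshot:   snapshot ID, or "latest" for the most recent snapshot.
    """
    return ["restic", "--repo", local_repo, "restore", snapshot, "--target", target]


def restore_from_local_copy(local_repo: str, target: str,
                            snapshot: str = "latest") -> None:
    """Run the restore; restic reads the password from RESTIC_PASSWORD if it is set."""
    subprocess.run(build_restore_command(local_repo, target, snapshot), check=True)


# Example call (commented out so the sketch has no side effects):
# restore_from_local_copy("/mnt/restic-copy", "/srv/restore")
```

With this approach all requests go to the local disk instead of Azure, at the cost of downloading the full repository (about 5 GB here) rather than only the data the snapshot needs.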

There is a known bug that affects restore performance for large files, but I am not aware of problems restoring small files in the latest release (0.9.6). If you can provide a sample repository I can use to reproduce the problem locally, I can have a look (5 GB is relatively small, and you should be able to find a free way to share it). Before you do, however, I suggest you try restoring with restic built from https://github.com/restic/restic/pull/2195, which has a much reworked and much simplified restore implementation.

Sadly I can't share the repo that is causing the issue because of the data in it. I am currently running restic through the resticker Docker image, so I just need to work out how to build it with your branch instead.

I will also try to turn on some verbose logging to find out when and where it slows down. If large files can cause this, as you say, I may have wrongly assumed that the small files were to blame.

@ifedorenko they need your change in master ASAP. 0.9.6 on a 10 Gbps connection was taking over 2 hours before I gave up. I tested your version locally on a 100 Mbps connection; it maxed out the connection the whole time and finished in 5 minutes.
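As a rough sanity check on the numbers reported in this thread, the arithmetic below converts them to throughput. It assumes decimal units (1 GB = 10^9 bytes), that the local connection is 100 Mbit/s, and that the slow phase covered about 0.5 GB in about one hour; the helper names are made up for this sketch.

```python
def mbit_per_s(gigabytes: float, seconds: float) -> float:
    """Average throughput in Mbit/s for a transfer of the given size and duration."""
    return gigabytes * 1e9 * 8 / seconds / 1e6


def max_gigabytes(link_mbit_per_s: float, seconds: float) -> float:
    """Upper bound on the data a fully saturated link can move in the given time."""
    return link_mbit_per_s * 1e6 / 8 * seconds / 1e9


# 0.9.6 on the cloud instance: 1.5 GB in the first 2 minutes,
# then roughly 0.5 GB more over the following hour.
fast_phase = mbit_per_s(1.5, 2 * 60)
slow_phase = mbit_per_s(0.5, 60 * 60)

# PR build on the local connection: link saturated for 5 minutes.
ceiling = max_gigabytes(100, 5 * 60)

print(f"fast phase: {fast_phase:.1f} Mbit/s")
print(f"slow phase: {slow_phase:.2f} Mbit/s")
print(f"most a 100 Mbit/s link can move in 5 minutes: {ceiling:.2f} GB")
```

Under these assumptions the slow phase runs at roughly 1 Mbit/s, far below either link's capacity, which fits the observation that the restore implementation, not the network, was the limit.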
