Performance differences between restic dump/mount/restore

Are there any resources that discuss the different performance levels of the dump, mount and restore options?

We’re backing up different projects/servers in separate repositories, and the backup and restore process works fine. Occasionally, however, we want to generate a downloadable tar from a snapshot.

Currently we mount the repository, tar all files, and pipe the archive directly into minio, where we then generate a download link. This is usually rather slow.

In this case the project server, the restic repository and the download server have to be separate and we do not have enough space on the project server to first create a local tar which we can then upload to the download server.

If we restore a backup to the local server, it’s pretty fast; however, if we use mount or dump to copy the same files, it’s an order of magnitude slower.

I just ran a test with a rather small test project: the restore finished in ~17 seconds, while the mount and dump options both took roughly 3.5 to 4 minutes. All of them restored/copied to the local filesystem to take the additional upload out of the equation.

The commands for this test were rather simple:

restic restore --target=restictmp_restore [snapshot id]

restic dump [snapshot id] / > restictmp_dump.tar

restic mount /tmp/restic &
rsync -avhP /tmp/restic/ids/[snapshot id]/ restictmp_mount/

Is this to be expected? And if so, is there any way to use the restore command to directly pipe all files into a tar file?

restore and mount/dump use rather different strategies to download data from the repository. restore first plans which files to download and then downloads everything in as few requests as possible. This also means that restore may restore file parts in a seemingly random order. As files are not restored in any particular order, I don’t think it’s possible to directly pipe the output into a tar file.

mount, on the other hand, has to download each file individually when it is accessed (and all parts of the file in order). FUSE, afaik, uses some sort of readahead to optimize that a bit, but small files in particular will still be slow.

dump currently also works file-by-file, so it’s expected to be rather slow. However, as dump knows from the start which files to output, it would be possible to optimize the command to download chunks in parallel and to keep a temporary scratch space to reuse deduplicated file parts.
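
For the download use case from the question, dump can at least stream a tar straight to the download server without a FUSE mount or a local copy. This is only a sketch, not a speed-up: it assumes the MinIO client `mc` is installed, and `myminio` (an mc alias) and `downloads` (a bucket) are hypothetical names, as is the helper function itself.

```shell
# Sketch only: stream a snapshot as a tar archive straight into object storage,
# without a FUSE mount and without a local copy of the archive.
# "myminio" (an mc alias) and "downloads" (a bucket) are hypothetical names.
stream_snapshot() {
    snapshot_id=$1    # restic snapshot ID
    object_name=$2    # name of the tar object to create
    # restic dump writes a tar archive of "/" to stdout; mc pipe uploads stdin.
    restic dump "$snapshot_id" / | mc pipe "myminio/downloads/$object_name"
}

# Usage (not executed here):
#   stream_snapshot [snapshot id] project.tar
```

Note that the exit status of the pipeline is that of `mc`, so a failing `restic dump` could still leave a truncated tar on the server; it is worth checking the result before handing out a link.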

So the performance slow-down is sort of expected. The gap probably grows with the latency between the host running restic and the storage backend.
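
To get a feel for why latency matters so much for file-by-file access, here is a rough back-of-the-envelope estimate. The file count, request count and latency are made-up numbers for illustration only, not measurements, and the model ignores bandwidth and any parallelism.

```shell
# Rough model: total wait ~ number of sequential requests * round-trip latency.
# All numbers below are assumptions for illustration, not measurements.
estimate_seconds() {
    requests=$1      # number of sequential backend requests
    latency_ms=$2    # round-trip latency per request in milliseconds
    echo $(( requests * latency_ms / 1000 ))
}

# Hypothetical: 5000 small files fetched one by one at 40 ms each
file_by_file=$(estimate_seconds 5000 40)
# Hypothetical: the same data fetched in 200 larger requests at 40 ms each
batched=$(estimate_seconds 200 40)

echo "file-by-file: ${file_by_file}s, batched: ${batched}s"
# prints "file-by-file: 200s, batched: 8s"
```

Doubling the assumed latency doubles both numbers, but the absolute gap between them doubles as well, which is why a remote backend hurts mount and dump far more than restore.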

Thanks for the information. It looks like we might need to look for a workaround until the performance of dump improves.

PS: I’ve selected your answer as the “solution”, as the behaviour is currently expected and there is no real “solution”.

You could open a feature request on GitHub. I’m currently not aware of an issue regarding the performance of the dump command.