Restic for large backup / db

Hi there,

I’ve just started using restic on a number of smaller projects. I’d now like to use it with a much larger client account, to back up a 3GB mysql database and around 600,000 individual files taking up almost 200GB.

My usual setup:
restic backup /path/to/lots/of/files
mysqldump | restic backup --stdin --stdin-filename db.sql

Is this setup ok for larger projects or do you have any advice on how to optimise it? Should I for example run mysqldump first, then compress the file, before backing it up?

Also, what sort of impact on the server can I expect while running this job? Will it noticeably affect performance?

Edit: finally: I understand that restic keeps a file cache. How large are the storage requirements for this file cache, compared to the files being backed up?

Many thanks - Nils

Hey, welcome to the forum!

I personally don’t have any experience with saving large mysql DBs with restic (yet). So if you try it, please report back!

Compressing the file is probably not a good idea: the compressed output changes almost completely from one dump to the next, so the deduplication won’t be able to work.
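For what it's worth, here is a minimal sketch of the uncompressed approach: the dump is streamed straight into restic, as in the original post. The function name `backup_db` is made up for illustration. It assumes bash, that `RESTIC_REPOSITORY` and `RESTIC_PASSWORD` are set in the environment, and that mysqldump picks up its credentials from `~/.my.cnf`. The `--single-transaction` flag is my own addition and only makes sense if the tables are InnoDB.

```shell
#!/usr/bin/env bash
# Sketch only: stream an uncompressed dump into restic so deduplication can work.
# Assumptions: RESTIC_REPOSITORY / RESTIC_PASSWORD are exported, mysqldump reads
# credentials from ~/.my.cnf, and the tables are InnoDB (--single-transaction).
# backup_db is a hypothetical helper name, not part of restic.
backup_db() {
    local db="$1"
    (
        # pipefail makes the pipeline fail if mysqldump fails, not only restic.
        # It is set inside a subshell so the caller's shell options are untouched.
        set -o pipefail
        mysqldump --single-transaction "$db" \
            | restic backup --stdin --stdin-filename "${db}.sql"
    )
}
```

Calling `backup_db mydatabase` would then store the dump as `mydatabase.sql` in the snapshot. Without `pipefail`, a failed mysqldump could still produce a "successful" snapshot of a truncated dump.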

Edit: finally: I understand that restic keeps a file cache. How large are the storage requirements for this file cache, compared to the files being backed up?

Restic keeps a metadata cache, that’s correct. Its size depends on the number of files/dirs and on the total size of the data. I’ve read reports of 1% to 10% metadata (= cache size) relative to the size of the data to be saved.
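Applied to the roughly 200GB from the original post, that 1% to 10% range works out as below. This is only arithmetic on the reported figures, not a measurement, and `estimate_cache_gb` is a made-up helper name.

```shell
# Rough cache-size range from the 1%..10% figure quoted above.
# estimate_cache_gb is a hypothetical helper; input and output are whole GB.
estimate_cache_gb() {
    local data_gb="$1"
    # lower bound = 1% of the data, upper bound = 10% of the data
    echo "$(( data_gb / 100 )) $(( data_gb / 10 ))"
}

estimate_cache_gb 200   # prints "2 20", i.e. somewhere between 2GB and 20GB
```

To see what the cache actually uses after a first run, `du -sh ~/.cache/restic` should work on Linux, assuming the default cache location.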


There’s a parallel thread that is partly about backing up MySQL dumps with restic, see: Side by Side with Duplicity for a Large MySQL Database


Thanks for all the input so far.

I’ve now had a chance to experiment. Restic took 24 minutes to index the 600k files. However we back up to OVH and the backup failed repeatedly with 503 error messages and ‘key already exists’ error messages, similar to https://github.com/restic/restic/issues/1375

Tests with smaller directories worked fine, so it seems to be linked to backing up so many files to OVH in one go.

I really like restic but don’t see a way forward at this stage for such a large backup. Will try some other libraries.

Hey, thanks for the feedback. Which version of restic did you use for the tests?

The problem here isn’t restic as such, but the OVH Swift service. Swift seems to be one of the “eventually consistent” backends, so when restic issues a DELETE operation to remove a file in order to retry uploading it, the deletion may only take effect at some point in the future. When the next upload is started, the file is still there, so the upload fails.

We mitigated this in recent restic versions (0.8.3 in particular contains many fixes in this regard), but it may very well be the case that the OVH Swift service is not well suited to restic. With other Swift-based services (e.g. the one from memset) it works great!

We test all our backends against live services, and for a long time we used the OVH service for that. We had many problems with flaky tests of the Swift backend; in particular, deleted files weren’t really removed for several minutes. When we switched to memset for running the tests (they support us and provided a free test account!), the flakiness was completely gone.

The Swift backend (especially with OVH) is one of the most problematic backends.

Did you give a different service a try? Backblaze B2 or one of the S3-based services (e.g. AWS) are pretty popular and work very well with restic…


Thank you so much for your input. I was using 0.8.1 but have tried again with 0.8.3, unfortunately with the same results.

panic: client.PutObject: HTTP Error: 503: 503 Service Unavailable

In addition a lock remains on the repository afterwards and, after unlocking, the check command yields a lot of errors like this:

pack f7844864fa0a25e9a6dc4312ef32317a05ed81a2745346280bde0000bed6: not referenced in any index
check snapshots, trees and blobs
Fatal: repository contains errors

We are unfortunately tied to OVH for the time being, so I’m not sure I can improve on the situation at the moment.

Hm, that’s a bummer. The panic only occurs once restic has given up after several attempts to upload a file.

The errors in the check command are not critical (they just mean there’s more data saved in the repo than necessary); this will be cleaned up on the next run of restic prune.
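The clean-up described in this thread (remove the leftover lock, prune, then check again) could be scripted roughly like this. It's a sketch rather than an official procedure: `cleanup_after_failed_backup` is a made-up name, and it assumes `RESTIC_REPOSITORY` and `RESTIC_PASSWORD` are set and that no other restic process is still using the repository.

```shell
# Sketch: tidy up a repository after an interrupted backup.
# Assumptions: RESTIC_REPOSITORY / RESTIC_PASSWORD are exported and no other
# restic process is running against this repository.
# cleanup_after_failed_backup is a hypothetical helper name.
cleanup_after_failed_backup() {
    # remove the stale lock left behind by the crashed run
    restic unlock || return 1
    # drop packs that are not referenced in any index
    restic prune || return 1
    # verify that the repository is consistent again
    restic check
}
```

Each step stops the sequence if it fails, so a failing prune is not followed by a misleading check result.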

I’m sorry that it doesn’t work so well for you.