Hello,
I am currently using restic to back up my data offsite to Google Drive (unlimited).
The main reason for using this backend is the “unlimited” bit.
Below are my stats and the commands I’m using to run restic:
With these times it’s pretty difficult to schedule, say, a weekly backup and prune.
So my question is, is there anything I can do to speed things up without complicating setup too much?
Thank you
Which version of restic is that? The reason I’m asking is that the master branch (and the next release) contains a rewritten implementation of prune which is much faster and also much more configurable. You can find a pre-compiled binary here: https://beta.restic.net/?sort=time&order=desc
It should already be much better with the default configuration, but in your case it may make sense to play around with the new --max-unused flag (e.g. try --max-unused 50%) and the --max-repack-size flag. If you do, please report back!
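For example, an invocation could look roughly like this (just a sketch; the rclone remote and repository path are placeholders since I don’t know your exact setup, and the values are only examples):

$ restic -r rclone:uni-drive:restic-repo prune --max-unused 50% --max-repack-size 10G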
restic 0.10.0 compiled with go1.15.2 on linux/amd64
If 0.11.0 already includes the new prune method I can definitely test it.
I believe part of the problem is also Google Drive, though. Is there a strategy to cache more data on a different backend (e.g. B2) just for the purpose of the prune?
The new prune is not in 0.11.0, but in the latest betas.
If you do not care too much about some extra used space, I’d recommend using --max-unused unlimited - this will be very fast, as it only needs to access data that is stored in the local cache.
Prune also always shows how much unused space still remains in the repository, so if this is too high, you can rerun it with another value of --max-unused. Note that there is also a --dry-run option which shows what would be done.
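A possible sequence (sketch only; the repository option is omitted and the flag values are just examples):

$ restic prune --dry-run --max-unused unlimited   # preview what prune would do
$ restic prune --max-unused unlimited             # actual run; tolerates any amount of unused space, so no repacking for space reasons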
This “collecting packs for deletion and repacking” message comes before a List command is sent to your storage backend to get the list of all pack files. I also realized that some backends do not send the results in parts, but first collect all file names and then send them to be processed. Processing here should be very fast, as it only involves very simple calculations.
So it seems that you were waiting for your backend’s List result here.
Also, thanks for the prune report for your pretty large repository (at least much larger than mine).
From what I can see the time-consuming parts finished pretty quickly, so I would expect that this took much less than 12 hours…
As you could see, no packs have been marked for repacking (which would have been the really time-consuming part). So there is nothing for you to optimize if you are satisfied with the extra ~20 GiB of used space (which is 0.16% of your repo size). There is no need, however, to give --max-unused 50% in your case; anything above 0.16% would have given the same pruning result, so I think you might as well work with the default (5%).
What I found irritating is that the List command and, evidently, the Delete commands of your storage backend took quite a lot of time.
Listing files ran at a speed of ~750 files per second - quite low for just returning the file names stored in the backend. Deleting ran at about 1 file/s - even though this is already parallelized within restic.
So you might think about benchmarking this using direct access to your storage (instead of using restic) to see if this is a backend issue or whether something goes wrong within restic.
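For a rough listing benchmark directly against the repository, something like this could already be telling (the remote and repository path are placeholders for wherever your repo lives):

$ time rclone ls uni-drive:restic-repo/data | wc -l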
The backend is Google Drive; it’s not the best, but it does have truly unlimited space for orgs.
How could I test it directly? Would rclone work or would I need to make direct API calls?
I’m not currently using a custom API key, just the default one, which might be one reason it’s slower than expected.
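For reference, a custom OAuth client would go into rclone.conf roughly like this (the values are placeholders; the token and remaining options stay as rclone config generated them):

[uni-drive]
type = drive
client_id = <your-own-client-id>
client_secret = <your-own-client-secret>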
Here is some output:
$ rclone ls uni-drive:anime | wc -l
503
$ time rclone ls uni-drive:anime
real 0m1.731s
user 0m0.221s
sys 0m0.056s
$ rclone ls uni-drive:music | wc -l
2156
$ time rclone ls uni-drive:music
real 0m57.045s
user 0m0.415s
sys 0m0.076s
$ time rclone ls uni-drive:music
real 0m4.793s
user 0m0.315s
sys 0m0.155s
The only outlier seems to be one listing of the music folder, which takes almost a minute.
It seems that, for some reason, it hits some form of rate limit even for file listings.
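One way I could check whether it is really rate limiting (just an idea; -vv enables rclone’s debug output, where retries caused by rate limits should show up, and --tpslimit caps the request rate):

$ rclone ls uni-drive:music -vv
$ time rclone ls uni-drive:music --tpslimit 5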
As a side note, I noticed that with the new version, when running as a systemd unit, the forget/prune operation outputs a progress line for every second that passes. I remember the behaviour being different before, with no “progress” output reported unless a HUP signal was given.
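If the per-second lines are too noisy in the journal, something like this in the unit file might help, assuming restic’s RESTIC_PROGRESS_FPS environment variable (which controls how often the progress display updates) also applies to this non-terminal status output - I have not verified that, and the ExecStart line is just a placeholder:

[Service]
Environment=RESTIC_PROGRESS_FPS=0.016666
ExecStart=/usr/local/bin/restic prune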
If you can list 500 files in 1.7s and 2000 files in 4.8s, then listing 2,523,000 files in 3419s sounds pretty reasonable to me… So this ~60 minutes for collecting packs must be due to your backend. And yes, this is slow. I did a test myself with my cloud storage and got more than 5000 files/s when listing.
You could add a test for deleting files, which also seemed pretty slow to me, but chances are good that the 1 file/s deletion rate is again due to your cloud storage.
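A rough way to measure deletion directly would be something like this (the bench path is a placeholder; it uploads a handful of small throwaway files and then times their removal):

$ for i in $(seq 1 20); do rclone copyto /etc/hostname uni-drive:bench/file$i; done
$ time rclone delete uni-drive:bench
$ rclone rmdir uni-drive:bench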
So if you don’t want to change the storage provider, there is actually nothing you can improve. Once restic supports larger pack file sizes, you can think about using e.g. 64 MB or 128 MB as the minimum pack size.