Restic 0.8 Prune still very slow

I have a 270 GB repository with 3 snapshots at Backblaze B2.
Running restic prune with the latest FreeBSD binary, 0.8 (downloaded from GitHub, not the ports tree), takes roughly 12 hours.

Is this the expected speed of the prune command, or is there something I can do to speed it up?

Will the time needed to run the prune command continue to grow as the number of snapshots and the repository size increase?

Thanks in advance

EDIT: I am quite sure I am using the local cache, but not sure how to confirm this.

Hi, in order to diagnose what’s going on here, it’d be most helpful if you could paste the output of the prune run. There are several different stages, and the local cache does not improve all of them.

Was that the first run with 0.8.0? Did you run a beta before? If this was the first run, restic had to fetch a few more files, which usually takes longer. That happens only on the first run of prune; subsequent runs should fully utilize the cache and be much faster.

You can check if the local cache is used by looking at ~/.cache/restic: Does it contain files? When you run a new backup, you should see new files being added there, in the snapshots and index folders.
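For example, something like this should show it (the directory name is your repository’s ID, so it will differ; data, index, and snapshots are the cache’s subfolders):

$ ls ~/.cache/restic
<repository-id>
$ ls ~/.cache/restic/<repository-id>
data  index  snapshots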

I started the prune run last night. It seems to be taking longer than last time (there are a few more snapshots now).

Here is the output:

$ restic prune
password is correct
counting files in repo
building new index for repo
[7:07:17] 100.00%  54714 / 54714 packs
repository contains 54714 packs (289757 blobs) with 265.188 GiB
processed 289757 blobs: 0 duplicate blobs, 0B duplicate
load all snapshots
find data that is still in use for 6 snapshots
[0:13] 100.00%  6 / 6 snapshots
found 289757 of 289757 data blobs still in use, removing 0 blobs
will remove 0 invalid files
will delete 0 packs and rewrite 0 packs, this frees 0B
counting files in repo
[7:18:49] 100.00%  54714 / 54714 packs
finding old index files
saved new index as 3947e733
remove 3 old index files
done

All backups and prune commands were done with 0.8.0, and I can confirm that the cache directory exists and files are created when running backups.

Thanks

So, what happens here is that the first and last stages (rebuilding the index, listing the contents of all files) take very long. That is a known problem, especially with B2, and I’ve got some ideas on how to mitigate it, but they’re not implemented yet. The local cache can’t really be used to improve this; something else is needed.

Well, the good news is this isn’t my fault :)

Do you have a rough ETA on when you will be able to implement your ideas to speed up the prune operation?

Thanks!

I don’t have any timeline for it, sorry.

No worries, I understand.

Consider this a vote for implementing that feature as soon as is reasonable.

Thanks

I run nightly backups to remote servers, and yeah, prune is too slow to run every night. My solution is to run a prune only every so often. I start with weekly, then adjust the interval between prunes until each prune removes between 10 and 20% of the data for that backup target. For some relatively static data sets, pruning once a month is fine. Others are weekly.

Prune is only to save disk space so don’t bother if you don’t need it.
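Concretely, the schedule ends up looking something like this crontab sketch (times and the retention policy are just placeholders; the repository and credentials are assumed to come from the environment):

# nightly backup
0 2 * * * restic backup /home
# every so often: drop old snapshots, then prune (here: weekly, on Sunday)
0 4 * * 0 restic forget --keep-daily 7 --keep-weekly 4
30 4 * * 0 restic prune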

However, it would be handy if I could tell restic to just ignore any stale locks older than a certain age. I keep having minor issues that cause a prune to be interrupted (no big deal), but then it also breaks the next prune because I failed to do the manual cleanup.

If I were building a backup system around restic I would embed this logic automatically.
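For now I work around it with a small wrapper script, something like this sketch (the repository and password handling are placeholders; restic unlock only removes locks it considers stale, so running it unconditionally before prune should be safe):

#!/bin/sh
# Placeholder repository; B2 credentials (B2_ACCOUNT_ID / B2_ACCOUNT_KEY)
# are assumed to be set elsewhere in the environment.
export RESTIC_REPOSITORY=b2:my-bucket:backups
export RESTIC_PASSWORD_FILE=/etc/restic/password

restic unlock   # clears stale locks left by interrupted runs
restic prune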

Hi everyone,

Given that prune is slow currently (I had prune running for almost 6 days on a 1.5 TB repository in B2 before I killed it), is there any way to tell the amount of space prune would save before actually running it? Could “forget” or some other operation tell how much space would be saved by a prune?
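The closest I’ve found so far is forget’s dry-run mode, which at least previews which snapshots would be removed, though not how much space a prune would free (as far as I can tell, that’s only known once prune has worked out which blobs are still referenced). For example, with a placeholder retention policy:

$ restic forget --dry-run --keep-daily 7 --keep-weekly 4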

It would be useful to balance whether the cost savings from lower storage usage would outweigh the cost of all of the B2 transactions and time that prune requires to run.

BTW, thanks for the quality software. I’m in the process of testing restic with B2 as a replacement for CrashPlan. Other than slow prunes, restic looks great so far.

I am also looking at restic as a replacement for CrashPlan.

Another prune-related question: can I do a backup while a prune operation is running? If so, the speed of prune, while still problematic, would be much less of an issue.

Unfortunately there is not (at least not yet).

Thanks! We have several ideas for improving prune, although development on the most critical function in the whole program (after all, it really deletes data) must be done with great care. One error there could lead to severe data loss, which is unacceptable for a backup program.

No, that’s not possible: The prune operation deletes data, so for safety reasons the repository is locked (exclusively), and trying to run a backup will fail with an error message. We also have ideas on how to improve this situation, but it’ll also take some time to implement.

I use restic to back up to a local external hard drive. I then use rclone to back up the external hard drive to B2. This makes time-intensive operations in restic (prune, for example) much faster, because they are local.

I also don’t seem to incur any B2 transaction fees with this method.

I should still be able to mount the B2 repository in restic for a restore if my local external drive were unavailable for whatever reason.

May not fit your use case scenario, but just thought I would mention it.
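Roughly, the two steps look like this (paths and bucket names are just examples, and it assumes an rclone remote named b2 is already configured):

# back up into the repository on the external drive
$ restic -r /mnt/external/restic-repo backup /home

# one-way mirror of the repository directory to B2, deletions included
$ rclone sync /mnt/external/restic-repo b2:my-bucket/restic-repo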

Just to confirm: When you do this (backup into a local repo, sync that to B2), you can use restic to restore directly from B2; the file/directory format is exactly the same (for exactly that reason) :)
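So if the external drive is unavailable, you can point restic directly at the bucket, along these lines (bucket and paths are placeholders; note that restic’s own B2 backend uses a b2:bucket:path spec, while rclone uses a slash after the bucket):

$ restic -r b2:my-bucket:restic-repo restore latest --target /tmp/restore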

This made a massive difference for me.

I am on a 1 Gbps connection, and currently I find that -o b2.connections=32 reduces the prune time to about 3 hours. It could still be optimized further, but I assume this is not easy, since most backup software is slow when doing similar operations.
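For reference, the full invocation is just the usual prune with the backend option added (the repository spec is a placeholder):

$ restic -r b2:my-bucket:restic-repo -o b2.connections=32 prune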

@x572b Thanks for the suggestion to use a local external HDD and rclone to B2. I was thinking about giving that a shot after seeing rclone mentioned in another thread. It makes a lot of sense to try that.

@scruffydan Thanks for the suggestion on upping the number of B2 connections. I will try that as well. I have a 300 Mbps connection so maybe 32 or fewer will work for me.

I use restic to back up to a local external hard drive. I then use rclone to back up the external hard drive to B2. This makes time-intensive operations in restic (prune, for example) much faster, because they are local.

A follow-up question to this, if I may, as I’ve been thinking about this approach too (my prune estimate to a remote Minio server looks like two weeks…).

rclone will be deleting and adding files at the remote end, presumably, but do you know whether that leaves the remote repository in a broken state during that process? I expect rclone to take a long time on occasion, and I wouldn’t want the remote repository broken during that time, in case I need to rely on it.

Maybe if rclone adds all new files before deleting the ones no longer needed then that would be OK (which seems to be the approach of prune itself, from observing what happens when prune is terminated), but I don’t know enough about either rclone or restic’s prune strategy to be sure.

(I’ve been very impressed with restic to date – the combination of deduplication, hostname and custom-tagging of snapshots and the fuse driver fit my needs for backing up a cluster of servers in a homelab very well).

rclone should be configured and run so that it only syncs from the local to the remote directory, including file deletions. If that’s the case, the repo will be fine once the sync is complete.
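For example (the remote name and paths are placeholders), you can also have rclone compare both sides after the sync, before relying on the remote copy:

$ rclone sync /mnt/external/restic-repo b2:my-bucket/restic-repo
$ rclone check /mnt/external/restic-repo b2:my-bucket/restic-repo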

Thanks – yes forcing one-way sync would be my plan.

I’m just a bit concerned over the “…fine once the sync is complete” caveat as I think that may mean that a failure during the sync (which in my experience could take a very long time) may leave the remote repository unusable.

I’m OK putting off the prune operations for now while I think some more. Coincidentally, I drove to where my remote repo is (Minio on an RPi Zero) this weekend with a laptop, hooked up the drive, and ran the prune in a few hours.

Many thanks for restic. I’ve been very impressed, and the fact that it’s a single static binary is a real plus for me.

Yes, as long as the sync (which you can just restart) hasn’t finished successfully, the remote repo may be in an inconsistent state. restic itself tries to avoid that by carefully choosing the order in which files are written, but rclone doesn’t know anything about this.