Prune performance

Confirmed; it was piping the output to “tee” that made prune appear not to work. All good with the latest version and updates.

Hi all; did some further testing with some real loads.

Leaving this here for folks who come later, as info on how the progress meter works (particularly with my backend, OneDrive for Business via rclone).

Happy to do more testing if anyone has suggestions.

The tl;dr - there is a step in the middle (data files processed) where the progress seems to get stuck (see details below on the lead-up and output). The whole process did eventually work without a hitch, but it took a decent while (40 minutes total, with the no-progress sections accounting for 2/3 of that…)

===

delete snapshots

~1800 snapshots

Got stuck at around ~400 or so - very fast, 1 minute to reach 400, then the next 9 minutes to get to 402.
Terminated via Ctrl-C, ran unlock, and tried again.
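
For reference, the recovery was nothing special - a sketch only (repository and password assumed to come from the usual environment variables, not my exact invocation):

restic unlock
restic prune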

Next run fine:

[3:10] 100.00%  1429 / 1429 files deleted

collect data files for deletion and repacking

Showed the following for a very long time:

0% 0 / 2428 

and then suddenly:

data file fbbcd5de is not referenced in any index and not used -> will be removed.
(((many lines)))
data file ff442e9d is not referenced in any index and not used -> will be removed.
[4:46] 100.00%  2428 / 2428 data files processed

repacking data files

OK, fast initially, but a big slowdown around the 5 minute mark; it went from 56 to 60 over the next 7 minutes:

[11:36] 74.07%  60 / 81 data files repacked

deleting obsolete data files…

Instantly got to 69, but then:

[0:37] 51.11%  69 / 135 files deleted

Took another 8 minutes to get to 74 files deleted, but then it ran really fast and finished.

Your “slow” deletion is most likely a backend issue, especially in combination with OneDrive. There are known issues with OneDrive, see e.g.

https://jaytuckey.name/2020/07/17/problems-with-onedrive-as-a-backend-to-restic-backup-tool/

To debug this you may want to run restic like this:

RCLONE_LOG_LEVEL=DEBUG restic ...
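
If the terminal output gets too noisy, rclone can also be pointed at a log file via its environment variables (assuming a reasonably recent rclone; the variable name follows rclone’s flag-to-environment mapping for --log-file):

RCLONE_LOG_LEVEL=DEBUG RCLONE_LOG_FILE=/tmp/rclone-debug.log restic prune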

I’m in the process of reviewing online backup options and have settled on either B2 or Wasabi as my backend. I was all set on restic until testing prune.

I’ve read through the comment history for several of the PRs (and this thread), and @alexweiss has done some incredible work here - looking forward to seeing it merged. I’m still not entirely sure whether these improvements will reduce the data transfer required for pruning when restic is paired with object storage such as B2/Wasabi, or just reduce the processing power required. Is anyone able to confirm?

The main improvement for remote repositories is the reduced data transfer. The prune reimplementation uses the information that is usually cached locally (plus just the list of objects that exist in your remote backend) to determine what to do. Moreover, you can trade used space for a lower repacking rate (repacking implies reading and re-writing the contents) and fine-tune the pruning using various parameters. E.g. I’m using the new implementation with a cold storage where no file is read from the remote repository during prune. (Note that I use another patch for this to fully work, but that one is about locks and key files and does not change the traffic in any measurable way.)
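
As an illustration only (a sketch; flag names as they appear in recent restic builds that include the new prune - check restic help prune for your version):

restic prune --dry-run --verbose --max-unused 10% --max-repack-size 2G --repack-cacheable-only

Roughly: --max-unused controls how much unused space may remain in the repository (lower means more repacking), --max-repack-size caps how much is repacked per run, and --repack-cacheable-only restricts repacking to cacheable (tree) packs, which is what makes the cold-storage setup possible.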

Amazing, thanks so much for your hard work

@ArandoDrive - something to be careful about: Wasabi has a minimum object storage retention period of 90 days, so if you end up pruning too quickly, it will actually cost you.

Wasabi ingress/egress is free, so the only “extra” cost is for storage deleted prior to 90 days (at the same rate as storage, 0.00016243 USD per GB per day). I use a --keep-daily 28 --keep-weekly 14 forget policy, but still end up with some deleted storage because of repacks, etc. The source for the backup involves the addition/deletion of 5 - 20 GB total per week (once-per-day backup), and the relatively balanced addition/deletion is not intentional, it just works out that way.

For my last billing period I had 2.99 TB total storage and 81.2 GB deleted storage; the costs for each in USD were 14.41 and 0.47.

Downloads for prunes were 14 - 16 GB each. Uploads for prunes varied between 35 - 92 GB each.

I use the rclone backend and have had flawless operation with Wasabi by dropping the prune from the forget operation; the sequence I use is forget, check, prune, check (sketched below).
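
A minimal sketch of that sequence, assuming the repository and password come from the usual environment variables (RESTIC_REPOSITORY / RESTIC_PASSWORD) and using the keep policy mentioned above:

restic forget --keep-daily 28 --keep-weekly 14
restic check
restic prune
restic check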

@kellytrinh Thanks - I’d spotted that but appreciate you flagging it as I’m sure many would easily miss it.

@doscott Thanks for your account of things, really useful. My concern with the current prune was around the bandwidth implications (due to my own limitations). My backups are for personal data that is pretty static, so I don’t expect to feel much pain for the 90 day cost. I’ll likely avoid too much pruning until the new changes are merged to master (where bandwidth will be less of an issue)

Thanks for all the great work done here. Looking forward to the faster prune. I have just had a prune complete using restic 0.9.5, which reduced a B2 repository from 46 TB to 23 TB…

The output of my script shows …


Pruning XXXXXXXX restic snapshots … --keep-weekly 2 --keep-monthly 12

real 124864m1.270s
user 0m0.014s
sys 0m0.010s
Thu Oct 15 22:12:00 BST 2020

Pruning XXXXXXXX restic snapshots finished RC=0


124864 minutes is 86 DAYS!!! A long time to wait for the next backup!


As both PRs were merged, the improved pruning is now available in the latest beta builds, and I have removed the provided binaries from my private GitHub repo.

Hope the new pruning gets widely used - and if you encounter an issue with pruning in the recent master branch, don’t hesitate to open an issue on GitHub!


@alexweiss Nice job, thanks. Will do, and will test the latest restic dev version.