Clarification on 0.9.0 'prune' slowness

framagno · June 10, 2018, 4:31pm

Hi fd0, and hi everybody.
I want to thank you for the amazing job you are doing. I consider restic a very beautiful piece of software, it works like a charm with rclone (wich I really love, too), and I’d really love to throw away my current setup (duplicity with a custom rclone bridge writtem by be, putting data on Google Drive) in favor of restic.

I have been playing with restic for some weeks, and I will run in parallel to my current setup for some months before I will feel confident enough to make the switch.I currently keep backed up about 4.5 TB of data; my data is almost static, a file can be added or deleted, but almost never changes, so I don’t actually need deduplication, but I love the modernity of the approach, and the outdoing of “full/delta” concept.

Of course, with this huge amound of data, pruning is an issue. I did some studying on the forum, and I think to have the basic comprehension of why pruning is such a delicate operation and why is slow. But I could not figure I would be so slow. When doing my first backup, after 1.2TB of data I got a network failure, and I aborted the operation, so when the networking came back the first thing I did was a prune, to remove files from the partial snapshot.

I am running restic 0.9.0 on FreeBSD 11.1-RELEASE-p10. Those are the logs.

[root@DarkSun:/store/maintenance]# RESTIC_PASSWORD=xxx RESTIC_REPOSITORY=rclone:drive:restic restic --cache-dir /store/maintenance/restic_cache --cleanup-cache prune -v
repository 1cf05352 opened successfully, password is correct
counting files in repo
building new index for repo
[27:45:20] 100.00%  336964 / 336964 packs
repository contains 336964 packs (1421154 blobs) with 1.565 TiB
processed 1421154 blobs: 0 duplicate blobs, 0B duplicate
load all snapshots
find data that is still in use for 12 snapshots
[0:05] 100.00%  12 / 12 snapshots
found 1019138 of 1421154 data blobs still in use, removing 402016 blobs
will remove 0 invalid files
will delete 70942 packs and rewrite 0 packs, this frees 312.236 GiB
counting files in repo
[21:32:34] 100.00%  266022 / 266022 packs
finding old index files
saved new indexes as [c780b3d3 9ad67223 23a9c79d a6ff3774 ff21f497 81996b90 c346afed 48aac91d ec6c72e4 efd14db2 cc926495 eb2e0f91 59bf14e5 61f5fb4a f58d0f82 7d0e0309 b84dd356 38cf46e2 c248140a ca8786aa 36addf68 1fe7553c 580f9b1a 0d752b86 aabc2d06 c9e506f3 d4e3a800 3c5da5ae 1239311f 0fa8dc05 ab69b783 b35ce4fd 3d93d4db 06d34bbb 70e049b0 db81bebd f435a0b3 6a7900bc 021470a9 81b881be 570e22e6 c46c0e0f aae66188 f1f60331 f466b4c3 94434465 920b577e bace6e8b 0c9384cf a5c4ecd3 603af255 07e72bc7 ab3db367 24cd5376 b6d815e0 87da90e1 26368d10 297c9fdb 59eaf46a 2f0a72f4 170ab129 14c3c58f e0302184 2e512240 5d6b8d30 6367e661 1975020f f108218b b5ee383d 52118fa4 a2e1f795 9373d54b 74b546e9 b456ba9f 8b9e0126 5cf8053b 47cdb8ff d222531a 494dc525 605f4e4e f22e91d6 214ea60d c2cbcb9c 719a1ab8 7ae2d02c f0fb3447 86b901c6 d5402d4b ea02f4f8]
remove 666 old index files
[14:49:05] 100.00%  70942 / 70942 packs deleted
done

Notice the 27 hours step to rebuild the indexes, another 21 hours step, and a final 14 hours step to actually delete files.
In a such scenario, I thought that prune would actually be pretty fast, I assumed that should be simple to detect ‘orphan’ packs, that do not belong to any snapshot, but seems it’s not the case. And I clearly can’t understand what happened later.
A few questions (some because I am probably biased from the knowledge of other software):

Why restic has to scan all the packs? I assumed that such operation should rely on indexes, that are only a few megabytes, and can be kept in the local cache. Since prune is so delicate, you may not trust local cache and download it fresh, but this is still a small fraction of whole data.
What is restic actually doing in this step? My monitoring tools did not record any significant network activity in those hours, nor CPU time. We know Google Drive is very slow with lots of operation on small files, but seems restic is not really downloading anything.
If restic rebuilt the whole index in the first step, and correctly identified the 70942 packs / 402016 blobs to remove, what is the purpose of the 21 hours step? In this step, like the first, I noticed almost no network activity, so what is it actually doing?
The last 14 hours step it not really anybody’s fault, just Google is damn slow in deleting lots of file. I’d really love a configurable chunk size (and I’ve seen some feature request), since in my use case I don’t care about loosing some deduplication in favour of having 1GB files, or even bigger, on remote remository. I really look forward this feature.
Doing some test on local storage, it seems that ‘forget --prune’ it’s just an alias for ‘forget && prune’. This means that every time a crontab job will forget some snapshots older than retention time, restic will do this 3 days process? This is a problem for me. But since restic knows what is deleting, can the pruning process be more efficient if done while forgetting, rather than doing is later on?

I wish to say that this post is not anyway a criticism. I just want to understand better the internals and maybe give some hints to make restic more suitable for everybody (I am a software engineer myself, but in a totally different sector).
You are doing a great job and I really appreciate it! Thank you, and thanks if you will answer my question, and help me.

Cheers

fd0 · June 10, 2018, 5:51pm

Thanks for all the praise I think in the first few paragraphs you mentioned “attic” in a few places where “restic” would have been correct… ok, here we go, I’ll try to answer your questions:

Ah, you could have just left the additional files alone, make a backup, and only afterwards run prune. This would have saved you from re-uploading most data. Restic will recognize data in the repo even if it was uploaded during an interrupted backup and isn’t referenced by any snapshot yet.

You’re right, it’s a cheap operation (at least with the local metadata cache, which you don’t have to worry about as a user because it’s managed automatically for you).

I originally implemented this so that restic doesn’t rely on any index in the repo at all, but rather builds a new index by loading the header of each file in the repo. This is meant as a safety feature: no matter what the index files in the repo say, prune will always have an accurate overview of what’s really in the repo. I have plans to change that (use the index files in the repo instead of building a new one, maybe with additional plausibility checks), but that’s not done yet.

Restic is first listing all files in the repo in the data/ subdir, then for each file it’ll download the last few hundred bytes which contain the header. All this is done sequentially (also something which will need to be optimized). In your case, it’s mostly waiting for data I suppose.

Another implementation decision tuned for safety: in this step, restic scans all files in the repo again, minus the ones it’s going to delete anyway. So in your caes, for most files, the header is read again (sequentially).

Ah, this process is also sequentially and will benefit hugely when we delete files in batches or concurrently.

You’re right, forget --prune will run the (lengthy) process every time a single snapshot is removed. That’s not great for you, I agree. I’d suggest running forget regularly, and only run prune once in a while (that’s the reason why forget and prune are two different operations). It sounds like in your case, that’ll be just a tiny bit of data anyway, so why not leave it on GCS a bit longer?

The issue is a bit more complicated, and you haven’t even discovered an important downside yet: while prune is running, no backup can be made because the repository is locked exclusively. We’ve discussed potential solutions for this in #1141, but please don’t add anything to that issue if you can help it, it’s too long already…

Understood, thanks for pointing it out explicitly! I think it’s important to be honest about a project’s limitations

framagno · June 10, 2018, 6:55pm

I think in the first few paragraphs you mentioned “attic” in a few places where “restic” would have been correct…

You are absolutely right, I edited my post, and I apologize for that (the names sound so close).

In your case, it’s mostly waiting for data I suppose.

Yes, I suppose this is my case.

The issue is a bit more complicated, and you haven’t even discovered an important downside yet: while prune is running, no backup can be made because the repository is locked exclusively.

I knew that, but I was not worried about locking itself, but about time, because prune time is proportional of whole archive size, not to the data you are about to remove. So basically in my scenario it’s way better to forget frequently and prune once in a while.

So basically I can summarize that my issues are caused by using a slow (in terms of IOPS, not bandwidth) remote storage. This could be mitigated by parallelism, and by allowing custom sized packs. I’d really love to see this feature sometime in the future.

I think it’s important to be honest about a project’s limitations

Every software has. But this project is still young, and the journey has just begun.
I will surely continue to use it, maybe I will have to wait some more time before dropping duplicity.

Keep up the good job, and THANK YOU (for your reply, too).

cfbao · June 11, 2018, 4:09pm

How is this done with rclone backend? I know many cloud storage providers support partial downloads, but I can’t find any options in rclone that supports this.

fd0 · June 11, 2018, 7:04pm

The protocol between restic and rclone is a special one which allows that (see here). And rclone can pull only a part of a file from any backend, so that’s fine too.

As far as I can see, all cloud provider backends support partial file download out of the box, even strange ones such as BackBlaze.

dionorgua · June 13, 2018, 8:25am

Btw, @framagno if you have enough space to backup locally (or to machine with fast access) it’s MUCH MUCH faster to let every host backup to such ‘fast’ location and run such expensive operations like prune locally. And for offsite backup just use rclone to upload local repository to any supported remote.

framagno · June 13, 2018, 8:45pm

I know @dionorgua (i did a lot of local testing), but I already have local snapshots with ZFS, backup is intended to be off-site.
Allocating 5 more terabytes just to go faster when pruning is not an option, at the moment, and a VPS with 5+ TB of storage accessible via ssh (wich would be a lot faster) is a bit pricey.
But slow prune is not scaring me, I will use restic anyway from now on, hoping that, some day, we’ll have larger pack size and parallel file check.

pkpowell · October 15, 2018, 2:53pm

Here are some stats for ~10TB files macpro osx 10.8 > synology.

initial backup was ~3 days
daily backup is about 30 mins

I started restic -r rest:http://fileserver.local:8000 forget --keep-daily 7 --keep-weekly 5 --keep-monthly 12 --prune a week ago (currently at 2.66%, see below)

What can I do to speed things up?

103 snapshots have been removed, running prune
counting files in repo
building new index for repo
[23:35:02] 100.00% 2081854 / 2081854 packs
incomplete pack file (will be removed): 6c0ee384b43d014ca3b24b8d3c8835a062c27a2248e3305d69fdaf251df36b5b
incomplete pack file (will be removed): 700431db1caaf03038c0dfc2271b8be41c93ce3efe897156b525afbd189796e6
repository contains 2081852 packs (20304089 blobs) with 10.278 TiB
processed 20304089 blobs: 175 duplicate blobs, 279.042 MiB duplicate
load all snapshots
find data that is still in use for 51 snapshots
[5:34:41] 100.00% 51 / 51 snapshots
found 8907348 of 20304089 data blobs still in use, removing 11396741 blobs
will remove 2 invalid files
will delete 321160 packs and rewrite 158484 packs, this frees 1.841 TiB
[132:54:59] 2.66% 4212 / 158484 packs rewritten

764287 · October 15, 2018, 3:33pm

Wait for #1194 to be merged. This PR lets you adjust the treshhold for pack rewrites which is the most time consuming part of prune.