High Memory Usage on Backup, Prune, and Check

jimp · December 26, 2018, 10:34pm

I am new to Restic, coming from rsync (with hard links to simulate snapshots) to a local SAN. I’m using Restic with B2 to store web and email data, currently as full system backups. I might change to per-account Restic repos, since deduplication doesn’t seem to be a game changer: the first backup stored 61G in 53G. I think compression would be the way to go if it were possible, as I’ve seen good results from disk-level compression with my other backups.

I have only tried Restic on one VPS with 4GB of RAM, which holds 1.6 million files. The large portion of that is in IMAP Maildir folders. Unfortunately, Restic did not succeed before I added a 4GB swap file. Watching the process in top shows the memory usage exceed 51% at its peak. ~4GB of memory for a backup process seems way too high to me. It appears the memory usage was the worst with the largest Maildir folders, as some contain over 100k files.

Can Restic be tweaked via CLI arguments to be more memory efficient, even if it increases the backup time some? I am concerned my dataset is a bad fit for Restic. Thanks for any help you can offer.

cdhowie · December 27, 2018, 7:56am

Deduplication is not going to change single backups much, as there isn’t much duplicated data on a single system. Deduplication is most effective at reducing the amount of new data that must be added when a similar backup already exists.

Reserve your judgement until you have taken a second backup, as that’s where deduplication will be the most beneficial. Some stats from repositories I control:

Our production repository is storing 810 backups (from 10 daily full-system backups and 5 daily database backups) with a combined logical size of 8.4TB in a 152GB repository.
My home LAN server has 15 full-system backups with a logical size of 1.2TB in a 108GB repository.

This does not surprise me too much. The way restic stores directories is to create a “tree” object that contains the names of all contained nodes, with a reference to the tree object (for subdirectories) or blob objects (for files) that can be used to restore their contents. I believe these objects are created in memory before being written out to a temporary pack on disk, which means a single tree of 100k files is going to require a decent amount of memory to build.

However, 4GB is a pretty high ceiling even with those numbers. @fd0, do you have any insight here?

I would suggest trying a backup with the --quiet flag. IIRC this disables the scanner, which is only used to show progress. This could significantly reduce the amount of memory required to perform a backup.

moritzdietz · December 27, 2018, 4:18pm

I did some searching, and there are related threads asking the same question. It does not seem to be an unusually high amount of RAM, depending on how many files you have.

https://forum.restic.net/search?q=ram%20usage

jimp · December 27, 2018, 4:23pm

I have observed subsequent snapshots taking far less disk for sure. But I’m thinking separate hosting accounts with practically unique data will see similar disk usage either as one big repo or separate ones. I guess I should try it both ways and compare the savings.

I just tried it and it appeared to peak at 48.9% in top. Down from ~51%, but still very high.

jimp · December 27, 2018, 4:26pm

Thanks. I did a search and saw a few older threads from over a year ago, so I was hoping maybe improvements have been made and I am doing something wrong. I forgot to mention I’m using Restic v0.9.3.

jimp · December 27, 2018, 4:37pm

Here is my backup command:

# restic backup --verbose --verbose --exclude "/proc/*" --exclude "/sys/*" --exclude "/dev/*" --exclude "/run/*" /

I think it’s odd that even from the very first backup, Restic thinks I have 240G to store, but df reports 61G.

# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        80G   61G   20G  76% /

# restic backup ... /
processed 1540357 files, 240.446 GiB in 10:35

I see the correct backup size in the progress, but suddenly it jumps from ~50G to 240G with 80% to go. Then it counts up to the completion. Maybe a bug?

cdhowie · December 27, 2018, 10:53pm

Drop these and use --one-file-system instead. It’s more resistant to the introduction of additional virtual filesystems by your distro. Then, obviously, you have to explicitly list each filesystem you want to back up, if / is not the only one.
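A minimal sketch of that change, assuming / really is the only filesystem to back up. The command is kept in a variable and printed first so it can be reviewed before anything runs; everything except `--one-file-system` is carried over from the command posted above.

```shell
# Replaces the four --exclude patterns with --one-file-system.
# Assumption: "/" is the only filesystem that needs backing up; append
# further mount points to the end of the command if that is not the case.
backup_cmd='restic backup --verbose --verbose --one-file-system /'

# Print for review; run it with: eval "$backup_cmd"
echo "$backup_cmd"
```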

Do you have any sparse files? IIRC restic does not handle sparse files specially, but df will not include the size of any holes in sparse files as used. If you have sparse files with very large holes in them, restic can deduplicate the chunks of zeros, however. Just because restic has found 240GB of data doesn’t mean that it will be adding 240GB of data to the repository, because this stat is taken before deduplication happens.
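To see the gap between the two numbers, here is a small sketch that creates a sparse file and compares its apparent size (what a tool summing file lengths would count) with the space actually allocated (what df accounts for). It assumes GNU coreutils (`truncate`, `du --apparent-size`) and a filesystem that supports sparse files; the scratch directory and file name are made up for the demo.

```shell
# Create a 100 MiB sparse file in a scratch directory.
workdir=$(mktemp -d)
truncate -s 100M "$workdir/sparse.img"

# Apparent size: the file length, holes included.
apparent_kib=$(du --apparent-size -k "$workdir/sparse.img" | cut -f1)
# Allocated size: blocks actually stored on disk.
allocated_kib=$(du -k "$workdir/sparse.img" | cut -f1)

echo "apparent: ${apparent_kib} KiB, allocated: ${allocated_kib} KiB"
rm -rf "$workdir"
```

Running the same two `du` commands against a suspect file on the real system should show whether it is sparse.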

Another possibility is that you haven’t excluded some virtual filesystem (another reason to use --one-file-system) and that virtual filesystem contains about 180GB worth of data.

Edit: I’m not 100% sure how the scanner treats hard links (they are backed up and restored correctly, though) and so it’s possible that the size of the scanned data could be inflated by hard links, but, again, they will be correctly processed during backup.

jimp · December 28, 2018, 7:20pm

I was not aware of any sparse files, but it looks like you are correct. My system has a “185G” sparse /var/log/lastlog file. I excluded that file, and the backup size is actually a few GB less than df now.

Excluding the sparse file did not reduce the memory usage, though.

fd0 · December 31, 2018, 4:41pm

That’s not the case: the scanner is still run (it used to be disabled back in 0.8.0, when backup would wait until the scan had finished).

I did not receive any reports that disabling verbose output (or the scanner) improves memory consumption. Is there something I’m missing?

That’s the case, and this can contribute to high memory usage. However, I observed that memory usage is proportional to the number of blobs (parts of files) in the repo. If you have a few very large files, restic will split these into large blobs, so the repo won’t have so many blobs. On the other hand, when you have many small files you’ll end up with many blobs (at least one per file), and this tends to use a lot of memory. I suspect the index data structure restic holds in memory may be the cause; this is different from the index/ files in the repo.
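Going by the "at least one blob per file" rule of thumb above, the number of regular files gives a rough lower bound on the blob count. A small sketch for getting that number, where `count_files` is a helper name invented here and the scratch directory stands in for a Maildir folder:

```shell
# Count regular files under a directory without crossing filesystem boundaries.
# Caveat: a file name containing a newline would be counted more than once.
count_files() {
    find "$1" -xdev -type f | wc -l | tr -d ' '
}

# Demo on a scratch directory laid out like a tiny Maildir (hypothetical data).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/cur" "$demo_dir/new"
echo "message 1" > "$demo_dir/cur/1"
echo "message 2" > "$demo_dir/cur/2"
echo "message 3" > "$demo_dir/new/3"

file_count=$(count_files "$demo_dir")
echo "files (rough lower bound on blobs): $file_count"
rm -rf "$demo_dir"
```

Pointing `count_files` at individual mail folders should show which ones contribute most to the total.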

I’m not sure this is the case, where did you see that? Sure, the strings used for displaying the status information don’t need to be garbage collected after use, so it can somewhat reduce the memory usage, but I doubt it makes a dent in the overall memory usage…

Indeed, restic’s scanner doesn’t care about hard links, so for a 100MiB file which has a hard link in the backup target it’ll report 200MiB to do.
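The effect can be reproduced without restic. This sketch is not restic's scanner, only an illustration of the same arithmetic: it adds up the length of every file name in a scratch directory, so a hard-linked file is counted twice, and compares that with `du`, which counts each inode once. File names are made up for the demo.

```shell
workdir=$(mktemp -d)
# A 1 MiB file plus a hard link to it: one inode, two names.
dd if=/dev/zero of="$workdir/data.bin" bs=1024 count=1024 2>/dev/null
ln "$workdir/data.bin" "$workdir/data-link.bin"

# Naive scan: sum the length of every name, ignoring shared inodes.
scanned_bytes=0
for f in "$workdir"/*; do
    scanned_bytes=$((scanned_bytes + $(wc -c < "$f")))
done

# du counts the shared inode only once.
du_kib=$(du -sk "$workdir" | cut -f1)

echo "scanned: $scanned_bytes bytes, du: $du_kib KiB"
rm -rf "$workdir"
```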

cdhowie · December 31, 2018, 5:47pm

Unless I’m misunderstanding this PR, didn’t it disable the scanner when in quiet mode?

github.com/restic/restic

Skip archiver.Scan before backup when --quiet is set

restic:master ← bowensong:quiet-skip-scan

opened 08:48PM - 20 Mar 18 UTC

bowensong

+49 -4

### What is the purpose of this change? What does it change?

This PR ensures th…e backup command skips the `archiver.Scan()` before the backup actually starts when the quiet flag `-q` or `--quiet` is set. The reason behind the change is because the scan result is only used for displaying the progress bar and ETA estimation, but the scan result is not used when the quiet flag is set. By skipping the scan before backup, the backup speed improves on large directory trees, especially when the entire directory tree cannot entirely fit into page cache, in which case the directory tree was twice read from the disk before this change is made.

### Was the change discussed in an issue or in the forum before?

See issue #1160

I don't think this PR completely closes the above issue. This is only a quick change to improve the backup performance in some use cases with a minimal change in the code and logic.

### Checklist

- [x] I have read the [Contribution Guidelines](https://github.com/restic/restic/blob/master/CONTRIBUTING.md#providing-patches)
- [x] I have added tests for all changes in this PR
- [x] I have added documentation for the changes (in the manual)
- [x] There's a new file in `changelog/unreleased/` that describes the changes for our users (template [here](https://github.com/restic/restic/blob/master/changelog/changelog-entry.tmpl))
- [x] I have run `gofmt` on the code in all commits
- [x] All commit messages are formatted in the same style as [the other commits in the repo](https://github.com/restic/restic/blob/master/CONTRIBUTING.md#git-commits)
- [x] I'm done, this Pull Request is ready for review

fd0 · December 31, 2018, 6:49pm

Yes, but the code it changes was replaced with the new archiver code in 0.9.0 and does not exist any more (in that form)

cdhowie · December 31, 2018, 6:50pm

Ah, that makes sense. Thanks for the correction.

jimp · December 31, 2018, 7:33pm

Most of the files are small in the Maildir folders. 1KB-30KB is typical for most email messages. But it sounds like the file count itself is the leading cause? I cannot do much about that, except take smaller backups with a snapshot per mailbox.

fd0 · January 1, 2019, 4:05pm

Yeah, that’s one of the workarounds. Unfortunately until we find (and correct or mitigate) the issue within restic there’s only so much you can do. Sorry about that.
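For anyone trying the per-mailbox workaround, a rough dry-run sketch follows. The directory layout (one directory per mailbox under a common root) and the mailbox names are hypothetical, and the loop only prints the commands it would run. It uses restic's `--tag` option to label each snapshot with the mailbox name; whether this lowers peak memory enough for a given dataset would need to be measured.

```shell
# Hypothetical layout: one directory per mailbox under $mail_root.
mail_root=$(mktemp -d)
mkdir "$mail_root/alice" "$mail_root/bob"

plan=""
for mailbox in "$mail_root"/*/; do
    name=$(basename "$mailbox")
    # Dry run: collect the command instead of executing it.
    plan="${plan}restic backup --tag ${name} ${mailbox%/}
"
done

printf '%s' "$plan"
rm -rf "$mail_root"
```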