Backing up a busy email server

I’ve been using a commercial backup application on my Linux servers (VMs actually) for some years now, with great success. It is file-based, dedupes, encrypts etc, and sends the data directly to B2 or S3 or whatever. Sadly, due to insanity at their management level, the annual cost of this software is going up by a factor of 10 and I simply can’t afford it.

I’ve therefore been going through the alternatives, and I narrowed things down to restic, which seems to me to be the bee’s knees in terms of almost every feature I need and then some. OK, so there’s no central GUI to select files for backup and restore on multiple servers like I currently have, but I can live without that. We almost never need to restore anything anyway.

I do have a concern about using restic on one particular server though, and I’m hoping someone with experience of backing up large amounts of data, consisting of millions of smaller files, could chime in to reassure me (or otherwise!) please?

For context, the servers I back up are multi-function, multi-account hosting servers. They run postfix email, apache web and mysql databases. My normal methodology is to use the built-in backup facility of the hosting control panel we use to create daily backups (daily incremental and weekly full) of all user account data, then use my file-based backup utility - soon to be restic - to essentially encrypt and copy these backups, along with the content of a few specific directories, to a cloud storage service like S3 or B2.
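In restic terms, the job I have in mind would look roughly like the sketch below - the bucket, credentials and paths are placeholders, not my real setup:

```
# One-time setup: point restic at a B2 bucket (all names are placeholders)
export B2_ACCOUNT_ID="applicationKeyId"
export B2_ACCOUNT_KEY="applicationKey"
export RESTIC_REPOSITORY="b2:my-backup-bucket:servers/mailhost"
export RESTIC_PASSWORD_FILE="/root/.restic-pass"

restic init    # run once to create the encrypted repository

# Daily: encrypt, dedupe and upload the control panel backups
# plus a few specific directories
restic backup /var/backups/panel /etc
```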

[A daily disk image snapshot of each system is also done, but that’s for disaster recovery, not individual file recovery]

Surprisingly, at least to me, the dedupe on my old backup application, and on restic in testing, works even on the highly compressed backups created by the control panel so it all works very well for me. The magic of chunks, I suppose.

I have one server that’s a problem though. It has nearly 2TB of email on it, spread across many accounts. The control panel’s backup cannot cope with it, taking hours and hours at a high load average to back up every day, whether the backup is incremental or full.

The solution I came up with was to exclude email from the control panel backups, leaving just web and databases and configs, and to use my old backup application to back up the contents of all the email directories as well as the control panel backup files. This worked, with a daily change of 6GB being sent to cloud storage, the entire backup process normally taking less than an hour, and with only moderate server load for a short period.

Naturally, I want to do the same using restic. I can think of no reason why restic would struggle or cause problems if the old backup worked perfectly well doing the same thing. But I’m a cautious person, and don’t like rushing into things without a second opinion. And clearly restic will use a different technical methodology from my existing backup, different default chunk sizes, and so on.
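Concretely, I expect the daily run to be a single command along these lines (the mail and panel paths, tag and exclude are examples for wherever the control panel puts things):

```
# Daily run: the mailstore itself, plus the (now email-free)
# control panel backups - paths and tag are examples only
restic backup \
  --tag daily \
  --exclude '**/Maildir/tmp' \
  /var/mail \
  /var/backups/panel
```

The exclude is just to skip messages that are mid-delivery; whether it’s needed depends on the mail server’s layout.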

So, does anyone here backup large quantities of email like this with restic? What is your experience? Are there any “gotchas” I should look out for? Obviously I can just go ahead and test it, and I will. But I would love some feedback first.

The disks on these VMs are all solid-state, local RAID6 - I don’t have IO performance worries under normal circumstances and io waits are normally negligible. There’s plenty of CPU grunt and RAM too, and I can always allocate more if need be.

Thanks,

F.

I don’t have any experience with the challenges you write about, but there are quite a few threads in this forum about backing up large datasets, like this one. CERN in particular is worth mentioning: they do huge backups using restic, are discussed in multiple threads, and have quite a few internal presentations about the issue available online.

From what I have read so far, most trouble with huge backups comes from having too little RAM available which, as you write, should not be a problem in your case.

One potentially relevant question is how those emails are stored. restic currently requires a significant amount of memory for folders that contain 100k+ files (subfolders are not relevant).
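If you’re not sure how the mailstore is laid out, a quick shell check along these lines will list any directory that directly contains that many files (the path is just an example):

```
# List directories with 100k+ entries directly inside them
# (counts per directory; subdirectories are counted separately)
find /var/mail -xdev -type d -print0 |
while IFS= read -r -d '' dir; do
  count=$(find "$dir" -mindepth 1 -maxdepth 1 | wc -l)
  [ "$count" -ge 100000 ] && printf '%s\t%s\n' "$count" "$dir"
done
```

Maildir-style storage keeps one file per message split across cur/new/tmp per mailbox folder, so whether any single directory gets that big depends on how large individual mailbox folders are.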

With a lot of changing data you should probably create a file system snapshot and create the backup from that snapshot to avoid inconsistencies.
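For example, if the data lived on LVM, the pattern would be roughly the following (volume and mount names are made up):

```
# Hypothetical sketch: short-lived LVM snapshot, backed up read-only
lvcreate --snapshot --size 10G --name mail-snap /dev/vg0/mail
mkdir -p /mnt/mail-snap
mount -o ro /dev/vg0/mail-snap /mnt/mail-snap
restic backup /mnt/mail-snap
umount /mnt/mail-snap
lvremove -y /dev/vg0/mail-snap
```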

Thanks for the replies.

If subfolders aren’t relevant then there will easily be more than 100K files in total being backed up under the include path for the email directory :frowning:

@MichaelEischer when you say significant amount of memory, what sort of amounts are we talking about? The VM has 24GB allocated to it, and for whatever it might be worth (not a lot?) “free” currently says 4.4GB used and 1.5GB shared (the rest is buffers/cache, and there’s even some actually free).

So, in principle there’s some headroom available, but is it enough?

I do not think we can do filesystem snapshots on this system. It is ext4 without LVM, and I can’t do anything with the snapshots made by the host node itself.

Is doing individual restic backups of each domain’s individual email folder a practical option? Iterating through them all is certainly possible, but there are 300+ of them, and I’m thinking it could result in a very big and confusing mess.
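For the record, what I had in mind was a loop along these lines (the layout under /var/mail is an assumption):

```
# Hypothetical per-domain loop: one snapshot per domain, tagged
# so they can be told apart later with "restic snapshots --tag ..."
for dir in /var/mail/*/; do
  domain=$(basename "$dir")
  restic backup --tag "mail-$domain" "$dir"
done
```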

Can you just try a backup and see if it works within a reasonable timeframe? The first run will take a long time and will only show whether it works at all with the ~20GB of free RAM. The second backup a day later will show how long a “normal” backup will take.

Idea for consistency without a snapshot: maybe try making two backups right after each other. The first one grabs all new mails and will take longer. The second one just adds the mails that arrived during the first run. It’s not perfect, but maybe perfect enough.
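Concretely, something as simple as this, with the path being an example:

```
# Two passes back to back: the second only uploads mail that
# arrived while the first (long) pass was running
time restic backup /var/mail
time restic backup /var/mail
```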

That sounds like you understood the exact opposite of what I wanted to say: the memory usage depends only to a limited extent on the total number of files. That is, 1 million files spread across a few hundred folders are completely unproblematic, but 1 million files in a single folder will lead to significant memory usage, in the range of 1-2GB of RAM.

Other than that, the memory requirements for recent restic versions can be roughly estimated as follows: 1GB RAM per 7 million unique files + 1GB RAM per 7TB of data. Depending on your data set this can considerably overestimate the amount of memory necessary. prune will probably require about double the memory (that is, until restic 0.17.0 is released).
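To make that concrete with made-up numbers: if your 2TB of email were, say, 10 million unique files, the estimate would be about 10/7 ≈ 1.4GB for the files plus 2/7 ≈ 0.3GB for the data, so under 2GB for a backup and roughly double that for prune - comfortable headroom on a 24GB VM.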
