Is storing key in the backup location really safe?

fd0 · August 21, 2019, 8:23am

Good question! I have answered this a few times already, but I keep forgetting where I wrote it down. So I’ll just answer it here again in a slightly more general and simplified form so I can point people to it.

The question is: Is restic’s practice of storing “key files” in the repository secure?

Background

Almost all data in a restic repository is encrypted with a master key, which is chosen randomly when the repository is initialized. The password entered at initialization time is used (together with the key derivation function scrypt) to derive a key for that password. The master key is then encrypted with the key derived from the password, and the encrypted master key (together with some other data needed for scrypt) is saved to a file in the keys/ subdir in the repo.

The construction is very similar to what other solutions use (e.g. the Linux Unified Key Setup (LUKS), used for disk encryption on Linux).

You can see a sample key file in the repository documentation:

{
    "hostname": "kasimir",
    "username": "fd0",
    "kdf": "scrypt",
    "N": 65536,
    "r": 8,
    "p": 1,
    "created": "2015-01-02T18:10:13.48307196+01:00",
    "data": "tGwYeKoM0C4j4/9DFrVEmMGAldvEn/+iKC3te/QE/6ox/V4qz58FUOgMa0Bb1cIJ6asrypCx/Ti/pRXCPHLDkIJbNYd2ybC+fLhFIJVLCvkMS+trdywsUkglUbTbi+7+Ldsul5jpAj9vTZ25ajDc+4FKtWEcCWL5ICAOoTAxnPgT+Lh8ByGQBH6KbdWabqamLzTRWxePFoYuxa7yXgmj9A==",
    "salt": "uW4fEI1+IOzj7ED9mVor+yTSJFd68DGlGOeLgJELYsTU5ikhG/83/+jGd4KKAaQdSrsfzrdOhAMftTSih5Ux6w=="
}

This JSON document is stored in the repository as is. The field data contains the encrypted master key; the other fields are either metadata (like hostname; there’s an issue to remove them) or parameters needed to run scrypt (like N, r, p and salt).

When a second password is added, the master key is decrypted with the existing password and then encrypted again with the key derived from the new password in a new key file.

When restic is run and the user supplies a password, restic downloads all the files in the keys subdir in the repository. For each file it then derives the password key by running scrypt with the password and the parameters from the file, and then tries to decrypt the data field. If that works, the password is correct and restic can decrypt all other content stored in the repository with the master key.
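The unlock procedure described above can be sketched roughly as follows. This is a minimal Python illustration, not restic’s actual code (restic is written in Go). The `try_decrypt` callback is a hypothetical stand-in for the authenticated decryption of the data field, and the 64-byte derived key length is an assumption made for this sketch.

```python
import base64
import hashlib
import json


def derive_key(password: str, key_file: dict, dklen: int = 64) -> bytes:
    """Run scrypt with the parameters stored in one key file.

    dklen=64 is an assumption of this sketch, not taken from the post.
    """
    n, r, p = key_file["N"], key_file["r"], key_file["p"]
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=base64.b64decode(key_file["salt"]),
        n=n,
        r=r,
        p=p,
        # Python's default memory limit is too low for N=65536, r=8,
        # so allow roughly twice what scrypt needs for these parameters.
        maxmem=2 * 128 * r * (n + p + 2),
        dklen=dklen,
    )


def unlock(password, key_files, try_decrypt):
    """Return the master key from the first key file the password opens.

    key_files is an iterable of JSON strings (the contents of keys/).
    try_decrypt(derived_key, data) is a hypothetical callback that returns
    the decrypted master key, or None if the derived key does not fit.
    """
    for raw in key_files:
        key_file = json.loads(raw)
        derived = derive_key(password, key_file)
        master = try_decrypt(derived, base64.b64decode(key_file["data"]))
        if master is not None:
            return master
    return None
```

Note that a wrong password costs one full scrypt run per key file before `unlock` gives up, which is exactly the cost an attacker has to pay for every guess.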

Analysis

So, let’s analyse this construction from the perspective of attackers who have gained access to the files in the repository (say, by compromising the server that the restic repository is stored on and accessed via sftp), but who don’t have a valid password for the repo.

They can now read all files stored in the repo, most of which are encrypted. The only unencrypted files are the files in the keys/ directory. Let’s say this particular repo has two passwords, therefore there are two files stored in keys/.

The attackers can see the metadata fields, i.e. who created which password on what host (that’s why we plan to remove them). They can also read all the parameters needed to run scrypt. The KDF scrypt not only requires a lot of CPU time for each run, it is also designed to need a lot of memory.

For restic, we’ve configured scrypt so that each run needs at least 60 MiB of RAM and takes about 500 ms (see here). This means that if attackers have a machine with the same CPU power as the host restic runs on, they can try two passwords per second per core.

If attackers use a machine with a GPU, they are still limited by the memory on the GPU. Let’s say the GPU has 16 GiB of memory, then they can still only test ~273 passwords in parallel (since each needs 60 MiB of memory). That number’s quite low.
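The parallelism estimate is simple arithmetic, sketched here in Python. The 60 MiB per guess and the 16 GiB of GPU memory are the figures from the text. The `128 * N * r` formula is the usual approximation for the size of scrypt’s main buffer and is included only as a cross-check against the parameters in the sample key file.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def scrypt_memory_bytes(n: int, r: int) -> int:
    """Approximate size of scrypt's main buffer: 128 * N * r bytes."""
    return 128 * n * r


def parallel_guesses(device_memory: int, memory_per_guess: int) -> int:
    """How many scrypt runs fit into device memory at the same time."""
    return device_memory // memory_per_guess


per_guess = scrypt_memory_bytes(n=65536, r=8)
print(per_guess // MIB)                       # 64
print(parallel_guesses(16 * GIB, 60 * MIB))   # 273
print(parallel_guesses(16 * GIB, per_guess))  # 256
```

With the sample parameters the buffer comes out at 64 MiB, consistent with the "at least 60 MiB" above, and the number of parallel guesses stays in the low hundreds either way.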

I’ve tried to run hashcat (a program typically used to break password hashes on GPUs) with the scrypt parameters mentioned above on a machine with two NVidia GPUs, but it failed to even start computing hashes. I’m not sure what went wrong.

If you have a sufficiently long password (say, 16 characters), I think it’s not realistic to find the password even with a high-powered GPU-based cracking machine.
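To get a feeling for the numbers, here is a rough back-of-the-envelope estimate in Python. It rests on assumptions that are not from the analysis above: the password consists of 16 characters drawn uniformly at random from a-z and 0-9 (36 symbols), and the GPU really manages ~273 guesses in parallel at about 0.5 s each. A password picked by a human is usually much less random, so treat the result as an upper bound.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_search(alphabet_size: int, length: int, guesses_per_second: float) -> float:
    """Expected years to find a uniformly random password.

    On average the attacker has to try half of the keyspace.
    """
    keyspace = alphabet_size ** length
    return (keyspace / 2) / guesses_per_second / SECONDS_PER_YEAR


# Figures from the text: ~273 guesses in parallel, about 0.5 s per guess.
rate = 273 / 0.5

# Assumption: 16 random characters from a 36-symbol alphabet.
print(f"{years_to_search(36, 16, rate):.2e} years")
```

Even if the attackers had a million such GPUs, the estimate under these assumptions would only shrink by six orders of magnitude.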

Trade-Offs

The design of the repository format and restic itself contains some trade-offs.

It would be possible to store the key file only on the local machine instead of in the repository, but we decided against that to improve robustness. The risk that attackers are able to find the password needs to be balanced against the risk of users losing access to their backups: if the key file is only kept locally and the SSD fails, everything is lost.

We also try to keep restic’s complexity under control, so it does not currently offer keeping the key file local as an option.

I hope this answers your question!