Dynamic scrypt parameters depending on CPU performance (KDFTimeout)

tofran · January 31, 2021, 8:26pm

I have a few questions about how restic generates its encryption keys depending on machine performance.

Background:

Restic dynamically adjusts the scrypt parameters based on CPU performance:

// KDFTimeout specifies the maximum runtime for the KDF.
KDFTimeout = 500 * time.Millisecond

// KDFMemory limits the memory the KDF is allowed to use.
KDFMemory = 60

in internal/repository/key.go; introduced in #557

This means that restic will run a small benchmark and the encryption strength will depend on the performance of the machine.

For context there’s a great explanation by @fd0: Is storing key in the backup location really safe?

Right now, as you can see, KDFTimeout and KDFMemory are hard-coded and cannot be configured.
Please correct me if I’m wrong.

Questions

What exactly does this change? What will be more resource intensive depending on this setting? The key decryption (when you enter your password), the processing of all the files, or both?
Will the encryption strength of the repo be locked to the parameters chosen at initialization? Or will changing your password (restic key passwd) change anything, assuming your machine performance has changed?
If you run restic on a very slow machine, how much easier will the repo be to break? Roughly what order of magnitude, comparing CPU/RAM?
Are there any lower/upper limits on this?
Why isn’t this configurable?

Why this question?

I want to learn a little more on how all this stuff works.
In one of my use cases I run restic in an extremely resource-constrained container. Backup speed is not important to me, and I would like to understand what this setting affects.

Thank you in advance

fd0 · February 2, 2021, 7:43pm

Welcome to the forum!

First, I have some (very small) corrections that I’d like to get out of the way; cryptography unfortunately requires being exact. I know that it can be very complicated and frustrating. Please be aware that I see your effort to understand, and I appreciate it a lot!

The keys used for encryption/signing are chosen at random, independently from the machine performance. What does change is the amount of computation required to get from a valid password to the low-level keys used to encrypt and sign the data in the repository using the Key Derivation Function (KDF).

It’s not the “encryption strength” that changes (the algorithm for encrypting and signing the data in the repo is always the same) but the amount of work (and therefore the time needed) to get from a password to the keys used to encrypt/sign the data.

Correct, KDFTimeout and KDFMemory are hard-coded. If you want to change them, you’ll have to rebuild restic (at least for now).

Now to your questions:

If you initialize the repo on a powerful machine, it’ll take roughly 500ms to get from a password to the encryption/signing keys. If you open the same repository on a low-end machine it may happen that you have to wait for several tens of seconds until the encryption/signing keys are available. If it’s the other way around, the low-end machine will open the repository in roughly 500ms, but the powerful machine can maybe do the same thing in 20ms or so. If you have multiple passwords, the worst case is that the low-end machine needs to try them sequentially, taking much longer overall.

The more computation this requires, the harder it is for attackers to guess passwords and try them in a brute-force approach, even with fast GPUs. So we try to make it as hard as possible for attackers, within reasonable limits.

As always, it’s a question of trade-offs: we didn’t just hardcode some best-practice parameters for the scrypt KDF because that would not take advances in CPU power into account. The same version of restic released in 2015 may still be in use in 2025, so we wanted to make sure that it uses as much computation as possible while still being usable. We therefore decided on 500ms as the maximum time that may be used for key derivation.
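To illustrate the calibration idea, here is a minimal sketch in Go. It is not restic’s actual code and not the calibration library’s algorithm: the doubling strategy, the starting value of N, the fixed r and p, and the helper names (params, calibrate, demo) are all assumptions made for illustration. To keep it deterministic, the timing of a real scrypt run is replaced by a made-up cost model in which runtime grows linearly with N. The memory estimate of roughly 128 * N * r bytes is the usual approximation for scrypt.

```go
package main

import (
	"fmt"
	"time"
)

// params holds the three scrypt cost parameters.
type params struct{ N, R, P int }

// memoryBytes approximates scrypt's memory use (about 128 * N * r bytes).
func (p params) memoryBytes() int { return 128 * p.N * p.R }

// calibrate doubles N for as long as the next candidate stays within both
// the time budget and the memory budget, and returns the last value that fit.
// measure reports how long one key derivation takes with the given params.
func calibrate(timeout time.Duration, memLimitMiB int, measure func(params) time.Duration) params {
	best := params{N: 1 << 10, R: 8, P: 1} // hypothetical starting point
	for {
		next := best
		next.N *= 2
		if next.memoryBytes() > memLimitMiB<<20 || measure(next) > timeout {
			return best
		}
		best = next
	}
}

// demo runs calibrate against a simulated machine that needs microsPerN
// microseconds per unit of N (an assumed cost model, not a measurement),
// using the 500ms / 60 MiB limits quoted in this thread.
func demo(microsPerN int) params {
	measure := func(p params) time.Duration {
		return time.Duration(p.N*microsPerN) * time.Microsecond
	}
	return calibrate(500*time.Millisecond, 60, measure)
}

func main() {
	fmt.Printf("simulated fast machine: %+v\n", demo(5))
	fmt.Printf("simulated slow machine: %+v\n", demo(100))
}
```

In this simulation the fast machine stops because of the memory limit, while the slow machine stops because of the time limit, which is the reason both limits exist.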

The files in the repository are always encrypted with AES and signed with an algorithm called Poly1305 (the details do not matter here). These algorithms are always the same, and their performance is completely independent of how hard the key derivation step is. At the end of the key derivation, restic has the keys used for signing and encrypting the data in the repo (that was the hard step); everything after that is as efficient as it gets.

The amount of computation is locked for each password independently when that password is created; the parameters for the scrypt KDF are written to a file in the keys/ subdir. If you change your password, the calibration (so that it takes 500ms at most) is run again. So if you switched to a more powerful machine and then changed the password with restic key passwd, the KDF will have stronger parameters and it will take much more work to derive the keys from the password.
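If you want to see which parameters were stored for a given password, a small Go sketch like the following could read them from a key file. The JSON field names (kdf, N, r, p, salt, data) are written from memory of restic’s design document, and the sample values are made up, so treat both as assumptions and compare them against a real file from your own keys/ directory.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// kdfParams picks out only the KDF-related fields of a key file.
// Field names are assumed, see the note above the code.
type kdfParams struct {
	KDF string `json:"kdf"`
	N   int    `json:"N"`
	R   int    `json:"r"`
	P   int    `json:"p"`
}

// sampleKey is an invented, shortened key file; salt and data are placeholders.
const sampleKey = `{
  "hostname": "example-host",
  "username": "example-user",
  "kdf": "scrypt",
  "N": 32768,
  "r": 8,
  "p": 1,
  "salt": "placeholder",
  "data": "placeholder"
}`

// parseKDFParams decodes the KDF parameters and ignores all other fields.
func parseKDFParams(raw []byte) (kdfParams, error) {
	var k kdfParams
	err := json.Unmarshal(raw, &k)
	return k, err
}

func main() {
	k, err := parseKDFParams([]byte(sampleKey))
	if err != nil {
		fmt.Println("cannot parse key file:", err)
		return
	}
	fmt.Printf("kdf=%s N=%d r=%d p=%d\n", k.KDF, k.N, k.R, k.P)
}
```

Comparing N across two key files of the same repository gives a rough idea of how much more work one password costs to check than the other.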

That’s hard to answer without having all the details and doing a few tests with e.g. hashcat. In general, even on low-end machines, as long as your password is long and random (or even a complete pass phrase), it’s still very unrealistic for an attacker to brute-force your password. That’s what scrypt is designed to protect against, even with not-so-great parameters.

It’s rather the other way around that we’re mostly trying to protect users against: using a guessable password on a fast machine. If attackers can be restricted in how many passwords per second they can guess, it’s very hard for them to try a lot of passwords, which makes finding even bad passwords much harder.
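To get a feeling for the orders of magnitude, here is a rough back-of-the-envelope calculation in Go. All numbers are illustrative assumptions, not measurements: the password entropy, the per-guess cost (20ms versus 500ms, borrowed from the example timings above), and an attacker running 1000 guesses in parallel. It also assumes the attacker’s hardware needs the same time per guess as the given figure, which is a simplification. On average an attacker has to search half of the password space.

```go
package main

import (
	"fmt"
	"math"
)

// expectedCrackSeconds estimates the average time to find a password chosen
// uniformly from 2^entropyBits candidates, when one guess costs
// secondsPerGuess and parallelGuesses guesses run at the same time.
func expectedCrackSeconds(entropyBits, secondsPerGuess, parallelGuesses float64) float64 {
	guesses := math.Pow(2, entropyBits) / 2 // half the space on average
	return guesses * secondsPerGuess / parallelGuesses
}

func main() {
	const secondsPerYear = 365.25 * 24 * 3600
	const workers = 1000 // hypothetical attacker parallelism

	for _, bits := range []float64{30, 60} { // weak vs. strong password
		for _, cost := range []float64{0.02, 0.5} { // cheap vs. expensive KDF
			years := expectedCrackSeconds(bits, cost, workers) / secondsPerYear
			fmt.Printf("%2.0f-bit password, %.2fs per guess: about %.3g years\n", bits, cost, years)
		}
	}
}
```

The KDF cost only scales the result linearly (here by a factor of 25), while every additional bit of password entropy doubles it. That is why a long random password stays safe even with weak parameters, and why the stronger parameters mainly help guessable passwords.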

As far as I remember the library we’re using to calibrate the scrypt KDF has a lower bound for parameters, it doesn’t get easier than that. But I’d have to check the code to be sure. There are upper limits of what scrypt can do in terms of computation, as far as I remember those are very very high.

Restic is an opinionated project, which means that not everything is configurable. I don’t mean to sound demeaning or so, but the typical user will not be able to understand all the (very subtle) details of what it means to configure the parameters used for scrypt or even the requirements for memory and computational complexity of the KDF. That’s usually not their area of expertise (and it does not need to be!). So we tried to build restic so that it is secure in most cases.

For the same reason, restic has exactly one algorithm for encryption and one algorithm for signing data. There have been multiple requests to change that and make it configurable (e.g. some people just don’t like AES), but the complexity overhead is not worth the perceived gain in usability.

Restic is easy to change and build, and that’s a nice signal for “you can change all you want, but make sure you know what you’re doing”.

Hope this helps!

tofran · February 2, 2021, 10:55pm

@fd0 thank you so much for your explanation, it is very clear and hopefully will also be valuable for anyone reading.

I also totally agree that such an open source project must be opinionated and cannot fulfill exactly everyone’s requirements.

I’m really thankful for your time and dedication towards restic.