Questions and thoughts on restic security - revocable keys and cryptoperiods

Disclaimer: I am not a security expert.

Question: What is the purpose and concept of the revocable user key in restic (restic key remove)?

If I understand correctly, user key revocation doesn’t prevent users with the revoked password from accessing the data inside the repository (because they already obtained the repo encryption key with restic cat masterkey before their key was revoked).

Perhaps the goal is to prevent a user from accessing a repository with an unmodified client? This doesn’t seem very valuable - it is trivial to overcome as the client is open source. Probably a 1-line change to set the encryption key to a constant instead of retrieving it from the repository.

Further thoughts:

  • Restic seems to have a really nice implementation of data-at-rest-encryption, key-wrapping keys, and data integrity verification.

  • But if I understand correctly, restic does not support the concept of cryptoperiods (NIST 800-57, section 5.3): the ability to re-encrypt data within the repository when needed, for example when a key leaks, a user leaves the organization, or technology improves sufficiently to make old keys crackable.

  • For applications like restic (symmetric data encryption, large volumes, originator-usage, key-wrapping keys), I believe NIST recommends cryptoperiods of < 2 years. (5.3.6)

  • If restic supported re-encryption, the restic key remove revocation function would make a lot more sense. I think restic could support either of two use cases for key revocation:

    1. Preventing users with revoked passwords from accessing the repository at all (by re-encrypting the entire repository to a new encryption key which is not accessible with the revoked password). Note that these users may have taken a copy of the repository while they had access.
    2. Preventing users with revoked passwords from accessing new data in the repository (added after their key was revoked). This would not require re-encryption of old content, but would require the repository to support multiple encryption keys. New data would be encrypted with a new encryption key. Which would mean that every blob in the repository would require a pointer to its encryption key.

Note that implementing #2 above would also allow old data to be protected either fully or partially by migrating old blobs to the new key. It would also allow re-encryption to occur over time - re-encryption could be performed blob-by-blob instead of the entire repo at once.
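To illustrate use case #2, here is a minimal sketch of a repository in which every blob carries a pointer to its encryption key. Everything in it is an assumption made for illustration: the names MultiKeyRepo, rotate and migrate are hypothetical, and the HMAC-based XOR keystream is a toy stand-in rather than restic's actual cipher or repository format.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass, field


def crypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (HMAC-SHA256 in counter mode); the same call encrypts and decrypts."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


@dataclass
class Blob:
    key_id: int        # pointer to the key that encrypted this blob
    nonce: bytes
    ciphertext: bytes


@dataclass
class MultiKeyRepo:
    keys: dict = field(default_factory=dict)    # key_id -> key bytes
    blobs: dict = field(default_factory=dict)   # blob name -> Blob
    current: int = 0

    def rotate(self) -> int:
        """Create a new encryption key; all later writes use it."""
        self.current += 1
        self.keys[self.current] = os.urandom(32)
        return self.current

    def write(self, name: str, plaintext: bytes) -> None:
        nonce = os.urandom(16)
        key = self.keys[self.current]
        self.blobs[name] = Blob(self.current, nonce, crypt(key, nonce, plaintext))

    def read(self, name: str, keys: dict) -> bytes:
        """Decrypt with whatever keys the caller holds; KeyError if the key is missing."""
        blob = self.blobs[name]
        return crypt(keys[blob.key_id], blob.nonce, blob.ciphertext)

    def migrate(self, name: str) -> None:
        """Re-encrypt a single blob under the newest key."""
        self.write(name, self.read(name, self.keys))


repo = MultiKeyRepo()
repo.rotate()                        # key 1
repo.write("old", b"old data")
revoked_keys = dict(repo.keys)       # what a revoked user may have kept a copy of

repo.rotate()                        # key 2, issued after the revocation
repo.write("new", b"new data")

old_before = repo.read("old", revoked_keys)   # still readable with the old key
try:
    repo.read("new", revoked_keys)
    new_readable = True
except KeyError:
    new_readable = False                      # new data is out of reach

repo.migrate("old")                           # blob-by-blob re-encryption
old_key_after = repo.blobs["old"].key_id
```

In this sketch new data is protected as soon as the key is rotated, and old blobs can be migrated one at a time. As noted in #1, migration cannot protect copies of old blobs that were taken before the revocation.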

Am I thinking about this correctly? Is this a valuable feature request?


You’re right, restic does not offer to re-encrypt the whole repository with a new set of master keys. In the situation where at some point in time attackers acquired the master keys for a repository and afterwards retain access to the files in the repository, you cannot lock them out by e.g. changing or removing passwords. Adding an option to re-encrypt data will add significant complexity.

We’ve discussed a master key change in #320 and came to the conclusion that you can achieve your goal with a copy command (which copies all data from one repo to another one, with a new master key), this is discussed in #323. An implementation is not yet available though.

When the master key is changed, all key files need to be changed, which requires having all passwords for the repo (the master key is encrypted with a key derived from each password). That’s a major inconvenience, we cannot easily implement this on restic key remove unfortunately.

In practice, I don’t think users will find themselves in a situation where a master key needs to be changed very often. Suppose a user discovers that attackers got access to the files in the repo as well as the repo password. Then all data in the repo may have already been downloaded and is compromised. The first thing that users will then probably do is disable access to the repo to attackers, so it is not strictly necessary (albeit nice to have) to change the master key.

Let me start by saying that I began looking at the crypto for this system because I wanted to contribute to issue #2409, and so I wanted to understand the crypto before I documented it.

My understanding of how you would like to approach 2409 proceeds from the following two assumptions:

  • You want the documentation to be honest about the crypto implementation, especially any known limitations that might impact restic users
  • The threat model in the design document accurately describes the goals of the crypto implementation

The design documentation currently says:

The restic backup program guarantees…accessing the unencrypted content of stored files and metadata should not be possible without a password for the repository

But it does not mention that once a password is shared with a user, repository access cannot be revoked from that user or anyone he shares the key with. This should be explicitly mentioned, especially since the restic key remove function implies security.

The first thing that users will then probably do is disable access to the repo to attackers, so it is not strictly necessary (albeit nice to have) to change the master key

This seems contradictory to the restic threat model, which states that the “design goals for restic include being able to securely store backups in a location that is not completely trusted, e.g. a shared system where others can potentially access the files”

When the master key is changed, all key files need to be changed, which requires having all passwords for the repo (the master key is encrypted with a key derived from each password). That’s a major inconvenience, we cannot easily implement this on restic key remove unfortunately.

Indeed, I recognize this and agree that it is a difficult problem to solve with the current architecture. What would be the proper design of a system that intended to support such a capability?

Would it be something like the following?

  • User keys would need to be asymmetric instead of symmetric. The public key is stored unprotected in the repo, the user stores their private key externally
  • The repository supports multiple master encryption keys which are used on a blob-by-blob basis
  • When a new master encryption key is created, it is issued to each user by storing it in that user’s keyfile, encrypted with that user’s public key
  • New blobs are always written with the latest key.
  • Old blobs can be migrated from old keys to the newest key

Sorry for my ignorance, but I don’t get this. If you have a repo with a key (not master key, but “password” key), and share that key with another user, then obviously that other user can access the data in the repo (e.g. restore from it). But if you then remove that key (using restic key remove), then that other user will no longer be able to access the data in the repo (unencrypted, obviously). Simply because he/she no longer has a key to use to access the repo.

What am I missing?

Each user’s key is just the master repo key, encrypted with their passphrase. At any time, the user could copy their key locally or decrypt the master key with their passphrase and store that locally. Then, even if their key is removed from the repository, if they still have read access to the repository they can continue to read new data that is added to the repository.


But if you then remove that key (using restic key remove ), then that other user will no longer be able to access the data in the repo (unencrypted, obviously). Simply because he/she no longer has a key to use to access the repo.

Just to reinforce and amplify the entirely-correct response from cdhowie:

Your password allows you to decrypt the ‘master key’ to the repository (which you can see in base64-encoded plaintext with restic cat masterkey). Once you have the decrypted master key, you could make a small change to the client code that would allow it to read the entire repository without needing a password at all.

Revoking your password prevents you from decrypting the master key. But presumably you decrypted it while you had a password, and you kept a copy of it, so revoking your key doesn’t really improve security.
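To make this concrete, here is a minimal sketch of the key-wrapping idea. It assumes a toy XOR keystream in place of restic's real cipher, and the helper names (add_key, open_master_key, crypt_blob) are hypothetical; only the use of scrypt for password-based key derivation matches what restic does.

```python
import hashlib
import hmac
import os


def derive_kek(password: str, salt: bytes) -> bytes:
    """Derive a key-encryption key from a password with scrypt (small n to keep the demo fast)."""
    return hashlib.scrypt(password.encode(), salt=salt, n=2**12, r=8, p=1, dklen=32)


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream from HMAC-SHA256 in counter mode. NOT restic's cipher."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def add_key(key_files: dict, name: str, password: str, master_key: bytes) -> None:
    """Store the master key wrapped under a password-derived key."""
    salt = os.urandom(16)
    kek = derive_kek(password, salt)
    key_files[name] = (salt, xor(master_key, keystream(kek, len(master_key))))


def open_master_key(key_files: dict, name: str, password: str) -> bytes:
    """Unwrap the master key from a user's key file."""
    salt, wrapped = key_files[name]
    return xor(wrapped, keystream(derive_kek(password, salt), len(wrapped)))


def crypt_blob(master_key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt a blob under the master key (XOR is its own inverse)."""
    return xor(data, keystream(master_key + nonce, len(data)))


master_key = os.urandom(32)
key_files = {}
add_key(key_files, "alice", "alice-pw", master_key)
add_key(key_files, "bob", "bob-pw", master_key)

# Bob unwraps the master key while his key file exists and keeps a copy.
bobs_copy = open_master_key(key_files, "bob", "bob-pw")

# Equivalent of removing Bob's key: only his wrapped copy is deleted.
del key_files["bob"]

# Alice later writes new data; the master key itself never changed.
nonce = os.urandom(16)
ciphertext = crypt_blob(master_key, nonce, b"written after revocation")

# Bob's saved master key still decrypts it.
recovered = crypt_blob(bobs_copy, nonce, ciphertext)
```

The point of the sketch is that removing a key file only deletes one wrapped copy of the master key. The data is still encrypted under the same master key, so anyone who kept that key and can still read the files can still decrypt them.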

These types of situations (users with undesired access to keys) are common and unavoidable in crypto systems, which is one of the reasons that cryptoperiod support is a fundamental concept of cryptography. Cryptoperiod is the concept that keys should work for a period of time, after which data is re-encrypted with a new key, thereby preventing users with old keys from decrypting it.

Even with perfect key management (never leaking a password), it’s a fact of life that technology progresses and crypto becomes steadily weaker over time. For these reasons, NIST deals heavily with cryptoperiods in their key management standards, with differing recommendations for different types of systems.

I reviewed the crypto implementation in restic as I prepared for working on #2409. I also read some details from other (more expert) reviews, and I am pretty impressed with the implementation. Key management is notoriously difficult, and restic’s designers made a lot of good decisions.

But I think it’s important for users to understand that restic key remove doesn’t really improve security, and that some of the statements in the documentation are likely to mislead users into thinking their repository is more secure than it actually is. (For example, you had such an impression when you wrote your message above!)

In the big picture, I think there are two ways to proceed:

  1. Change restic to make it align with the statements in the documentation
  2. Change the documentation to make it align with restic’s implementation

(ignoring option 3: do nothing)

I could get behind either approach (although, obviously, end users would prefer a stronger crypto system)

The primary problem to tackle in restic is when to do this, and how to do it concurrent with other processes (because it will take a very long time). A great time to do this would be when prune is rewriting packs to remove unreferenced objects – it can simply write out the new packs using a new master key. This would be effectively free.

A secondary problem is that re-encrypting everything necessarily means retrieving everything from storage. This can be an issue for repositories stored in cloud storage systems that charge egress fees. If a repository is stored in S3, it might be substantially cheaper to run this process on an EC2 instance since the data would never leave AWS, but other object storage systems like B2 don’t offer compute power located next to the storage, so the egress fees would be unavoidable.

In cases where the repository is kept locally and synced to object storage, cloud storage egress fees aren’t an issue, but the entire repository has to be re-uploaded. For larger repositories, this could easily take months. Depending on the implementation, you may also need enough storage capacity to store two copies of the repository (or at least two copies of every pack that needs to be rewritten) but probably only if a shared lock is used. (An exclusive lock means no concurrent access, and the rewrite process could delete old packs immediately after they are re-encrypted with the new key.)
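As a rough back-of-the-envelope check on the "months" estimate, the sketch below computes transfer time and egress cost. The repository size, uplink speed and per-GB price are hypothetical inputs chosen for illustration, not measurements or any provider's actual pricing.

```python
def transfer_days(repo_bytes: float, mbit_per_s: float) -> float:
    """Days needed to move repo_bytes over a link of mbit_per_s megabits per second."""
    seconds = repo_bytes * 8 / (mbit_per_s * 1_000_000)
    return seconds / 86_400


def egress_cost(repo_bytes: float, usd_per_gb: float) -> float:
    """Cost of downloading repo_bytes at a flat per-GB egress price."""
    return repo_bytes / 1e9 * usd_per_gb


# Hypothetical example: a 20 TB repository on a 20 Mbit/s uplink,
# with an assumed egress price of 0.01 USD per GB.
days = transfer_days(20e12, 20)
cost = egress_cost(20e12, 0.01)
print(f"re-upload: about {days:.0f} days, egress: about {cost:.0f} USD")
```

With these assumed numbers the re-upload alone takes roughly three months, which is why an incremental approach looks attractive.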

IMO we should immediately update the documentation, then figure out what the implementation would look like.


The primary problem to tackle in restic is when to do this, and how to do it concurrent with other processes (because it will take a very long time)

It need not be done all at once. I propose to add a new key, but retain the old one as well. Blobs can be migrated bit by bit, over time.

A great time to do this would be when prune is rewriting packs

I agree. Re-encryption could be an option for prune, but it could also be a standalone command like restic re-encrypt snapshot 0e2448f or restic re-encrypt pack xxx

(many concerns about the potential cost, effort and time required to re-encrypt the entire repository)

  1. Cost and effort are always issues with re-encryption, and they must be balanced against the need for security. That decision cannot be made by the developer, only the end user.
  2. Again, I am suggesting a solution that allows multiple encryption keys within the repository, so the user is not FORCED to re-encrypt old data…new data will be protected for free, and the user can decide how much old data is worth the cost to protect it. He can decide what data to re-encrypt, and what data to leave potentially exposed.

You’re right, the design document simplifies here. Thanks for pointing that out.

Again, a simplification. What I meant when I wrote the document was that restic enables storing data on a system that is not completely trusted and nobody is able to access the data if they weren’t given access. Or something along those lines. I think you see where I’m coming from :slight_smile:

You list some great solutions, but they all have in common that complexity is added to restic and the repository format. I’m not judging, I’m just pointing it out.

Oh, thanks for the praise! Just to give you a bit of background on me: I’ve been working as a penetration tester for the last decade or so, so I’m regularly breaking stuff in my dayjob. That has helped tremendously when designing restic, of course. I’m not an expert on cryptography, but I understand the math and I’m regularly breaking real-life crypto systems, so I’m familiar with implementation issues and errors. And I fully agree with you: key management is a major headache :slight_smile:

I can see your point. In hindsight, I regret some aspects of how restic is built, this is one of them. I think it would be great if we could improve the documentation and the CLI so that it becomes clearer. People regularly don’t get it until you explain the gritty details to them (which should not be necessary). I took a lot of inspiration from existing crypto systems, especially LUKS for Linux, which works exactly the same way as restic.

In my opinion, we should improve the documentation, especially the threat model section. In a second step we can then discuss if we want to extend restic (and thereby increasing its complexity) or if we want to live with restic’s limitations.


100% agree that these would probably add substantial complexity to the application, and likely to the repository format

I’d be curious to hear your preliminary thoughts on what types of repo format changes might be required to support multiple encryption keys, assuming the goal of minimizing changes to the repository format, and preserving backward compatibility as much as possible.

I’m insufficiently experienced with restic to make good guesses here, but I would assume:

  • Keyfiles would require an array or hashmap of keys. (Would need to be asymmetric to allow centralized publication of new keys)
  • Keyfiles would need a new attribute for public key (to allow encryption of new keys for each user)
  • Index files would require each blob to have a key attribute
  • Packfiles would also need to link each blob to a key.
    • It might be possible to kludge an ugly approach that doesn’t require changes to the packfile format by using new BlobTypes to link blobs to keys (e.g. Blob type 2 = data_encrypted_with_key2, type 3 = data_encrypted_with_key3) but that’s pretty ugly and it limits the number of keys available.
    • More likely, it would require changes to the Packfile header format, to something like:
Type_Blob1 || Key_Blob1 || Length_Blob1 || Hash_Blob1 
[...]
Type_BlobN || Key_BlobN || Length_BlobN || Hash_BlobN

I might also recommend that Packfiles have a version attribute added to them so that future format changes don’t break backward compatibility. Something like this:

EncryptedBlob1 || ... || EncryptedBlobN || EncryptedHeader || Header_Length || Packfile_Format_Version

The version could alternatively be placed at the beginning of the header.
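Here is a rough sketch of how the header layout above could be serialized and parsed. The field widths (1-byte type, 2-byte key ID, 4-byte length, 32-byte hash), the 1-byte version trailer and the function names are all my assumptions, not restic's actual pack format. For brevity the header is left unencrypted and the hash is taken over the stored bytes.

```python
import hashlib
import struct

# Hypothetical entry: Type || Key || Length || Hash (little-endian, no padding).
ENTRY = struct.Struct("<BHI32s")
# Hypothetical trailer: Header_Length || Packfile_Format_Version.
TRAILER = struct.Struct("<IB")
FORMAT_VERSION = 2  # made-up version number for this sketch


def build_pack(blobs):
    """blobs: list of (blob_type, key_id, encrypted_bytes) tuples."""
    body = b"".join(data for _, _, data in blobs)
    header = b"".join(
        ENTRY.pack(blob_type, key_id, len(data), hashlib.sha256(data).digest())
        for blob_type, key_id, data in blobs
    )
    return body + header + TRAILER.pack(len(header), FORMAT_VERSION)


def parse_header(pack: bytes):
    """Read the trailer first, then walk the fixed-size header entries."""
    header_len, version = TRAILER.unpack(pack[-TRAILER.size:])
    end = len(pack) - TRAILER.size
    header = pack[end - header_len:end]
    entries = [
        ENTRY.unpack_from(header, offset)
        for offset in range(0, header_len, ENTRY.size)
    ]
    return version, entries


pack = build_pack([(0, 1, b"aaaa"), (1, 2, b"bbbbbb")])
version, entries = parse_header(pack)
for blob_type, key_id, length, digest in entries:
    print(blob_type, key_id, length, digest.hex()[:8])
```

Putting the version at the very end means a reader can check it before interpreting anything else in the file, which is the property that would let older and newer pack formats coexist.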

My comments were heartfelt, and I hope that my concerns didn’t sound like attacks. I’m a very happy restic user, and I’d like to contribute back. I’ve always thought that pen testing would be an interesting and peculiar job: Half excited fun and half excruciating detail.

I’ll take a stab at this and submit via a PR
