We use the rest-server in append-only mode. Clients need to authenticate.
To benefit from deduplication, I’d like to have one common repo for all clients. But I’d like to restrict clients so that they can read back only the “host” they are entitled to (that is, their own backup). I understand that this doesn’t match restic’s concept of dumb storage backends, but rest-server should be able to implement this.
Currently I’m thinking about a “hack” based on the authentication credentials (ugly) or based on client certificates (clean, but more admin effort).
What you want is much more than what rest-server is doing at the moment. To implement this, a middleware must be able to decrypt the repository and read basically all repository contents before giving data to a client. I don’t think this would be about enhancing rest-server but about implementing a whole new middleware component and new clients.
Hm. So currently the rest-server is as dumb as possible and doesn’t know anything about the data the client sends or retrieves, does it?
I do not know much about the data structure, but I suppose there is some “index”, “blocklist” or whatever, which contains the lists of hashes for the individual directory contents. If the rest-server were able to “associate” these lists with a client-specific attribute, then it (the server) could restrict access, couldn’t it?
It doesn’t know, and it cannot even decrypt the data, as it does not have the encryption key at hand. Your request would imply that an (IMO new) middleware stores the encryption key, as the restic client currently does.
The data stored falls into two categories, doesn’t it?
content blobs (subject to deduplication)
metadata blobs (file system trees, permissions, …)
The latter is specific to the client. If client A wants to access content of client B, it needs to know the blob IDs that client B has in use. Some of them may overlap with the content blobs of A, but that’s deduplication and OK.
If I can prevent A from reading the metadata blobs client B stored on the rest-server, I’d get some level of isolation, wouldn’t I?
The repo layout indicates some structure. IMHO access to the “index” is crucial (if the index contains what I suppose it does, the metadata).
I understand that the encryption key is the same, and I do not want to change this. I’m talking about the server side (rest-server): if this server were able to prevent access to data that I didn’t store, I’d be happy.
I’ll read about the repo layout. (It reminds me of a project I started in 2011, using Perl; it looks like a poor man’s restic.)
This is not possible with the current design (well, unless you hand the repository key to the rest-server and reimplement large parts of the repo format there). If restic uploaded individual blobs to the rest-server, then it would be possible to grant access depending on which client uploaded them. However, a simple implementation of that would be a performance disaster. Requiring a network round trip for every small bunch of bytes is just too slow.
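To put rough numbers on the round-trip argument, here is a back-of-the-envelope sketch. All figures (100 GiB of data, 1 MiB per blob, 16 MiB per pack, 50 ms round trip) are assumptions picked for illustration, not restic measurements, and the model ignores bandwidth and parallel connections entirely.

```python
import math

GIB = 1024 ** 3
MIB = 1024 ** 2


def request_count(total_bytes, object_bytes):
    """Number of uploads needed if every object is sent in its own request."""
    return math.ceil(total_bytes / object_bytes)


def sequential_latency_seconds(total_bytes, object_bytes, rtt_seconds):
    """Time spent purely waiting on round trips, one request after another."""
    return request_count(total_bytes, object_bytes) * rtt_seconds


# Assumed workload and network, for illustration only.
total = 100 * GIB
rtt = 0.05  # 50 ms

per_blob_requests = request_count(total, 1 * MIB)
per_pack_requests = request_count(total, 16 * MIB)

per_blob_wait = sequential_latency_seconds(total, 1 * MIB, rtt)
per_pack_wait = sequential_latency_seconds(total, 16 * MIB, rtt)

print(f"per blob: {per_blob_requests} requests, {per_blob_wait:.0f} s waiting")
print(f"per pack: {per_pack_requests} requests, {per_pack_wait:.0f} s waiting")
```

Under these assumptions, uploading blob by blob needs 102,400 round trips compared to 6,400 for packs. The gap widens further for small metadata blobs, which can be far smaller than 1 MiB.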
Therefore, restic uploads opaque pack files (a collection of blobs) to the rest-server. With the current design the rest-server has no idea at all which blobs are contained in a pack file. Thus, there is no way to only selectively allow access to certain parts within a pack file. To complicate things further, the deduplication happens on the client side based on the repository’s index. Thus, other clients would simply skip storing that blob again, such that the rest-server never sees the client uploading that blob.
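A toy model may make this easier to see. The sketch below is not restic code: `OpaqueServer`, `backup` and the in-memory `index` dict are hypothetical names, encryption is left out, and each backup writes a single pack holding all new blobs. It only illustrates that a server which tracks the uploader of each pack ends up attributing a shared blob to whichever client stored it first.

```python
import hashlib


class OpaqueServer:
    """Toy dumb backend: stores named byte strings and remembers the uploader."""

    def __init__(self):
        self.packs = {}  # pack name -> (uploader, opaque bytes)

    def upload(self, user, name, data):
        self.packs[name] = (user, data)

    def uploader_of(self, name):
        return self.packs[name][0]


def backup(user, chunks, index, server):
    """Client-side dedup: skip blobs already in the shared index, pack the rest."""
    pending = {}  # blob ID -> content, for blobs the index does not know yet
    for chunk in chunks:
        blob_id = hashlib.sha256(chunk).hexdigest()
        if blob_id in index or blob_id in pending:
            continue  # already stored by someone; nothing is sent to the server
        pending[blob_id] = chunk
    if not pending:
        return None
    # A real pack would be encrypted; the server would see only opaque bytes.
    pack_bytes = b"".join(pending.values())
    pack_name = hashlib.sha256(pack_bytes).hexdigest()
    server.upload(user, pack_name, pack_bytes)
    for blob_id in pending:
        index[blob_id] = pack_name
    return pack_name


server = OpaqueServer()
index = {}  # shared repository index: blob ID -> pack name

pack_a = backup("client-a", [b"shared file", b"only on a"], index, server)
pack_b = backup("client-b", [b"shared file", b"only on b"], index, server)

shared_id = hashlib.sha256(b"shared file").hexdigest()
print("shared blob lives in a pack uploaded by:", server.uploader_of(index[shared_id]))
```

In this model client-b never uploads the shared blob, so a rule like “clients may only read packs they uploaded” would stop client-b from restoring its own file, while the pack it would need also contains data that belongs only to client-a.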
In other words, this would require a significant redesign of the repository implementation in restic, a completely new server-side implementation and potentially changes in the repository format to work.
What you’re describing is pretty much what kopia does in server mode: The server holds the encryption keys and makes sure that users only have access to their respective subdirectory. I wrote up some ideas for an orchestration server with two main functions: centralized configuration of clients and, potentially, proxying access to a (separate) storage location. I.e., you’d have credentials for one central repository but would not want to share them with clients.