> The user's notes must be unencrypted and readable as plain text on the server to create embeddings.
Consult a security expert before doing this, but here’s an idea: encrypt each word of the text, send the encrypted tokens over the wire, and then use an embedder trained on text encrypted with that method.
If you use an asymmetric encryption method, you could even throw away the private key.
The result would still be a substitution cipher on words, so it would not resist frequency analysis. It also won’t help if your users manage to extract the key: they can then encrypt text of their own to work out the mapping. It would, however, protect against people ‘accidentally’ looking at your users’ text.
Periodically switching the encryption key wouldn’t be that hard.
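To make the per-word idea above concrete, here is a minimal sketch. It swaps in a keyed hash (HMAC-SHA256 from the Python standard library) for the "encrypt and throw away the private key" step, since both give a deterministic, one-way mapping from word to token. The function name `encrypt_words`, the key value, and the 16-character truncation are assumptions for illustration only, and this is not a vetted scheme.

```python
import hashlib
import hmac
from collections import Counter


def encrypt_words(text, key):
    """Map each word to a deterministic, keyed, one-way token (hypothetical helper)."""
    tokens = []
    for word in text.lower().split():
        digest = hmac.new(key, word.encode("utf-8"), hashlib.sha256).hexdigest()
        tokens.append(digest[:16])  # truncated only to keep the tokens short
    return tokens


# Hypothetical key that would live on the client, not on the server.
key = b"hypothetical-client-side-key"
tokens = encrypt_words("the cat saw the dog and the bird", key)

# The same word always maps to the same token, which is what would let an
# embedder be trained on the tokens, and also what leaks word frequencies.
counts = Counter(tokens)
most_common_token, occurrences = counts.most_common(1)[0]
print(most_common_token, occurrences)
```

The most frequent token here is whatever "the" maps to, which is exactly the frequency-analysis weakness mentioned above. Note also that a different key yields a different token for every word.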
Which embedding model are you using?
Perhaps pick one with lower memory usage from this list?
https://huggingface.co/spaces/mteb/leaderboard
https://stackoverflow.com/questions/190771/how-secure-is-sen...
Sorry, I should have phrased the last part of the problem better. I already use HTTPS.
The user's notes must be unencrypted and readable as plain text on the server to create embeddings. This defeats the purpose of end-to-end encryption.
I don’t fully understand the question, but maybe you are looking for something like this?
https://aws.amazon.com/ec2/nitro/nitro-enclaves/
I’m not sure if there are implementations for browsers, but look into embeddings with homomorphic encryption.
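For a feel of what homomorphic encryption allows, here is a toy sketch of the Paillier cryptosystem, which is only additively homomorphic. It lets a server compute a weighted sum (one dot product with plaintext weights) over encrypted integers without seeing them. That is far short of running a real embedding model, which has nonlinear layers and would need a fully homomorphic scheme and a proper library. The primes, function names, and the integer-vector setup are all assumptions for illustration, and the key size is nowhere near secure.

```python
import math
import secrets

# Toy key material: two small primes. Real use needs primes of 1024+ bits.
P, Q = 104729, 1299709
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1), private
MU = pow(LAM, -1, N)                               # private, needs Python 3.8+


def encrypt(m):
    """Encrypt integer 0 <= m < N using only the public value N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return ((1 + m * N) * pow(r, N, N2)) % N2


def decrypt(c):
    """Decrypt with the private values LAM and MU."""
    return ((pow(c, LAM, N2) - 1) // N * MU) % N


def encrypted_dot(enc_vec, weights):
    """Server side: dot product of an encrypted vector with plaintext
    non-negative integer weights, without ever decrypting."""
    acc = encrypt(0)
    for c, w in zip(enc_vec, weights):
        # Multiplying ciphertexts adds plaintexts; raising to w scales by w.
        acc = (acc * pow(c, w, N2)) % N2
    return acc


# Client encrypts its vector; server only ever sees ciphertexts.
client_vec = [3, 1, 4]
enc_vec = [encrypt(x) for x in client_vec]
enc_result = encrypted_dot(enc_vec, [2, 7, 1])
print(decrypt(enc_result))  # 3*2 + 1*7 + 4*1 = 17
```

Only the holder of `LAM` and `MU` can read the result, so the server learns nothing about the input vector from the ciphertexts alone.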