Search
Items tagged with: symmetricEncryption
Governments are back on their anti-encryption bullshit again.
Between the U.S. Senate’s “EARN IT” Act, the E.U.’s slew of anti-encryption proposals, and Australia’s new anti-encryption law, it’s become clear that the authoritarians in office view online privacy as a threat to their existence.
Normally, when the governments increase their anti-privacy sabre-rattling, technologists start talking more loudly about Tor, Signal, and other privacy technologies (usually only to be drowned out by paranoid people who think Tor and Signal are government backdoors or something stupid; conspiracy theories ruin everything!).
I’m not going to do that.
Instead, I’m going to show you how to add end-to-end encryption to any communication software you’re developing. (Hopefully, I’ll avoid making any bizarre design decisions along the way.)
But first, some important disclaimers:
- Yes, you should absolutely do this. I don’t care how banal your thing is; if you expect people to use it to communicate with each other, you should make it so that you can never decrypt their communications.
- You should absolutely NOT bill the thing you’re developing as an alternative to Signal or WhatsApp.
- The goal of doing this is to increase the amount of end-to-end encryption deployed on the Internet that the service operator cannot decrypt (even if compelled by court order) and make E2EE normalized. The goal is NOT to compete with highly specialized and peer-reviewed privacy technology.
- I am not a lawyer, I’m some furry who works in cryptography. The contents of this blog post is not legal advice, nor is it endorsed by any company or organization. Ask the EFF for legal questions.
The organization of this blog post is as follows: First, I’ll explain how to encrypt and decrypt data between users, assuming you have a key. Next, I’ll explain how to build an authenticated key exchange and a ratcheting protocol to determine the keys used in the first step. Afterwards, I’ll explore techniques for binding authentication keys to identities and managing trust. Finally, I’ll discuss strategies for making it impractical to ever backdoor your software (and impossible to silently backdoor it), just to piss the creeps and tyrants of the world off even more.
You don’t have to implement the full stack of solutions to protect users, but the further you can afford to go, the safer your users will be from privacy-invasive policing.
(Art by Kyume.)
Preliminaries
Choosing a Cryptography Library
In the examples contained on this page, I will be using the Sodium cryptography library. Specifically, my example code will be written with the Sodium-Plus library for JavaScript, since it strikes a good balance between performance and being cross-platform.
const { SodiumPlus } = require('sodium-plus');(async function() { // Select a backend automatically const sodium = await SodiumPlus.auto(); // Do other stuff here})();
Libsodium is generally the correct choice for developing cryptography features in software, and is available in most programming languages,
If you’re prone to choose a different library, you should consult your cryptographer (and yes, you should have one on your payroll if you’re doing things different) about your design choices.
Threat Modelling
Remember above when I said, “You don’t have to implement the full stack of solutions to protect users, but the further you can afford to go, the safer your users will be from privacy-invasive policing”?
How far you go in implementing the steps outlined on this blog post should be informed by a threat model, not an ad hoc judgment.
For example, if you’re encrypting user data and storing it in the cloud, you probably want to pass the Mud Puddle Test:
1. First, drop your device(s) in a mud puddle.
2. Next, slip in said puddle and crack yourself on the head. When you regain consciousness you’ll be perfectly fine, but won’t for the life of you be able to recall your device passwords or keys.
3. Now try to get your cloud data back.Did you succeed? If so, you’re screwed. Or to be a bit less dramatic, I should say: your cloud provider has access to your ‘encrypted’ data, as does the government if they want it, as does any rogue employee who knows their way around your provider’s internal policy checks.
Matthew Green describes the Mud Puddle Test, which Apple products definitely don’t pass.
If you must fail the Mud Puddle Test for your users, make sure you’re clear and transparent about this in the documentation for your product or service.
(Art by Swizz.)
I. Symmetric-Key Encryption
The easiest piece of this puzzle is to encrypt data in transit between both ends (thus, satisfying the loosest definition of end-to-end encryption).
At this layer, you already have some kind of symmetric key to use for encrypting data before you send it, and for decrypting it as you receive it.
For example, the following code will encrypt/decrypt strings and return hexadecimal strings with a version prefix.
const VERSION = "v1";/** * @param {string|Uint8Array} message * @param {Uint8Array} key * @param {string|null} assocData * @returns {string} */async function encryptData(message, key, assocData = null) { const nonce = await sodium.randombytes_buf(24); const aad = JSON.stringify({ 'version': VERSION, 'nonce': await sodium.sodium_bin2hex(nonce), 'extra': assocData }); const encrypted = await sodium.crypto_aead_xchacha20poly1305_ietf_encrypt( message, nonce, key, aad ); return ( VERSION + await sodium.sodium_bin2hex(nonce) + await sodium.sodium_bin2hex(encrypted) );}/** * @param {string|Uint8Array} message * @param {Uint8Array} key * @param {string|null} assocData * @returns {string} */async function decryptData(encrypted, key, assocData = null) { const ver = encrypted.slice(0, 2); if (!await sodium.sodium_memcmp(ver, VERSION)) { throw new Error("Incorrect version: " + ver); } const nonce = await sodium.sodium_hex2bin(encrypted.slice(2, 50)); const ciphertext = await sodium.sodium_hex2bin(encrypted.slice(50)); const aad = JSON.stringify({ 'version': ver, 'nonce': encrypted.slice(2, 50), 'extra': assocData }); const plaintext = await sodium.crypto_aead_xchacha20poly1305_ietf_decrypt( ciphertext, nonce, key, aad ); return plaintext.toString('utf-8');}
Under-the-hood, this is using XChaCha20-Poly1305, which is less sensitive to timing leaks than AES-GCM. However, like AES-GCM, this encryption mode doesn’t provide message- or key-commitment.
If you want key commitment, you should derive two keys from $key
using a KDF based on hash functions: One for actual encryption, and the other as a key commitment value.
If you want message commitment, you can use AES-CTR + HMAC-SHA256 or XChaCha20 + BLAKE2b-MAC.
If you want both, ask Taylor Campbell about his BLAKE3-based design.
A modified version of the above code with key-commitment might look like this:
const VERSION = "v2";/** * Derive an encryption key and a commitment hash. * @param {CryptographyKey} key * @param {Uint8Array} nonce * @returns {{encKey: CryptographyKey, commitment: Uint8Array}} */async function deriveKeys(key, nonce) { const encKey = new CryptographyKey(await sodium.crypto_generichash( new Uint8Array([0x01].append(nonce)), key )); const commitment = await sodium.crypto_generichash( new Uint8Array([0x02].append(nonce)), key ); return {encKey, commitment};}/** * @param {string|Uint8Array} message * @param {Uint8Array} key * @param {string|null} assocData * @returns {string} */async function encryptData(message, key, assocData = null) { const nonce = await sodium.randombytes_buf(24); const aad = JSON.stringify({ 'version': VERSION, 'nonce': await sodium.sodium_bin2hex(nonce), 'extra': assocData }); const {encKey, commitment} = await deriveKeys(key, nonce); const encrypted = await sodium.crypto_aead_xchacha20poly1305_ietf_encrypt( message, nonce, encKey, aad ); return ( VERSION + await sodium.sodium_bin2hex(nonce) + await sodium.sodium_bin2hex(commitment) + await sodium.sodium_bin2hex(encrypted) );}/** * @param {string|Uint8Array} message * @param {Uint8Array} key * @param {string|null} assocData * @returns {string} */async function decryptData(encrypted, key, assocData = null) { const ver = encrypted.slice(0, 2); if (!await sodium.sodium_memcmp(ver, VERSION)) { throw new Error("Incorrect version: " + ver); } const nonce = await sodium.sodium_hex2bin(encrypted.slice(2, 50)); const ciphertext = await sodium.sodium_hex2bin(encrypted.slice(114)); const aad = JSON.stringify({ 'version': ver, 'nonce': encrypted.slice(2, 50), 'extra': assocData }); const storedCommitment = await sodium.sodium_hex2bin(encrypted.slice(50, 114)); const {encKey, commitment} = await deriveKeys(key, nonce); if (!(await sodium.sodium_memcmp(storedCommitment, commitment))) { throw new Error("Incorrect commitment value"); } const plaintext = await sodium.crypto_aead_xchacha20poly1305_ietf_decrypt( ciphertext, nonce, encKey, aad ); return plaintext.toString('utf-8');}
Another design choice you might make is to encode ciphertext with base64 instead of hexadecimal. That doesn’t significantly alter the design here, but it does mean your decoding logic has to accommodate this.
You SHOULD version your ciphertexts, and include this in the AAD provided to your AEAD encryption mode. I used “v1” and “v2” as a version string above, but you can use your software name for that too.
II. Key Agreement
If you’re not familiar with Elliptic Curve Diffie-Hellman or Authenticated Key Exhcanges, the two of the earliest posts on this blog were dedicated to those topics.
Key agreement in libsodium uses Elliptic Curve Diffie-Hellman over Curve25519, or X25519 for short.
There are many schools of thought for extending ECDH into an authenticated key exchange protocol.
We’re going to implement what the Signal Protocol calls X3DH instead of doing some interactive EdDSA + ECDH hybrid, because X3DH provides cryptographic deniability (see this section of the X3DH specification for more information).
For the moment, I’m going to assume a client-server model. That may or may not be appropriate for your design. You can substitute “the server” for “the other participant” in a peer-to-peer configuration.
Head’s up: This section of the blog post is code-heavy.
Update (November 23, 2020): I implemented this design in TypeScript, if you’d like something tangible to work with. I call my library, Rawr X3DH.
X3DH Pre-Key Bundles
Each participant will need to upload an Ed25519 identity key once (which is a detail covered in another section), which will be used to sign bundles of X25519 public keys to use for X3DH.
Your implementation will involve a fair bit of boilerplate, like so:
/** * Generate an X25519 keypair. * * @returns {{secretKey: X25519SecretKey, publicKey: X25519PublicKey}} */async function generateKeyPair() { const keypair = await sodium.crypto_box_keypair(); return { secretKey: await sodium.crypto_box_secretkey(keypair), publicKey: await sodium.crypto_box_publickey(keypair) };}/** * Generates some number of X25519 keypairs. * * @param {number} preKeyCount * @returns {{secretKey: X25519SecretKey, publicKey: X25519PublicKey}[]} */async function generateBundle(preKeyCount = 100) { const bundle = []; for (let i = 0; i < preKeyCount; i++) { bundle.push(await generateKeyPair()); } return bundle;}/** * BLAKE2b( len(PK) | PK_0, PK_1, ... PK_n ) * * @param {X25519PublicKey[]} publicKeys * @returns {Uint8Array} */async function prehashPublicKeysForSigning(publicKeys) { const hashState = await sodium.crypto_generichash_init(); // First, update the state with the number of public keys const pkLen = new Uint8Array([ (publicKeys.length >>> 24) & 0xff, (publicKeys.length >>> 16) & 0xff, (publicKeys.length >>> 8) & 0xff, publicKeys.length & 0xff ]); await sodium.crypto_generichash_update(hashState, pkLen); // Next, update the state with each public key for (let pk of publicKeys) { await sodium.crypto_generichash_update( hashState, pk.getBuffer() ); } // Return the finalized BLAKE2b hash return await sodium.crypto_generichash_final(hashState);}/** * Signs a bundle. Returns the signature. * * @param {Ed25519SecretKey} signingKey * @param {X25519PublicKey[]} publicKeys * @returns {Uint8Array} */async function signBundle(signingKey, publicKeys) { return sodium.crypto_sign_detached( await prehashPublicKeysForSigning(publicKeys), signingKey );}/** * This is just so you can see how verification looks. * * @param {Ed25519PublicKey} verificationKey * @param {X25519PublicKey[]} publicKeys * @param {Uint8Array} signature */async function verifyBundle(verificationKey, publicKeys, signature) { return sodium.crypto_sign_verify_detached( await prehashPublicKeysForSigning(publicKeys), verificationKey, signature );}
This boilerplate exists just so you can do something like this:
/** * Generate some number of X25519 keypairs. * Persist the bundle. * Sign the bundle of publickeys with the Ed25519 secret key. * Return the signed bundle (which can be transmitted to the server.) * * @param {Ed25519SecretKey} signingKey * @param {number} numKeys * @returns {{signature: string, bundle: string[]}} */async function x3dh_pre_key(signingKey, numKeys = 100) { const bundle = await generateBundle(numKeys); const publicKeys = bundle.map(x => x.publicKey); const signature = await signBundle(signingKey, publicKeys); // This is a stub; how you persist it is app-specific: persistBundleNotDefinedHere(signingKey, bundle); // Hex-encode all the public keys const encodedBundle = []; for (let pk of publicKeys) { encodedBundle.push(await sodium.sodium_bin2hex(pk.getBuffer())); } return { 'signature': await sodium.sodium_bin2hex(signature), 'bundle': encodedBundle };}
And then you can drop the output of x3dh_pre_key(secretKey)
into a JSON-encoded HTTP request.
In accordance to Signal’s X3DH spec, you want to use x3dh_pre_key(secretKey, 1)
to generate the “signed pre-key” bundle and x3dn_pre_key(secretKey, 100)
when pushing 100 one-time keys to the server.
X3DH Initiation
This section conforms to the Sending the Initial Message section of the X3DH specification.
When you initiate a conversation, the server should provide you with a bundle containing:
- Your peer’s Identity key (an Ed25519 public key)
- Your peer’s current Signed Pre-Key (an X25519 public key)
- (If any remain unburned) One of your key’s One-Time Keys (an X25519 public key) — and then delete it
If we assume the structure of this response looks like this:
{ "IdentityKey": "...", "SignedPreKey": { "Signature": "..." "PreKey": "..." }, "OneTimeKey": "..." // or NULL}
Then we can write the initiation step of the handshake like so:
/** * Get SK for initializing an X3DH handshake * * @param {object} r -- See previous code block * @param {Ed25519SecretKey} senderKey */async function x3dh_initiate_send_get_sk(r, senderKey) { const identityKey = new Ed25519PublicKey( await sodium.sodium_hex2bin(r.IdentityKey) ); const signedPreKey = new X25519PublicKey( await sodium.sodium_hex2bin(r.SignedPreKey.PreKey) ); const signature = await sodium.sodium_hex2bin(r.SignedPreKey.Signature); // Check signature const valid = await verifyBundle(identityKey, [signedPreKey], signature); if (!valid) { throw new Error("Invalid signature"); } const ephemeral = await generateKeyPair(); const ephSecret = ephemeral.secretKey; const ephPublic = ephemeral.publicKey; // Turn the Ed25519 keys into X25519 keys for X3DH: const senderX = await sodium.crypto_sign_ed25519_sk_to_curve25519(senderKey); const recipientX = await sodium.crypto_sign_ed25519_pk_to_curve25519(identityKey); // See the X3DH specification to really understand this part: const DH1 = await sodium.crypto_scalarmult(senderX, signedPreKey); const DH2 = await sodium.crypto_scalarmult(ephSecret, recipientX); const DH3 = await sodium.crypto_scalarmult(ephSecret, signedPreKey); let SK; if (r.OneTimeKey) { let DH4 = await sodium.crypto_scalarmult( ephSecret, new X25519PublicKey(await sodium.sodium_hex2bin(r.OneTimeKey)) ); SK = kdf(new Uint8Array( [].concat(DH1.getBuffer()) .concat(DH2.getBuffer()) .concat(DH3.getBuffer()) .concat(DH4.getBuffer()) )); DH4.wipe(); } else { SK = kdf(new Uint8Array( [].concat(DH1.getBuffer()) .concat(DH2.getBuffer()) .concat(DH3.getBuffer()) )); } // Wipe keys DH1.wipe(); DH2.wipe(); DH3.wipe(); ephSecret.wipe(); senderX.wipe(); return { IK: identityKey, EK: ephPublic, SK: SK, OTK: r.OneTimeKey // might be NULL };}/** * Initialize an X3DH handshake * * @param {string} recipientIdentity - Some identifier for the user * @param {Ed25519SecretKey} secretKey - Sender's secret key * @param {Ed25519PublicKey} publicKey - Sender's public key * @param {string} message - The initial message to send * @returns {object} */async function x3dh_initiate_send(recipientIdentity, secretKey, publicKey, message) { const r = await get_server_response(recipientIdentity); const {IK, EK, SK, OTK} = await x3dh_initiate_send_get_sk(r, secretKey); const assocData = await sodium.sodium_bin2hex( new Uint8Array( [].concat(publicKey.getBuffer()) .concat(IK.getBuffer()) ) ); /* * We're going to set the session key for our recipient to SK. * This might invoke a ratchet. * * Either SK or the output of the ratchet derived from SK * will be returned by getEncryptionKey(). */ await setSessionKey(recipientIdentity, SK); const encrypted = await encryptData( message, await getEncryptionKey(recipientIdentity), assocData ); return { "Sender": my_identity_string, "IdentityKey": await sodium.sodium_bin2hex(publicKey), "EphemeralKey": await sodium.sodium_bin2hex(EK), "OneTimeKey": OTK, "CipherText": encrypted };}
We didn’t define setSessionKey()
or getEncryptionKey()
above. It will be covered later.
X3DH – Receiving an Initial Message
This section implements the Receiving the Initial Message section of the X3DH Specification.
We’re going to assume the structure of the request looks like this:
{ "Sender": "...", "IdentityKey": "...", "EphemeralKey": "...", "OneTimeKey": "...", "CipherText": "..."}
The code to handle this should look like this:
/** * Handle an X3DH initiation message as a receiver * * @param {object} r -- See previous code block * @param {Ed25519SecretKey} identitySecret * @param {Ed25519PublicKey} identityPublic * @param {Ed25519SecretKey} preKeySecret */async function x3dh_initiate_recv_get_sk( r, identitySecret, identityPublic, preKeySecret) { // Decode strings const senderIdentityKey = new Ed25519PublicKey( await sodium.sodium_hex2bin(r.IdentityKey), ); const ephemeral = new X25519PublicKey( await sodium.sodium_hex2bin(r.EphemeralKey), ); // Ed25519 -> X25519 const senderX = await sodium.crypto_sign_ed25519_pk_to_curve25519(senderIdentityKey); const recipientX = await sodium.crypto_sign_ed25519_sk_to_curve25519(identitySecret); // See the X3DH specification to really understand this part: const DH1 = await sodium.crypto_scalarmult(preKeySecret, senderX); const DH2 = await sodium.crypto_scalarmult(recipientX, ephemeral); const DH3 = await sodium.crypto_scalarmult(preKeySecret, ephemeral); let SK; if (r.OneTimeKey) { let DH4 = await sodium.crypto_scalarmult( await fetchAndWipeOneTimeSecretKey(r.OneTimeKey), ephemeral ); SK = kdf(new Uint8Array( [].concat(DH1.getBuffer()) .concat(DH2.getBuffer()) .concat(DH3.getBuffer()) .concat(DH4.getBuffer()) )); DH4.wipe(); } else { SK = kdf(new Uint8Array( [].concat(DH1.getBuffer()) .concat(DH2.getBuffer()) .concat(DH3.getBuffer()) )); } // Wipe keys DH1.wipe(); DH2.wipe(); DH3.wipe(); recipientX.wipe(); return { Sender: r.Sender, SK: SK, IK: senderIdentityKey };}/** * Initiate an X3DH handshake as a recipient * * @param {object} req - Request object * @returns {string} - The initial message */async function x3dh_initiate_recv(req) { const {identitySecret, identityPublic} = await getIdentityKeypair(); const {preKeySecret, preKeyPublic} = await getPreKeyPair(); const {Sender, SK, IK} = await x3dh_initiate_recv_get_sk( req, identitySecret, identityPublic, preKeySecret, preKeyPublic ); const assocData = await sodium.sodium_bin2hex( new Uint8Array( [].concat(IK.getBuffer()) .concat(identityPublic.getBuffer()) ) ); try { await setSessionKey(senderIdentity, SK); return decryptData( req.CipherText, await getEncryptionKey(senderIdentity), assocData ); } catch (e) { await destroySessionKey(senderIdentity); throw e; }}
And with that, you’ve successfully implemented X3DH and symmetric encryption in JavaScript.
We abstracted some of the details away (i.e. kdf()
, the transport mechanisms, the session key management mechanisms, and a few others). Some of them will be highly specific to your application, so it doesn’t make a ton of sense to flesh them out.
One thing to keep in mind: According to the X3DH specification, participants should regularly (e.g. weekly) replace their Signed Pre-Key in the server with a fresh one. They should also publish more One-Time Keys when they start to run low.
If you’d like to see a complete reference implementation of X3DH, as I mentioned before, Rawr-X3DH implements it in TypeScript.
Session Key Management
Using X3DH to for every message is inefficient and unnecessary. Even the Signal Protocol doesn’t do that.
Instead, Signal specifies a Double Ratchet protocol that combines a Symmetric-Key Ratchet on subsequent messages, and a Diffie-Hellman-based ratcheting protocol.
Signal even specifies integration guidelines for the Double Ratchet with X3DH.
It’s worth reading through the specification to understand their usages of Key-Derivation Functions (KDFs) and KDF Chains.
Although it is recommended to use HKDF as the Signal protocol specifies, you can strictly speaking use any secure keyed PRF to accomplish the same goal.
What follows is an example of a symmetric KDF chain that uses BLAKE2b with 512-bit digests of the current session key; the leftmost half of the BLAKE2b digest becomes the new session key, while the rightmost half becomes the encryption key.
const SESSION_KEYS = {};/** * Note: In reality you'll want to have two separate sessions: * One for receiving data, one for sending data. * * @param {string} identity * @param {CryptographyKey} key */async function setSessionKey(identity, key) { SESSION_KEYS[identity] = key;}async function getEncryptionKey(identity) { if (!SESSION_KEYS[identity]) { throw new Error("No session key for " + identity"); } const blake2bMac = await sodium.crypto_generichash( SESSION_KEYS[identity], null, 64 ); SESSION_KEYS[identity] = new CryptographyKey(blake2bMac.slice(0, 32)); return new CryptographyKey(blake2bMac.slice(32, 64));}
In the interest of time, a full DHRatchet implementation is left as an exercise to the reader (since it’s mostly a state machine), but using the appropriate functions provided by sodium-plus (crypto_box_keypair()
, crypto_scalarmult()
) should be relatively straightforward.
Make sure your KDFs use domain separation, as per the Signal Protocol specifications.
Group Key Agreement
The Signal Protocol specified X3DH and the Double Ratchet for securely encrypting information between two parties.
Group conversations are trickier, because you have to be able to encrypt information that multiple recipients can decrypt, add/remove participants to the conversation, etc.
(The same complexity comes with multi-device support for end-to-end encryption.)
The best design I’ve read to date for tackling group key agreement is the IETF Messaging Layer Security RFC draft.
I am not going to implement the entire MLS RFC in this blog post. If you want to support multiple devices or group conversations, you’ll want a complete MLS implementation to work with.
Brief Recap
That was a lot of ground to cover, but we’re not done yet.
(Art by Khia.)
So far we’ve tackled encryption, initial key agreement, and session key management. However, we did not flesh out how Identity Keys (which are signing keys–Ed25519 specifically–rather than Diffie-Hellman keys) are managed. That detail was just sorta hand-waved until now.
So let’s talk about that.
III. Identity Key Management
There’s a meme among technology bloggers to write a post titled “Falsehoods Programmers Believe About _____”.
Fortunately for us, Identity is one of the topics that furries are positioned to understand better than most (due to fursonas): Identities have a many-to-many relationship with Humans.
In an end-to-end encryption protocol, each identity will consist of some identifier (phone number, email address, username and server hostname, etc.) and an Ed25519 keypair (for which the public key will be published).
But how do you know whether or not a given public key is correct for a given identity?
This is where we segue into one of the hard problems in cryptography, where the solutions available are entirely dependent on your threat model: Public Key Infrastructure (PKI).
Some common PKI designs include:
- Certificate Authorities (CAs) — TLS does this
- Web-of-Trust (WoT) — The PGP ecosystem does this
- Trust On First Use (TOFU) — SSH does this
- Key Transparency / Certificate Transparency (CT) — TLS also does this for ensuring CA-issued certificates are auditable (although it was originally meant to replace Certificate Authorities)
And you can sort of choose-your-own-adventure on this one, depending on what’s most appropriate for the type of software you’re building and who your customers are.
One design I’m particularly fond of is called Gossamer, which is a PKI design without Certificate Authorities, originally designed for making WordPress’s automatic updates more secure (i.e. so every developer can sign their theme and plugin updates).
Since we only need to maintain an up-to-date repository of Ed25519 identity keys for each participant in our end-to-end encryption protocol, this makes Gossamer a suitable starting point.
Gossamer specifies a limited grammar of Actions that can be performed: AppendKey, RevokeKey, AppendUpdate, RevokeUpdate, and AttestUpdate. These actions are signed and published to an append-only cryptographic ledger.
I would propose a sixth action: AttestKey, so you can have WoT-like assurances and key-signing parties. (If nothing else, you should be able to attest that the identity keys of other cryptographic ledgers in the network are authentic at a point in time.)
IV. Backdoor Resistance
In the previous section, I proposed the use of Gossamer as a PKI for Identity Keys. This would provide Ed25519 keypairs for use with X3DH and the Double Ratchet, which would in turn provide session keys to use for symmetric authenticated encryption.
If you’ve implemented everything preceding this section, you have a full-stack end-to-end encryption protocol. But let’s make intelligence agencies and surveillance capitalists even more mad by making it impractical to backdoor our software (and impossible to silently backdoor it).
How do we pull that off?
You want Binary Transparency.
For us, the implementation is simple: Use Gossamer as it was originally intended (i.e. to secure your software distribution channels).
Gossamer provides up-to-date verification keys and a commitment to a cryptographic ledger of every software update. You can learn more about its inspiration here.
It isn’t enough to merely use Gossamer to manage keys and update signatures. You need independent third parties to use the AttestUpdate action to assert one or more of the following:
- That builds are reproducible from the source code.
- That they have reviewed the source code and found no evidence of backdoors or exploitable vulnerabilities.
(And then you should let your users decide which of these independent third parties they trust to vet software updates.)
Closing Remarks
The U.S. Government cries and moans a lot about “criminals going dark” and wonders a lot about how to solve the “going dark problem”.
If more software developers implement end-to-end encryption in their communications software, then maybe one day they won’t be able to use dragnet surveillance to spy on citizens and they’ll be forced to do actual detective work to solve actual crimes.
Y’know, like their job description actually entails?
Let’s normalize end-to-end encryption. Let’s normalize backdoor-resistant software distribution.
Let’s collectively tell the intelligence community in every sophisticated nation state the one word they don’t hear often enough:
Especially if you’re a furry. Because we improve everything! :3
Questions You Might Have
What About Private Contact Discovery?
That’s one of the major reasons why the thing we’re building isn’t meant to compete with Signal (and it MUST NOT be advertised as such):
Signal is a privacy tool, and their servers have no way of identifying who can contact who.
What we’ve built here isn’t a complete privacy solution, it’s only providing end-to-end encryption (and possibly making NSA employees cry at their desk).
Does This Design Work with Federation?
Yes. Each identifier string can be [username] at [hostname].
What About Network Metadata?
If you want anonymity, you want to use Tor.
Why Are You Using Ed25519 Keys for X3DH?
If you only read the key agreement section of this blog post and the fact that I’m passing around Ed25519 public keys seems weird, you might have missed the identity section of this blog post where I suggested piggybacking on another protocol called Gossamer to handle the distribution of Ed25519 public keys. (Gossamer is also beneficial for backdoor resistance in software update distribution, as described in the subsequent section.)
Furthermore, we’re actually using birationally equivalent X25519 keys derived from the Ed25519 keypair for the X3DH step. This is a deviation from what Signal does (using X25519 keys everywhere, then inventing an EdDSA variant to support their usage).
const publicKeyX = await sodium.crypto_sign_ed25519_pk_to_curve25519(foxPublicKey);const secretKeyX = await sodium.crypto_sign_ed25519_sk_to_curve25519(wolfSecretKey);
(Using fox/wolf instead of Alice/Bob, because it’s cuter.)
This design pattern has a few advantages:
- It makes Gossamer integration seamless, which means you can use Ed25519 for identities and still have a deniable X3DH handshake for 1:1 conversations while implementing the rest of the designs proposed.
- This approach to X3DH can be implemented entirely with libsodium functions, without forcing you to write your own cryptography implementations (i.e. for XEdDSA).
The only disadvantages I’m aware of are:
- It deviates from Signal’s core design in a subtle way that means you don’t get to claim the exact same advantages Signal does when it comes to peer review.
- Some cryptographers are distrustful of the use of birationally equivalent X25519 keys from Ed25519 keys (although there isn’t a vulnerability any of them have been able to point me to that doesn’t involve torsion groups–which libsodium’s implementation already avoids).
If these concerns are valid enough to decide against my implementation above, I invite you to talk with cryptographers about your concerns and then propose alternatives.
Has Any of This Been Implemented Already?
You can find implementations for the designs discussed on this blog post below:
- Rawr-X3DH implements X3DH in TypeScript (added 2020-11-23)
I will update this section of the blog post as implementations surface.
https://soatok.blog/2020/11/14/going-bark-a-furrys-guide-to-end-to-end-encryption/
#authenticatedEncryption #authenticatedKeyExchange #crypto #cryptography #encryption #endToEndEncryption #libsodium #OnlinePrivacy #privacy #SecurityGuidance #symmetricEncryption
Zoom recently announced that they were going to make end-to-end encryption available to all of their users–not just customers.https://twitter.com/zoom_us/status/1320760108343652352
This is a good move, especially for people living in countries with inept leadership that failed to address the COVID-19 pandemic and therefore need to conduct their work and schooling remotely through software like Zoom. I enthusiastically applaud them for making this change.
End-to-end encryption, on by default, is a huge win for everyone who uses Zoom. (Art by Khia.)
The end-to-end encryption capability arrives on the heels of their acquisition of Keybase in earlier this year. Hiring a team of security experts and cryptography engineers seems like a good move overall.
Upon hearing this news, I decided to be a good neighbor and take a look at their source code, with the reasoning, “If so many people’s privacy is going to be dependent on Zoom’s security, I might as well make sure they’re not doing something ridiculously bad.”
Except I couldn’t find their source code anywhere online. But they did publish a white paper on Github…
(Art by Khia.)
Disclaimers
What follows is the opinion of some guy on the Internet with a fursona–so whether or not you choose to take it seriously should be informed by this context. It is not the opinion of anyone’s employer, nor is it endorsed by Zoom, etc. Tell your lawyers to calm their nips.More importantly, I’m not here to hate on Zoom for doing a good thing, nor on the security experts that worked hard on making Zoom better for their users. The responsibility of security professionals is to the users, after all.
Also, these aren’t zero-days, so don’t try to lecture me about “responsible” disclosure. (That term is also problematic, by the way.)
Got it? Good. Let’s move on.
(Art by Swizz.)
Bizarre Design Choices in Version 2.3 of Zoom’s E2E White Paper
Note: I’ve altered the screenshots to be white text on a black background, since my blog’s color scheme is darker than a typical academic PDF. You can find the source here.Cryptographic Algorithms
It’s a little weird that they’re calculating a signature over SHA256(Context) || SHA256(M), considering Ed25519 uses SHA512 internally.
It would make just as much sense to sign Context || M directly–or, if pre-hashing large streams is needed, SHA512(Context || M).
At the top of this section, it says it uses libsodium’s
crypto_box
interface. But then they go onto… not actually use it.Instead, they wrote their own protocol using HKDF, two SHA256 hashes, and XChaCha20-Poly1305.
While secure, this isn’t really using the crypto_box interface.
The only part of the libsodium interface that’s being used is
[url=https://github.com/jedisct1/libsodium/blob/927dfe8e2eaa86160d3ba12a7e3258fbc322909c/src/libsodium/crypto_box/curve25519xsalsa20poly1305/box_curve25519xsalsa20poly1305.c#L35-L46]crypto_box_beforenm()[/url]
, which could easily have been a call tocrypto_scalarmult()
instead (since they’re passing the output of the scalar multiplication to HKDF anyway).(Art by Riley.)
Also, the SHA256(a) || SHA256(b) pattern returns. Zoom’s engineers must love SHA256 for some reason.
This time, it’s in the additional associated data for the XChaCha20-Poly1305.
Binding the ciphertext and the signature to the same context string is a sensible thing to do, it’s just the concatenation of SHA256 hashes is a bit weird when SHA512 exists.
Meeting Leader Security Code
Here we see Zoom using the a SHA256 of a constant string (“
Zoombase-1-ClientOnly-MAC-SecurityCode
“) in a construction that tries but fails to be HMAC.And then they concatenate it with the SHA256 hash of the public key (which is already a 256-bit value), and then they hash the whole thing again.
It’s redundant SHA256 all the way down. The redundancy of “MAC” and “SecurityCode” in their constant string is, at least, consistent with the rest of their design philosophy.
It would be a real shame if double-hashing carried the risk of invalidating security proofs, or if the security proof for HMAC required a high Hamming distance of padding constants and this design decision also later saved HMAC from related-key attacks.
Hiding Personal Details
Wait, you’re telling me Zoom was aware of HMAC’s existence this whole time?I give up!
Enough Pointless Dunking, What’s the Takeaway?
None of the design decisions Zoom made that I’ve criticized here are security vulnerabilities, but they do demonstrate an early lack of cryptography expertise in their product design.After all, the weirdness is almost entirely contained in section 3 of their white paper, which describes the “Phase I” of their rollout. So what I’ve pointed out here appears to be mostly legacy cruft that wasn’t risky enough to bother changing in their final design.
The rest of their paper is pretty straightforward and pleasant to read. Their design makes sense in general, and each phase includes an “Areas to Improve” section.
All in all, if you’re worried about the security of Zoom’s E2EE feature, the only thing they can really do better is to publish the source code (and link to it from the whitepaper repository for ease-of-discovery) for this feature so independent experts can publicly review it.
However, they seem to be getting a lot of mileage out of the experts on their payroll, so I wouldn’t count on that happening.
https://soatok.blog/2020/10/28/bizarre-design-choices-in-zooms-end-to-end-encryption/
#encryption #endToEndEncryption #zoom
As we look upon the sunset of a remarkably tiresome year, I thought it would be appropriate to talk about cryptographic wear-out.
What is cryptographic wear-out?
It’s the threshold when you’ve used the same key to encrypt so much data that you should probably switch to a new key before you encrypt any more. Otherwise, you might let someone capable of observing all your encrypted data perform interesting attacks that compromise the security of the data you’ve encrypted.
My definitions always aim to be more understandable than pedantically correct.
(Art by Swizz)
The exact value of the threshold varies depending on how exactly you’re encrypting data (n.b. AEAD modes, block ciphers + cipher modes, etc. each have different wear-out thresholds due to their composition).
Let’s take a look at the wear-out limits of the more popular symmetric encryption methods, and calculate those limits ourselves.
Specific Ciphers and Modes
(Art by Khia. Poorly edited by the author.)
Cryptographic Limits for AES-GCM
I’ve written about AES-GCM before (and why I think it sucks).
AES-GCM is a construction that combines AES-CTR with an authenticator called GMAC, whose consumption of nonces looks something like this:
- Calculating H (used in GHASH for all messages encrypted under the same key, regardless of nonce):
Encrypt(00000000 00000000 00000000 00000000)
- Calculating J0 (the pre-counter block):
- If the nonce is 96 bits long:
NNNNNNNN NNNNNNNN NNNNNNNN 00000001
where theN
spaces represent the nonce hexits.
- Otherwise:
s = 128 * ceil(len(nonce)/nonce) - len(nonce)
J0 = GHASH(H, nonce || zero(s+64) || int2bytes(len(nonce))
- If the nonce is 96 bits long:
- Each block of data encrypted uses J0 + block counter (starting at 1) as a CTR nonce.
- J0 is additionally used as the nonce to calculate the final GMAC tag.
AES-GCM is one of the algorithms where it’s easy to separately calculate the safety limits per message (i.e. for a given nonce and key), as well as for all messages under a key.
AES-GCM Single Message Length Limits
In the simplest case (nonce is 96 bits), you end up with the following nonces consumed:
- For each key:
00000000 00000000 00000000 00000000
- For each (nonce, key) pair:
NNNNNNNN NNNNNNNN NNNNNNNN 000000001
for J0NNNNNNNN NNNNNNNN NNNNNNNN 000000002
for encrypting the first 16 bytes of plaintextNNNNNNNN NNNNNNNN NNNNNNNN 000000003
for the next 16 bytes of plaintext…- …
NNNNNNNN NNNNNNNN NNNNNNNN FFFFFFFFF
for the final 16 bytes of plaintext.
From here, it’s pretty easy to see that you can encrypt the blocks from 00000002
to FFFFFFFF
without overflowing and creating a nonce reuse. This means that each (key, nonce) can be used to encrypt a single message up to blocks of the underlying ciphertext.
Since the block size of AES is 16 bytes, this means the maximum length of a single AES-GCM (key, nonce) pair is bytes (or 68,719,476,480 bytes). This is approximately 68 GB or 64 GiB.
Things get a bit tricker to analyze when the nonce is not 96 bits, since it’s hashed.
The disadvantage of this hashing behavior is that it’s possible for two different nonces to produce overlapping ranges of AES-CTR output, which makes the security analysis very difficult.
However, this hashed output is also hidden from network observers since they do not know the value of H. Without some method of reliably detecting when you have an overlapping range of hidden block counters, you can’t exploit this.
(If you want to live dangerously and motivate cryptanalysis research, mix 96-bit and non-96-bit nonces with the same key in a system that does something valuable.)
Multi-Message AES-GCM Key Wear-Out
Now that we’ve established the maximum length for a single message, how many messages you can safely encrypt under a given AES-GCM key depends entirely on how your nonce is selected.
If you have a reliable counter, which is guaranteed to never repeat, and start it at 0 you can theoretically encrypt messages safely. Hooray!
Hooray!
(Art by Swizz)
You probably don’t have a reliable counter, especially in real-world settings (distributed systems, multi-threaded applications, virtual machines that might be snapshotted and restored, etc.).
Confound you, technical limitations!
(Art by Swizz)
Additionally (thanks to 2adic for the expedient correction), you cannot safely encrypt more than blocks with AES because the keystream blocks–as the output of a block cipher–cannot repeat.
Most systems that cannot guarantee unique incrementing nonces simply generate nonces with a cryptographically secure random number generator. This is a good idea, but no matter how high quality your random number generator is, random functions will produce collisions with a discrete probability.
If you have possible values, you should expect a single collision(with 50% probability) after (or )samples. This is called the birthday bound.
However, 50% of a nonce reuse isn’t exactly a comfortable safety threshold for most systems (especially since nonce reuse will cause AES-GCM to become vulnerable to active attackers). 1 in 4 billion is a much more comfortable safety margin against nonce reuse via collisions than 1 in 2. Fortunately, you can calculate the discrete probability of a birthday collision pretty easily.
If you want to rekey after your collision probability exceeds (for a random nonce between 0 and ), you simply need to re-key after messages.
AES-GCM Safety Limits
- Maximum message length: bytes
- Maximum number of messages (random nonce):
- Maximum number of messages (sequential nonce): (but you probably don’t have this luxury in the real world)
- Maximum data safely encrypted under a single key with a random nonce: about bytes
Not bad, but we can do better.
(Art by Khia.)
Cryptographic Limits for ChaCha20-Poly1305
The IETF version of ChaCha20-Poly1305 uses 96-bit nonces and 32-bit internal counters. A similar analysis follows from AES-GCM’s, with a few notable exceptions.
For starters, the one-time Poly1305 key is derived from the first 32 bytes of the ChaCha20 keystream output (block 0) for a given (nonce, key) pair. There is no equivalent to AES-GCM’s H parameter which is static for each key. (The ChaCha20 encryption begins using block 1.)
Additionally, each block for ChaCha20 is 512 bits, unlike AES’s 128 bits. So the message limit here is a little more forgiving.
Since the block size is 512 bits (or 64 bytes), and only one block is consumed for Poly1305 key derivation, we can calculate a message length limit of , or 274,877,906,880 bytes–nearly 256 GiB for each (nonce, key) pair.
The same rules for handling 96-bit nonces applies as with AES-GCM, so we can carry that value forward.
ChaCha20-Poly1305 Safety Limits
- Maximum message length: bytes
- Maximum number of messages (random nonce):
- Maximum number of messages (sequential nonce): (but you probably don’t have this luxury in the real world)
- Maximum data safely encrypted under a single key with a random nonce: about bytes
A significant improvement, but still practically limited.
(Art by Khia.)
Cryptographic Limits for XChaCha20-Poly1305
XChaCha20-Poly1305 is a variant of XSalsa20-Poly1305 (as used in libsodium) and the IETF’s ChaCha20-Poly1305 construction. It features 192-bit nonces and 32-bit internal counters.
XChaCha20-Poly1305 is instantiated by using HChaCha20 of the key over the first 128 bits of the nonce to produce a subkey, which is used with the remaining nonce bits using the aforementioned ChaCha20-Poly1305.
This doesn’t change the maximum message length, but it does change the number of messages you can safely encrypt (since you’re actually using up to distinct keys).
Thus, even if you manage to repeat the final ChaCha20-Poly1305 nonce, as long as the total nonce differs, each encryptions will be performed with a distinct key (thanks to the HChaCha20 key derivation; see the XSalsa20 paper and IETF RFC draft for details).
UPDATE (2021-04-15): It turns out, my read of the libsodium implementation was erroneous due to endian-ness. The maximum message length for XChaCha20-Poly1305 is blocks, and for AEAD_XChaCha20_Poly1305 is blocks. Each block is 64 bytes, so that changes the maximum message length to about . This doesn’t change the extended-nonce details, just the underlying ChaCha usage.
XChaCha20-Poly1305 Safety Limits
- Maximum message length: bytes (earlier version of this document said
) - Maximum number of messages (random nonce):
- Maximum number of messages (sequential nonce): (but you probably don’t have this luxury in the real world)
- Maximum data safely encrypted under a single key with a random nonce: about bytes
I can see encrypt forever, man.
(Art by Khia.)
Cryptographic Limits for AES-CBC
It’s tempting to compare non-AEAD constructions and block cipher modes such as CBC (Cipher Block Chaining), but they’re totally different monsters.
- AEAD ciphers have a clean delineation between message length limit and the message quantity limit
- CBC and other cipher modes do not have this separation
Every time you encrypt a block with AES-CBC, you are depleting from a universal bucket that affects the birthday bound security of encrypting more messages under that key. (And unlike AES-GCM with long nonces, AES-CBC’s IV is public.)
This is in addition to the operational requirements of AES-CBC (plaintext padding, initialization vectors that never repeat and must be unpredictable, separate message authentication since CBC doesn’t provide integrity and is vulnerable to chosen-ciphertext atacks, etc.).
My canned response to most queries about AES-CBC.
(Art by Khia.)
For this reason, most cryptographers don’t even bother calculating the safety limit for AES-CBC in the same breath as discussing AES-GCM. And they’re right to do so!
If you find yourself using AES-CBC (or AES-CTR, for that matter), you’d best be performing a separate HMAC-SHA256 over the ciphertext (and verifying this HMAC with a secure comparison function before decrypting). Additionally, you should consider using an extended nonce construction to split one-time encryption and authentication keys.
(Art by Riley.)
However, for the sake of completeness, let’s figure out what our practical limits are.
CBC operates on entire blocks of plaintext, whether you need the entire block or not.
On encryption, the output of the previous block is mixed (using XOR) with the current block, then encrypted with the block cipher. For the first block, the IV is used in the place of a “previous” block. (Hence, its requirements to be non-repeating and unpredictable.)
This means you can informally model (IV xor PlaintextBlock) and (PBn xor PBn+1) as a pseudo-random function, before it’s encrypted with the block cipher.
If those words don’t mean anything to you, here’s the kicker: You can use the above discussion about birthday bounds to calculate the upper safety bounds for the total number of blocks encrypted under a single AES key (assuming IVs are generated from a secure random source).
If you’re okay with a 50% probability of a collision, you should re-key after blocks have been encrypted.
https://www.youtube.com/watch?v=v0IsYNDMV7A
If your safety margin is closer to the 1 in 4 billion (as with AES-GCM), you want to rekey after blocks.
However, blocks encrypted doesn’t map neatly to bytes encrypted.
If your plaintext is always an even multiple of 128 bits (or 16 bytes), this allows for up to bytes of plaintext. If you’re using PKCS#7 padding, keep in mind that this will include an entire padding block per message, so your safety margin will deplete a bit faster (depending on how many individual messages you encrypt, and therefore how many padding blocks you need).
On the other extreme (1-byte plaintexts), you’ll only be able to eek encrypted bytes before you should re-key.
Therefore, to stay within the safety margin of AES-CBC, you SHOULD re-key after blocks (including padding) have been encrypted.
Keep in mind: single-byte blocks is still approximately 281 TiB of data (including padding). On the upper end, 15-byte blocks (with 1-byte padding to stay within a block) clocks in at about or about 4.22 PiB of data.
That’s Blocks. What About Bytes?
The actual plaintext byte limit sans padding is a bit fuzzy and context-dependent.
The local extrema occurs if your plaintext is always 16 bytes (and thus requires an extra 16 bytes of padding). Any less, and the padding fits within one block. Any more, and the data😛adding ratio starts to dominate.
Therefore, the worst case scenario with padding is that you take the above safety limit for block counts, and cut it in half. Cutting a number in half means reducing the exponent by 1.
But this still doesn’t eliminate the variance. blocks could be anywhere from to bytes of real plaintext. When in situations like this, we have to assume the worst (n.b. take the most conservative value).
Therefore…
AES-CBC Safety Limits
- Maximum data safely encrypted under a single key with a random nonce: bytes (approximately 141 TiB)
Yet another reason to dislike non-AEAD ciphers.
(Art by Khia.)
Take-Away
Compared to AES-CBC, AES-GCM gives you approximately a million times as much usage out of the same key, for the same threat profile.
ChaCha20-Poly1305 and XChaCha20-Poly1305 provides even greater allowances of encrypting data under the same key. The latter is even safe to use to encrypt arbitrarily large volumes of data under a single key without having to worry about ever practically hitting the birthday bound.
I’m aware that this blog post could have simply been a comparison table and a few footnotes (or even an IETF RFC draft), but I thought it would be more fun to explain how these values are derived from the cipher constructions.
(Art by Khia.)
https://soatok.blog/2020/12/24/cryptographic-wear-out-for-symmetric-encryption/
#AES #AESCBC #AESGCM #birthdayAttack #birthdayBound #cryptography #safetyMargin #SecurityGuidance #symmetricCryptography #symmetricEncryption #wearOut
There seems to be a lot of interest among software developers in the various cryptographic building blocks (block ciphers, hash functions, etc.), and more specifically how they stack up against each other.Today, we’re going to look at how some symmetric encryption methods stack up against each other.
If you’re just looking for a short list of cryptographic “right answers”, your cheat sheet can be found on Latacora’s blog.
Comparisons
- AES-GCM vs. ChaCha20-Poly1305
- AES-GCM vs. XChaCha20-Poly1305
- AES-GCM vs. AES-CCM
- AES-GCM vs. AES-GCM-SIV
- AES-GCM vs. AES-SIV
- AES-GCM-SIV vs. AES-SIV
- AES-GCM vs. AES-CBC
- AES-GCM vs. AES-CTR
- AES-CBC vs. AES-CTR
- AES-CBC vs. AES-ECB
- AES vs. Blowfish
- ChaCha vs. Salsa20
- ChaCha vs. RC4
- Cipher Cascades
AES-GCM vs. ChaCha20-Poly1305
- If you have hardware acceleration (e.g. AES-NI), then AES-GCM provides better performance. If you do not, AES-GCM is either slower than ChaCha20-Poly1305, or it leaks your encryption keys in cache timing.
- Neither algorithm is message committing, which makes both unsuitable for algorithms like OPAQUE (explanation).
- AES-GCM can target multiple security levels (128-bit, 192-bit, 256-bit), whereas ChaCha20-Poly1305 is only defined at the 256-bit security level.
- Nonce size:
- AES-GCM: Varies, but standard is 96 bits (12 bytes). If you supply a longer nonce, this gets hashed down to 16 bytes.
- ChaCha20-Poly1305: The standardized version uses 96-bit nonces (12 bytes), but the original used 64-bit nonces (8 bytes).
- Wearout of a single (key, nonce) pair:
- AES-GCM: Messages must be less than 2^32 – 2 blocks (a.k.a. 2^36 – 32 bytes, a.k.a. 2^39 – 256 bits). This also makes the security analysis of AES-GCM with long nonces complicated, since the hashed nonce doesn’t start with the lower 4 bytes set to 00 00 00 02.
- ChaCha20-Poly1305: ChaCha has an internal counter (32 bits in the standardized IETF variant, 64 bits in the original design).
- Neither algorithm is nonce misuse resistant.
Conclusion: Both are good options. AES-GCM can be faster with hardware support, but pure-software implementations of ChaCha20-Poly1305 are almost always fast and constant-time.
AES-GCM vs. XChaCha20-Poly1305
- XChaCha20 accepts 192-bit nonces (24 bytes). The first 16 of the nonce are used with the ChaCha key to derive a subkey, and then the rest of this algorithm is the same as ChaCha20-Poly1305.
- To compare AES-GCM and ChaCha20-Poly1305 for encryption, see above.
- The longer nonce makes XChaCha20-Poly1305 better suited for long-lived keys (i.e. application-layer cryptography) than AES-GCM.
Conclusion: If you’re using the same key for a large number of messages, XChaCha20-Poly1305 has a wider safety margin than AES-GCM. Therefore, XChaCha20-Poly1305 should be preferred in those cases.
AES-GCM vs. AES-CCM
AES-GCM is AES in Galois/Counter Mode, AES-CCM is AES in Counter with CBC-MAC mode.Although I previously stated that AES-GCM is possibly my least favorite AEAD, AES-CCM is decidedly worse: AES-GCM is Encrypt-then-MAC, while AES-CCM is MAC-then-encrypt.
Sure, CCM mode has a security proof that arguably justifies violating the cryptographic doom principle, but I contend the only time it’s worthwhile to do that is when you’re building a nonce-misuse resistant mode (i.e. AES-GCM-SIV).
A lot of cryptography libraries simply don’t even implement AES-CCM; or if they do, it’s disabled by default (i.e. OpenSSL). A notable exception is the Stanford Javascript Cryptography Library, which defaults to AES-CCM + PBKDF2 for encryption.
Conclusion: Just use AES-GCM.
AES-GCM vs. AES-GCM-SIV
AES-GCM-SIV encryption runs at 70% the speed of AES-GCM, but decryption is just as fast. What does this 30% encryption slowdown buy? Nonce misuse resistance.Nonce misuse resistance is really cool. (Art by Swizz)
The algorithms are significantly different:
- AES-GCM is basically AES-CTR, then GMAC (parameterized by the key and nonce) is applied over the AAD and ciphertext. (Encrypt then MAC)
- AES-GCM-SIV derives two distinct keys from the nonce and key, then uses POLYVAL (which is related to GHASH) over the AAD and message with the first key to generate the tag. Then the tag used to derive a series of AES inputs that, when encrypted with the second key, are XORed with the blocks of the message (basically counter mode). (MAC then Encrypt)
AES-GCM is a simpler algorithm to analyze. AES-GCM-SIV provides a greater safety margin. However, like AES-GCM, AES-GCM-SIV is also vulnerable to the Invisible Salamanders attack.
So really, use which ever you want.
Better security comes from AES-GCM-SIV, better encryption performance comes from AES-GCM. What are your priorities?
https://twitter.com/colmmacc/status/986286693572493312
Conclusion: AES-GCM-SIV is better, but both are fine.
AES-GCM vs. AES-SIV
At the risk of being overly reductionist, AES-SIV is basically a nonce misuse resistant variant of AES-CCM:
- Where AES-CCM uses CBC-MAC, AES-SIV uses CMAC, which is based on CBC-MAC but with a doubling step (left shift then XOR with the round constant).
- AES-SIV is MAC then encrypt (so is AES-CCM).
- AES-SIV uses AES-CTR (so does AES-CCM).
If you need nonce misuse resistance, AES-SIV is a tempting choice, but you’re going to get better performance out of AES-GCM.
AES-GCM also has the added advantage of not relying on CBC-MAC.
Conclusion: Prefer AES-GCM in most threat models, AES-SIV in narrower threat models where nonce misuse is the foremost security risk.
AES-GCM-SIV vs. AES-SIV
If you read the previous two sections, the conclusion here should be obvious.
- AES-GCM-SIV is slightly better than AES-GCM.
- AES-GCM is better than AES-SIV.
Conclusion: Use AES-GCM-SIV.
AES-GCM vs. AES-CBC
Just use AES-GCM. No contest.AES-GCM is an authenticated encryption mode. It doesn’t just provide confidentiality by encrypting your message, it also provides integrity (which guarantees that nobody tampered with the encrypted message over the wire).
If you select AES-CBC instead of AES-GCM, you’re opening your systems to a type of attack called a padding oracle (which lets attackers decrypt messages without the key, by replaying altered ciphertexts and studying the behavior of your application).
If you must use AES-CBC, then you must also MAC your ciphertext (and the initialization vector–IV for short). You should also devise some sort of key-separation mechanism so you’re not using the same key for two different algorithms. Even something like this is fine:
- encKey := HmacSha256(“encryption-cbc-hmac”, key)
- macKey := HmacSha256(“authentication-cbc-hmac”, key)
- iv := RandomBytes(16)
- ciphertext := AesCbc(plaintext, iv, encKey)
- tag := HmacSha256(iv + ciphertext, macKey)
For decryption you need a secure compare function. If one is not available to you, or you cannot guarantee it will run in constant time, a second HMAC call with a random per-comparison key will suffice.
There is no possible world in which case unauthenticated AES-CBC is a safer choice than AES-GCM.
AES-CBC + HMAC-SHA256 (encrypt then MAC) is message-committing and therefore can be safely used with algorithms like OPAQUE.
The Signal Protocol uses AES-CBC + HMAC-SHA2 for message encryption.
AES-GCM vs. AES-CTR
Just use AES-GCM. No contest.Unlike AES-GCM, AES-CTR doesn’t provide any message integrity guarantees. However, strictly speaking, AES-GCM uses AES-CTR under the hood.
If you must use AES-CTR, the same rules apply as for AES-CBC:
- encKey := HmacSha256(“encryption-ctr-hmac”, key)
- macKey := HmacSha256(“authentication-ctr-hmac”, key)
- nonce := RandomBytes(16)
- ciphertext := AesCtr(plaintext, nonce, encKey)
- tag := HmacSha256(nonce + ciphertext, macKey)
For decryption you need a secure compare function.
AES-CTR + HMAC-SHA256 (encrypt then MAC) is message-committing and therefore can be safely used with algorithms like OPAQUE.
AES-CBC vs. AES-CTR
If you find yourself trying to decide between CBC mode and CTR mode, you should probably save yourself the headache and just use GCM instead.That being said:
AES-CTR fails harder than AES-CBC when you reuse an IV/nonce.
AES-CBC requires a padding scheme (e.g. PKCS #7 padding) which adds unnecessary algorithmic complexity.
If you have to decide between the two, and you have a robust extended-nonce key-splitting scheme in place, opt for AES-CTR. But really, unless you’re a cryptography engineer well-versed in the nuances and failure modes of these algorithms, you shouldn’t even be making this choice.
AES-CBC vs. AES-ECB
Never use ECB mode. ECB mode lacks semantic security.Block cipher modes that support initialization vectors were invented to compensate for this shortcoming.
Conclusion: If you’re trying to decide between these two, you’ve already lost. Rethink your strategy.
AES vs. Blowfish
A lot of OpenVPN configurations in the wild default to Blowfish for encryption. To the authors of these configuration files, I have but one question:Why?! (Art by Khia)
Sure, you might think, “But Blowfish supports up to 448-bit keys and is therefore more secure than even 256-bit AES.”
Cryptographic security isn’t a dick-measuring contest. Key size isn’t everything. More key isn’t more security.
AES is a block cipher with a 128-bit block size. Blowfish is a block cipher with a 64-bit block size. This means that Blowfish in CBC mode is vulnerable to birthday attacks in a practical setting.
AES has received several orders of magnitude more scrutiny from cryptography experts than Blowfish has.
Conclusion: Use AES instead of Blowfish.
ChaCha vs. Salsa20
Salsa20 is an eSTREAM finalist stream cipher. After years of cryptanalysis, reduced round variants of Salsa20 (specifically, Salsa20/7 with a 128-bit key) were found to be breakable. In response to this, a variant called ChaCha was published that increased the per-round diffusion.That is to say: ChaCha is generally more secure than Salsa20 with similar or slightly better performance. If you have to choose between the two, go for ChaCha.
Conclusion: Your choice (both are good but ChaCha is slightly better).
ChaCha vs. RC4
Don’t use RC4 for anything! What are you doing?My reaction when I read that the CIA was using a modified RC4 in their Assassin malware instead of a secure stream cipher, per the Vault7 leaks. (Art by Khia)
RC4 was a stream cipher–allegedly designed by Ron Rivest and leaked onto a mailing list–that has been thoroughly demolished by cryptanalysis. RC4 is not secure and should never be relied on for security.
Conclusion: Use ChaCha. Never use RC4.
Cipher Cascades
A cipher cascade is when you encrypt a message with one cipher, and then encrypt the ciphertext with another cipher, sometimes multiple times. One example: TripleSec by Keybase, which combines AES and Salsa20 (and, formerly, Twofish–an AES finalist).Cipher cascades don’t meaningfully improve security in realistic threat models. However, if your threat model includes “AES is broken or backdoored by the NSA”, a cipher cascade using AES is safer than just selecting a nonstandard cipher instead of AES. However, they’re necessarily slower than just using AES would be.
If you’re worried about this, your time is better spent worrying about key management, side-channel attacks, and software supply chain attacks.
Conclusion: Avoid cipher cascades, but they’re better than recklessly paranoid alternatives.
Symmetric Encryption Rankings
So with all of the above information, can we rank these algorithms into tiers?Art by Riley
Sort of! Although it’s based on the above analyses, ranking is inherently subjective. So what follows is entirely the author’s opinion of their relative goodness/badness.
S XChaCha20-Poly1305, AES-GCM-SIV A AES-GCM, ChaCha20-Poly1305 B AES-SIV C AES-CTR + HMAC-SHA2, AES-CBC + HMAC-SHA2 D AES-CCM F Any: AES-ECB, RC4, Blowfish
Unauthenticated: AES-CBC, AES-CTR, Salsa20, ChaChaSoatok’s ranking of symmetric encryption methods
https://soatok.blog/2020/07/12/comparison-of-symmetric-encryption-methods/#AEAD #AES #AESGCM #AESGCMSIV #ChaCha20Poly1305 #ciphers #comparison #cryptography #encryption #NMRAEAD #ranking #SecurityGuidance #streamCiphers #symmetricCryptography #symmetricEncryption #XChaCha20Poly1305
There seems to be a lot of interest among software developers in the various cryptographic building blocks (block ciphers, hash functions, etc.), and more specifically how they stack up against each other.
Today, we’re going to look at how some symmetric encryption methods stack up against each other.
If you’re just looking for a short list of cryptographic “right answers”, your cheat sheet can be found on Latacora’s blog.
Comparisons
- AES-GCM vs. ChaCha20-Poly1305
- AES-GCM vs. XChaCha20-Poly1305
- AES-GCM vs. AES-CCM
- AES-GCM vs. AES-GCM-SIV
- AES-GCM vs. AES-SIV
- AES-GCM-SIV vs. AES-SIV
- AES-GCM vs. AES-CBC
- AES-GCM vs. AES-CTR
- AES-CBC vs. AES-CTR
- AES-CBC vs. AES-ECB
- AES vs. Blowfish
- ChaCha vs. Salsa20
- ChaCha vs. RC4
- Cipher Cascades
AES-GCM vs. ChaCha20-Poly1305
- If you have hardware acceleration (e.g. AES-NI), then AES-GCM provides better performance. If you do not, AES-GCM is either slower than ChaCha20-Poly1305, or it leaks your encryption keys in cache timing.
- Neither algorithm is message committing, which makes both unsuitable for algorithms like OPAQUE (explanation).
- AES-GCM can target multiple security levels (128-bit, 192-bit, 256-bit), whereas ChaCha20-Poly1305 is only defined at the 256-bit security level.
- Nonce size:
- AES-GCM: Varies, but standard is 96 bits (12 bytes). If you supply a longer nonce, this gets hashed down to 16 bytes.
- ChaCha20-Poly1305: The standardized version uses 96-bit nonces (12 bytes), but the original used 64-bit nonces (8 bytes).
- Wearout of a single (key, nonce) pair:
- AES-GCM: Messages must be less than 2^32 – 2 blocks (a.k.a. 2^36 – 32 bytes, a.k.a. 2^39 – 256 bits). This also makes the security analysis of AES-GCM with long nonces complicated, since the hashed nonce doesn’t start with the lower 4 bytes set to 00 00 00 02.
- ChaCha20-Poly1305: ChaCha has an internal counter (32 bits in the standardized IETF variant, 64 bits in the original design).
- Neither algorithm is nonce misuse resistant.
Conclusion: Both are good options. AES-GCM can be faster with hardware support, but pure-software implementations of ChaCha20-Poly1305 are almost always fast and constant-time.
AES-GCM vs. XChaCha20-Poly1305
- XChaCha20 accepts 192-bit nonces (24 bytes). The first 16 of the nonce are used with the ChaCha key to derive a subkey, and then the rest of this algorithm is the same as ChaCha20-Poly1305.
- To compare AES-GCM and ChaCha20-Poly1305 for encryption, see above.
- The longer nonce makes XChaCha20-Poly1305 better suited for long-lived keys (i.e. application-layer cryptography) than AES-GCM.
Conclusion: If you’re using the same key for a large number of messages, XChaCha20-Poly1305 has a wider safety margin than AES-GCM. Therefore, XChaCha20-Poly1305 should be preferred in those cases.
AES-GCM vs. AES-CCM
AES-GCM is AES in Galois/Counter Mode, AES-CCM is AES in Counter with CBC-MAC mode.
Although I previously stated that AES-GCM is possibly my least favorite AEAD, AES-CCM is decidedly worse: AES-GCM is Encrypt-then-MAC, while AES-CCM is MAC-then-encrypt.
Sure, CCM mode has a security proof that arguably justifies violating the cryptographic doom principle, but I contend the only time it’s worthwhile to do that is when you’re building a nonce-misuse resistant mode (i.e. AES-GCM-SIV).
A lot of cryptography libraries simply don’t even implement AES-CCM; or if they do, it’s disabled by default (i.e. OpenSSL). A notable exception is the Stanford Javascript Cryptography Library, which defaults to AES-CCM + PBKDF2 for encryption.
Conclusion: Just use AES-GCM.
AES-GCM vs. AES-GCM-SIV
AES-GCM-SIV encryption runs at 70% the speed of AES-GCM, but decryption is just as fast. What does this 30% encryption slowdown buy? Nonce misuse resistance.
Nonce misuse resistance is really cool. (Art by Swizz)
The algorithms are significantly different:
- AES-GCM is basically AES-CTR, then GMAC (parameterized by the key and nonce) is applied over the AAD and ciphertext. (Encrypt then MAC)
- AES-GCM-SIV derives two distinct keys from the nonce and key, then uses POLYVAL (which is related to GHASH) over the AAD and message with the first key to generate the tag. Then the tag used to derive a series of AES inputs that, when encrypted with the second key, are XORed with the blocks of the message (basically counter mode). (MAC then Encrypt)
AES-GCM is a simpler algorithm to analyze. AES-GCM-SIV provides a greater safety margin. However, like AES-GCM, AES-GCM-SIV is also vulnerable to the Invisible Salamanders attack.
So really, use which ever you want.
Better security comes from AES-GCM-SIV, better encryption performance comes from AES-GCM. What are your priorities?
https://twitter.com/colmmacc/status/986286693572493312
Conclusion: AES-GCM-SIV is better, but both are fine.
AES-GCM vs. AES-SIV
At the risk of being overly reductionist, AES-SIV is basically a nonce misuse resistant variant of AES-CCM:
- Where AES-CCM uses CBC-MAC, AES-SIV uses CMAC, which is based on CBC-MAC but with a doubling step (left shift then XOR with the round constant).
- AES-SIV is MAC then encrypt (so is AES-CCM).
- AES-SIV uses AES-CTR (so does AES-CCM).
If you need nonce misuse resistance, AES-SIV is a tempting choice, but you’re going to get better performance out of AES-GCM.
AES-GCM also has the added advantage of not relying on CBC-MAC.
Conclusion: Prefer AES-GCM in most threat models, AES-SIV in narrower threat models where nonce misuse is the foremost security risk.
AES-GCM-SIV vs. AES-SIV
If you read the previous two sections, the conclusion here should be obvious.
- AES-GCM-SIV is slightly better than AES-GCM.
- AES-GCM is better than AES-SIV.
Conclusion: Use AES-GCM-SIV.
AES-GCM vs. AES-CBC
Just use AES-GCM. No contest.
AES-GCM is an authenticated encryption mode. It doesn’t just provide confidentiality by encrypting your message, it also provides integrity (which guarantees that nobody tampered with the encrypted message over the wire).
If you select AES-CBC instead of AES-GCM, you’re opening your systems to a type of attack called a padding oracle (which lets attackers decrypt messages without the key, by replaying altered ciphertexts and studying the behavior of your application).
If you must use AES-CBC, then you must also MAC your ciphertext (and the initialization vector–IV for short). You should also devise some sort of key-separation mechanism so you’re not using the same key for two different algorithms. Even something like this is fine:
- encKey := HmacSha256(“encryption-cbc-hmac”, key)
- macKey := HmacSha256(“authentication-cbc-hmac”, key)
- iv := RandomBytes(16)
- ciphertext := AesCbc(plaintext, iv, encKey)
- tag := HmacSha256(iv + ciphertext, macKey)
For decryption you need a secure compare function. If one is not available to you, or you cannot guarantee it will run in constant time, a second HMAC call with a random per-comparison key will suffice.
There is no possible world in which case unauthenticated AES-CBC is a safer choice than AES-GCM.
AES-CBC + HMAC-SHA256 (encrypt then MAC) is message-committing and therefore can be safely used with algorithms like OPAQUE.
The Signal Protocol uses AES-CBC + HMAC-SHA2 for message encryption.
AES-GCM vs. AES-CTR
Just use AES-GCM. No contest.
Unlike AES-GCM, AES-CTR doesn’t provide any message integrity guarantees. However, strictly speaking, AES-GCM uses AES-CTR under the hood.
If you must use AES-CTR, the same rules apply as for AES-CBC:
- encKey := HmacSha256(“encryption-ctr-hmac”, key)
- macKey := HmacSha256(“authentication-ctr-hmac”, key)
- nonce := RandomBytes(16)
- ciphertext := AesCtr(plaintext, nonce, encKey)
- tag := HmacSha256(nonce + ciphertext, macKey)
For decryption you need a secure compare function.
AES-CTR + HMAC-SHA256 (encrypt then MAC) is message-committing and therefore can be safely used with algorithms like OPAQUE.
AES-CBC vs. AES-CTR
If you find yourself trying to decide between CBC mode and CTR mode, you should probably save yourself the headache and just use GCM instead.
That being said:
AES-CTR fails harder than AES-CBC when you reuse an IV/nonce.
AES-CBC requires a padding scheme (e.g. PKCS #7 padding) which adds unnecessary algorithmic complexity.
If you have to decide between the two, and you have a robust extended-nonce key-splitting scheme in place, opt for AES-CTR. But really, unless you’re a cryptography engineer well-versed in the nuances and failure modes of these algorithms, you shouldn’t even be making this choice.
AES-CBC vs. AES-ECB
Never use ECB mode. ECB mode lacks semantic security.
Block cipher modes that support initialization vectors were invented to compensate for this shortcoming.
Conclusion: If you’re trying to decide between these two, you’ve already lost. Rethink your strategy.
AES vs. Blowfish
A lot of OpenVPN configurations in the wild default to Blowfish for encryption. To the authors of these configuration files, I have but one question:
Why?! (Art by Khia)
Sure, you might think, “But Blowfish supports up to 448-bit keys and is therefore more secure than even 256-bit AES.”
Cryptographic security isn’t a dick-measuring contest. Key size isn’t everything. More key isn’t more security.
AES is a block cipher with a 128-bit block size. Blowfish is a block cipher with a 64-bit block size. This means that Blowfish in CBC mode is vulnerable to birthday attacks in a practical setting.
AES has received several orders of magnitude more scrutiny from cryptography experts than Blowfish has.
Conclusion: Use AES instead of Blowfish.
ChaCha vs. Salsa20
Salsa20 is an eSTREAM finalist stream cipher. After years of cryptanalysis, reduced round variants of Salsa20 (specifically, Salsa20/7 with a 128-bit key) were found to be breakable. In response to this, a variant called ChaCha was published that increased the per-round diffusion.
That is to say: ChaCha is generally more secure than Salsa20 with similar or slightly better performance. If you have to choose between the two, go for ChaCha.
Conclusion: Your choice (both are good but ChaCha is slightly better).
ChaCha vs. RC4
Don’t use RC4 for anything! What are you doing?
My reaction when I read that the CIA was using a modified RC4 in their Assassin malware instead of a secure stream cipher, per the Vault7 leaks. (Art by Khia)
RC4 was a stream cipher–allegedly designed by Ron Rivest and leaked onto a mailing list–that has been thoroughly demolished by cryptanalysis. RC4 is not secure and should never be relied on for security.
Conclusion: Use ChaCha. Never use RC4.
Cipher Cascades
A cipher cascade is when you encrypt a message with one cipher, and then encrypt the ciphertext with another cipher, sometimes multiple times. One example: TripleSec by Keybase, which combines AES and Salsa20 (and, formerly, Twofish–an AES finalist).
Cipher cascades don’t meaningfully improve security in realistic threat models. However, if your threat model includes “AES is broken or backdoored by the NSA”, a cipher cascade using AES is safer than just selecting a nonstandard cipher instead of AES. However, they’re necessarily slower than just using AES would be.
If you’re worried about this, your time is better spent worrying about key management, side-channel attacks, and software supply chain attacks.
Conclusion: Avoid cipher cascades, but they’re better than recklessly paranoid alternatives.
Symmetric Encryption Rankings
So with all of the above information, can we rank these algorithms into tiers?
Art by Riley
Sort of! Although it’s based on the above analyses, ranking is inherently subjective. So what follows is entirely the author’s opinion of their relative goodness/badness.
S | XChaCha20-Poly1305, AES-GCM-SIV |
A | AES-GCM, ChaCha20-Poly1305 |
B | AES-SIV |
C | AES-CTR + HMAC-SHA2, AES-CBC + HMAC-SHA2 |
D | AES-CCM |
F | Any: AES-ECB, RC4, Blowfish Unauthenticated: AES-CBC, AES-CTR, Salsa20, ChaCha |
Soatok’s ranking of symmetric encryption methods
https://soatok.blog/2020/07/12/comparison-of-symmetric-encryption-methods/
#AEAD #AES #AESGCM #AESGCMSIV #ChaCha20Poly1305 #ciphers #comparison #cryptography #encryption #NMRAEAD #ranking #SecurityGuidance #streamCiphers #symmetricCryptography #symmetricEncryption #XChaCha20Poly1305
Authenticated Key Exchanges are an interesting and important building block in any protocol that aims to allow people to communicate privately over an untrusted medium (i.e. the Internet).What’s an AKE?
At their core, Authenticated Key Exchanges (AKEs for short) combine two different classes of protocol.
- An authentication mechanism, such as a MAC or a digital signature.
- Key encapsulation, usually through some sort of Diffie-Hellman.
A simple example of an AKE is the modern TLS handshake, which uses digital signatures (X.509 certificates signed by certificate authorities) to sign ephemeral Elliptic Curve Diffie-Hellman (ECDH) public keys, which is then used to derive a shared secret to encrypt and authenticate network traffic.
I guess I should say “simple” with scare quotes. Cryptography is very much a “devil’s in the details” field, because my above explanation didn’t even encapsulate mutual-auth TLS or the underlying machinery of protocol negotiation. (Or the fact that non-forward-secret ciphersuites can be selected.)
AKEs get much more complicated, the more sophisticated your threat model becomes.
For example: Signal’s X3DH and Double Ratchet protocols are components of a very sophisticated AKE. Learn more about them here.
The IETF is working to standardize their own approach, called Messaging Layer Security (MLS), which uses a binary tree of ECDH handshakes to manage state and optimize group operations (called TreeKEM). You can learn more about IETF MLS here.
Password AKEs
Recently, a collection of cryptographers at the IETF’s Crypto Research Forum Group (CFRG) decided to hammer on a series of proposed Password-Authenticated Key Exchange (PAKE) protocols.PAKEs come in two flavors: Balanced (mutually authenticated) and augmented (one side is a prover, the other is a verifier). Balanced PAKEs are good for encrypted tunnels where you control both endpoints (e.g. WiFi networks), whereas Augmented PAKEs are great for eliminating the risk of password theft in client-server applications, if the server gets hacked.
Ultimately, the CFRG settled on one balanced PAKE (CPace) and one augmented PAKE (OPAQUE).
Consequently, cryptographer Filippo Valsorda managed to implement CPace in 125 lines of Go, using Ristretto255.
I implemented the CPace PAKE yesterday with Go and ristretto255, and it felt like cheating.125 lines of code! Really happy with it and it was a lot of fun.
— Filippo Valsorda (@FiloSottile) March 29, 2020
Why So Complicated?
At the end of the day, an AKE is just a construction that combines key encapsulation with an authentication mechanism.But how you combine these components together can vary wildly!
Some AKE designs (i.e. Dragonfly, in WPA3) are weaker than others; even if only in the sense of being difficult to implement in constant-time.
The reason there’s so many is that cryptographers tend to collectively decide which algorithms to recommend for standardization.
(n.b. There are a lot more block ciphers than DES, Blowfish, and AES to choose from! But ask a non-cryptographer to name five block ciphers and they’ll probably struggle.)
https://soatok.blog/2020/04/21/authenticated-key-exchanges/
#ake #authenticatedKeyExchange #cryptography #ECDH