Inside NIST’s PQC: Kyber, Dilithium, and SPHINCS+
Introduction
The race to develop post-quantum cryptography (PQC) has produced new algorithms designed to withstand attacks by quantum computers. In 2022, after a multi-year evaluation, NIST selected CRYSTALS-Kyber, CRYSTALS-Dilithium, and SPHINCS+ (together with FALCON) as the first algorithms to be standardized for public-key encryption (key encapsulation) and digital signatures. Kyber is an encryption/key-establishment scheme (a Key Encapsulation Mechanism, KEM) based on lattice problems, while Dilithium (also lattice-based) and SPHINCS+ (hash-based) are digital signature schemes.
This article provides a technical deep dive into how these algorithms work, analyzes their mathematical foundations, and compares them with classical schemes (RSA, ECC) and other PQC candidates (NTRU, BIKE, Rainbow, etc.). We also discuss implementation considerations (performance benchmarks, protocol integration, adoption challenges) and examine their security assumptions and known attack vectors (resistance to quantum/classical attacks, potential weaknesses).
CRYSTALS-Kyber: Lattice-Based Key Encapsulation Mechanism
Kyber is an IND-CCA2 secure KEM whose security relies on the hardness of the Learning-with-Errors (LWE) problem over module lattices. In simple terms, LWE asks one to solve noisy linear equations in a high-dimensional vector space, which is believed to be intractable even for quantum adversaries. Kyber uses a structured form of LWE (module-LWE) for efficiency: operations are done in a polynomial ring of dimension $n=256$ with coefficients mod $q=3329$, and small random “noise” is added to hide secrets. Three parameter sets (Kyber-512, 768, 1024) offer security roughly equivalent to AES-128, AES-192, and AES-256, respectively. The highest security level uses larger module rank (k=4) to raise the difficulty of the lattice problem.
Key Generation
Kyber’s public key consists of a random matrix $A$ (of size $k\times k$, with entries as polynomials in $R_q = \mathbb{Z}_q[x]/(x^n+1)$) and a vector $t = A \cdot s_1 + e_1$ where $s_1, e_1$ are secret small-norm polynomial vectors. Here $s_1$ is drawn from a noise distribution (e.g. coefficients in $\{-\eta,\dots,\eta\}$) and $e_1$ is an “error” vector that hides the exact product. The secret key consists of $s_1$, together with a copy of the public key (or its hash) and a secret random value used for implicit rejection during decapsulation. The smallness of $s_1,e_1$ (e.g. coefficients sampled from a centered binomial distribution) is crucial for correctness and security. Kyber sets these parameters so that decryption errors occur with negligible probability (e.g. ~$2^{-164}$), ensuring reliability and thwarting any attack that relies on observing decryption failures.
Encapsulation (Encryption)
To encapsulate a symmetric key, the sender generates a random ephemeral secret $r$ (a small-norm polynomial vector) and computes the ciphertext as two parts: $c_1 = A^T \cdot r + e_2$ and $c_2 = t^T \cdot r + e_3 + \text{Enc}(M)$. Here $M$ is a fixed-size message (e.g. 32 bytes) that encodes the randomly chosen key material, $e_2$ is another small error vector, and $e_3$ is a small error polynomial. Intuitively, $(c_1, c_2)$ is an LWE encryption of message $M$ under public key $(A,t)$: $c_1$ hides $r$ and $c_2$ carries the message masked by the shared secret. In Kyber, $M$ is not supplied by the user but is internally derived to achieve CCA security (explained below). The polynomial arithmetic (matrix-vector multiplication and additions) is efficiently implemented using the Number Theoretic Transform (NTT) to perform polynomial multiplication in $O(n \log n)$ time. This yields high speed: e.g. Kyber-768 encapsulation can be done in tens of thousands of CPU cycles or less with optimized code, outperforming classical RSA in many cases.
Decapsulation (Decryption)
The recipient uses their secret $s_1$ to recover the message. They compute $M' = c_2 - s_1^T \cdot c_1$ (which equals the transmitted $M$ with only small error from $e_2,e_3$) and then apply a deterministic “decode” to recover the exact $M$. Given the design of parameters, the subtraction $v - s^T u$ in Kyber (referencing $v=c_2$, $u=c_1$, $s=s_1$) cancels out the $t^T \cdot r$ term and leaves $M$ (encoded as a polynomial) plus only the difference of a few small errors. The error is so small relative to the modulus that the correct high-order bits of the polynomial coefficients can be decoded to obtain the original message $M$. Once $M$ (the encapsulated key material) is recovered, both parties can hash it (via a KDF) into the final symmetric key $K$. If decapsulation fails (e.g. due to an unexpected decryption error or an invalid ciphertext), Kyber returns a random-looking key derived from a secret value, to avoid leaking information to attackers.
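To make the arithmetic above concrete, here is a minimal plain-LWE sketch in Python. It is not Kyber itself: there is no polynomial ring or NTT, no ciphertext compression, no FO transform, and the toy dimension offers no security whatsoever. Every function name and parameter below is illustrative; only the modulus is borrowed from Kyber.

```python
# Toy *plain* LWE encryption of a single bit -- a didactic sketch only. It is NOT Kyber:
# there is no polynomial ring or NTT, no ciphertext compression and no FO transform, and
# the tiny dimension gives no security. Only the modulus q is borrowed from Kyber.
import random

q = 3329          # Kyber's modulus, reused for flavour
n = 8             # toy dimension (Kyber instead uses rank-k vectors of 256-coefficient polynomials)

def small_noise():
    return random.randint(-2, 2)          # stand-in for Kyber's centered binomial noise

def keygen():
    A = [[random.randrange(q) for _ in range(n)] for _ in range(n)]
    s = [small_noise() for _ in range(n)]
    e = [small_noise() for _ in range(n)]
    t = [(sum(A[i][j] * s[j] for j in range(n)) + e[i]) % q for i in range(n)]   # t = A s + e
    return (A, t), s                      # public key, secret key

def encrypt(pk, bit):
    A, t = pk
    r  = [small_noise() for _ in range(n)]
    e2 = [small_noise() for _ in range(n)]
    e3 = small_noise()
    c1 = [(sum(A[j][i] * r[j] for j in range(n)) + e2[i]) % q for i in range(n)]  # A^T r + e2
    c2 = (sum(t[j] * r[j] for j in range(n)) + e3 + bit * (q // 2)) % q           # t^T r + e3 + Enc(bit)
    return c1, c2

def decrypt(s, ct):
    c1, c2 = ct
    v = (c2 - sum(s[j] * c1[j] for j in range(n))) % q     # = bit*(q/2) + small error
    return 1 if q // 4 < v < 3 * q // 4 else 0             # decode the high-order bit

pk, sk = keygen()
for b in (0, 1):
    assert decrypt(sk, encrypt(pk, b)) == b
print("toy LWE round-trip OK")
```

The error terms cancel down to a value far smaller than $q/4$, which is exactly why the high-order-bit decoding in `decrypt` recovers the encoded bit reliably.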
Fujisaki-Okamoto (FO) Transform for CCA Security
Kyber’s basic encryption scheme as described is IND-CPA secure. To achieve IND-CCA2 security (resistance to adaptive chosen-ciphertext attacks), Kyber uses a Fujisaki–Okamoto transform with a twist. In the encapsulation process, the sender actually generates a random message $M$ and computes a pair $(K, r) = G(M || \text{pk})$ by hashing $M$ and the recipient’s public key with a function $G$. The value $r$ is used as the randomness for the encryption step described above (instead of choosing $r$ directly), and $K$ is the preliminary shared secret. The ciphertext $(c_1,c_2)$ is then formed as an encryption of $M$ using $r$. On the receiver side, after using $s_1$ to recover $M'$, the decapsulator recomputes $(K', r') = G(M' || \text{pk})$ and re-encrypts $M'$ to get $(c_1',c_2')$. If the result matches the received $(c_1,c_2)$, the message was valid and $K'=K$ is used as the shared key; if not, a failure is detected and a backup key (derived from a secret value and the ciphertext) is output instead. This FO transform thwarts chosen-ciphertext attacks by making any modification to the ciphertext lead to a completely random output key (so attackers gain no advantage from querying the decryption oracle with tampered ciphertexts). As a result, Kyber.CCAKEM is a full IND-CCA2 KEM, with a security reduction to the hardness of module-LWE (plus the assumptions on the random oracles used in FO).
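The wrapper logic itself is compact. The sketch below captures the “derandomize, re-encrypt and compare” pattern with implicit rejection; the `cpa_encrypt`/`cpa_decrypt` callables are placeholders for an IND-CPA PKE (e.g. the toy scheme above, extended to take explicit coins), and the hash choices and byte layouts are assumptions for illustration, not Kyber’s exact ones.

```python
# Sketch of the FO-style "derandomize, re-encrypt and compare" wrapper that turns a
# CPA-secure PKE into a CCA-secure KEM, in the spirit of Kyber.CCAKEM. The underlying
# PKE is supplied by the caller as cpa_encrypt(pk, m, coins) / cpa_decrypt(sk, ct);
# hash functions and byte layouts below are illustrative, not Kyber's exact choices.
import hashlib, os

def encaps(pk, cpa_encrypt):
    m = os.urandom(32)                                    # random message; never user-chosen
    g = hashlib.sha3_512(m + pk).digest()
    K_bar, coins = g[:32], g[32:]                         # (K, r) = G(m || pk)
    ct = cpa_encrypt(pk, m, coins)                        # deterministic given the coins
    K = hashlib.shake_256(K_bar + ct).digest(32)          # final shared key
    return ct, K

def decaps(sk, pk, z, ct, cpa_encrypt, cpa_decrypt):
    m2 = cpa_decrypt(sk, ct)                              # tentative message
    g = hashlib.sha3_512(m2 + pk).digest()
    K_bar, coins = g[:32], g[32:]
    if cpa_encrypt(pk, m2, coins) == ct:                  # re-encryption check
        return hashlib.shake_256(K_bar + ct).digest(32)   # accept: same K as the sender
    return hashlib.shake_256(z + ct).digest(32)           # implicit rejection with secret z
```

Any tampering with the ciphertext makes the recomputed ciphertext differ from what was received, so decapsulation silently outputs a key derived from the secret value `z` instead, which looks random to the attacker.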
Kyber Performance and Parameters
Thanks to its reliance on efficient polynomial arithmetic and small integer operations, Kyber is very fast and has relatively small key sizes. A Kyber-768 public key is about 1,184 bytes and the ciphertext about 1,088 bytes. For comparison, RSA-3072 public keys are ~384 bytes (the modulus) and ciphertexts/signatures are also 384 bytes, so Kyber’s objects are roughly three times larger but still very practical. In exchange, Kyber’s computation is much lighter than RSA’s large-integer exponentiations – it mainly uses vectorized polynomial multiplication (NTT) and hashing. In fact, Cloudflare observed that Kyber requires less computation than classical X25519 (ECC) key agreement, albeit with larger message sizes. When used in a TLS 1.3 handshake (in a hybrid X25519+Kyber mode), the latency impact is minimal – purely post-quantum Kyber key exchange could even be faster than classical key exchange on its own. NIST highlighted Kyber’s “comparatively small encryption keys” and high speed among lattice algorithms. Secret keys can also be stored as just a 32-byte seed (plus the public key), since all other components can be deterministically regenerated, keeping long-term key storage minimal. Overall, Kyber offers a balanced trade-off: strong quantum-resistant security, efficient implementation, and moderate sizes.
CRYSTALS-Dilithium: Lattice-Based Digital Signatures
Dilithium is a digital signature scheme built on the hardness of finding short vectors in lattices – specifically the Module-SIS (Short Integer Solution) and Module-LWE problems. It uses a Fiat–Shamir with aborts paradigm, originally proposed by Lyubashevsky, to construct a lattice-based variant of the Schnorr signature idea. In Dilithium, there is no trapdoor as in RSA or traditional lattice trapdoor schemes; instead, the signer shows knowledge of a secret lattice vector by producing a signature that passes certain norm checks without revealing the secret. Dilithium’s design goals included ease of secure implementation (no reliance on Gaussian sampling), conservative parameters, and minimizing signature+key size. Indeed, among lattice signatures, Dilithium achieves one of the smallest combined public key and signature sizes for a given security level.
Key Generation
The public key includes a matrix $A \in R_q^{k \times \ell}$ (with $R_q = \mathbb{Z}_q[X]/(X^n+1)$, $n=256$, and $q=2^{23}-2^{13}+1$, a prime) and a vector $t = A \cdot s_1 + s_2$ mod $q$. Here, $s_1 \in R_q^{\ell}$ and $s_2 \in R_q^k$ are the secret key components, sampled from a uniform small range (e.g. each coefficient in $[-\eta, \eta]$). In essence, $t$ is a “public commitment” to the secret $s_1$ (with $s_2$ as masking noise). The secret key is $(s_1, s_2)$ and the public key is $(A, t)$, plus a random seed that was used to pseudorandomly generate $A$. The structure is analogous to a Module-SIS instance: given $(A, t)$ it should be hard to find a short pair $(x_1, x_2)$ such that $A x_1 + x_2 = t$ (which would break the signature’s one-wayness).
Signature Generation
To sign a message $\mu$, the signer first hashes it together with a hash of the public key to form a message representative (this binds the signature to the specific key pair). Then:
1. Sampling a Mask: The signer selects a random “masking” vector $y \in R_q^{\ell}$, with each coefficient sampled uniformly from a range $[-\gamma_1, \gamma_1)$ for some $\gamma_1$. This $y$ is like a one-time randomizer to hide the secret.
2. Computing the Commitment: The signer computes $w = A \cdot y$ (a vector in $R_q^k$) and splits each coefficient of $w$ into high and low parts: $w = w_1 \cdot 2\gamma_2 + w_0$, where each coefficient of $w_0$ is the centered remainder modulo $2\gamma_2$ (so it is bounded by $\gamma_2$ in absolute value) and $w_1$ holds the high-order part. The vector $w_1$ (which is essentially $w$ with each coefficient rounded to the nearest multiple of $2\gamma_2$) is then used to derive the challenge.
3. Hash to Challenge: A cryptographic hash (SHAKE-256 in the reference design) is applied to the concatenation of the message $\mu$ (or its hash) and $w_1$ to produce a pseudorandom challenge $c$. The challenge $c$ is a polynomial with a restricted form: it has exactly $\tau$ coefficients equal to $±1$ and the rest $0$ (for instance, $\tau=39$ or $49$ for Dilithium’s security levels). This structure gives $c$ a limited norm (small Hamming weight) and sufficient entropy (the $±1$ positions act as the random challenge).
4. Respond with Signature Components: The signer computes $z = y + c \cdot s_1$. Here $s_1$ is the secret short vector from the key, and $c \cdot s_1$ means multiplying the polynomial $c$ by each polynomial in $s_1$ (since $c$ itself is a single polynomial in $R_q$). The vector $z$ is intended to be the witness proving knowledge of $s_1$. Along with $z$, the signer also provides a small hint to help the verifier recompute the correct $w_1$: because the public key is stored in compressed form (only the high-order bits of $t$ are published), the verifier’s recomputation of $A z - c t$ is slightly off, and the hint is a set of bits (denoted $h$) indicating which coefficients need their rounding adjusted so that the original $w_1$ is recovered without ambiguity. The hint is encoded compactly, adding only a small number of bytes to the signature.
5. Rejection Sampling: After computing $z$, the signer checks two conditions: (a) All coefficients of $z$ must be small, i.e. $|z_i| < \gamma_1 - \beta$ for some bound $\beta$. This ensures that $z$ does not leak too much about $s_1$ (if $c$ was large or $z$ coefficients wrapped around mod $q$, it could reveal $s_1$ bits). (b) The low bits of $A z - c t$ (which should equal $w_0$ if everything is consistent) must also be small, specifically bounded by $\gamma_2 - \beta$. If either condition is violated, the signature is rejected before output and the process restarts with a new random $y$. This is the “with aborts” aspect – with appropriately chosen parameters the expected number of trials is small (around 4), so signing is still efficient. These rejections ensure that the distribution of $z$ (and $w_0$) that the attacker sees is independent of the secret key, which is crucial for zero-knowledge properties.
If the algorithm does not abort, the output signature is $(c, z)$ together with the hint $h$. In Dilithium’s final specification, $c$ is included as a 32-byte seed (which expands deterministically to the actual polynomial with $\pm1$ coefficients). For example, in the Dilithium2 parameter set, $z$ consists of $\ell=4$ polynomials mod $q$ and $c$ has $\tau=39$ non-zero $\pm1$ coefficients; the signature is about 2420 bytes.
Verification
To verify, the public key $(A, t)$ and message $\mu$ are used to recompute the challenge: the verifier computes $w' = A \cdot z - c \cdot t$ and splits $w'$ into high and low bits (using the same $\gamma_2$). Using the hint, the verifier obtains the original $w_1$ and checks that the hash of $(\mu, w_1)$ produces the challenge $c$ given in the signature. The verifier also checks that $z$’s coefficients are within the required bounds (ensuring the signer did not cheat the rejection sampling). If all checks pass, the signature is valid. The correctness stems from: $A z - c t = A(y + c s_1) - c(A s_1 + s_2) = A y - c s_2$. Since $A y = w_1 \cdot 2\gamma_2 + w_0$ and $c s_2$ is small (because $c$ has few $\pm1$ coefficients and $s_2$ is small), the high part of $A z - c t$ equals $w_1$ whenever the signer’s rejection checks passed. The design ensures a valid signature implies the existence of short $(s_1, s_2)$ that the signer must know, under the hardness of the SIS/LWE problems. A toy sketch of this sign/verify loop follows.
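The sketch below mirrors the Fiat–Shamir-with-aborts flow over plain integer vectors mod $q$. It is not Dilithium: there is no polynomial ring, no public-key compression or hint, and the scalar challenge has nowhere near enough entropy to be secure. The constants merely echo Dilithium-style values, and the fixed RNG seed is only for a repeatable demo.

```python
# Toy Fiat-Shamir-with-aborts signature over plain integer vectors mod q.
# This is NOT Dilithium: no polynomial ring R_q, no public-key compression or hint, and
# the scalar challenge below has far too little entropy to be secure. The constants
# q, GAMMA1, GAMMA2, ETA, TAU merely echo Dilithium2-style values for flavour.
import hashlib, random

q      = 8380417                 # Dilithium's modulus
K      = 4                       # toy dimension: A is a K x K matrix of scalars
GAMMA1 = 1 << 17                 # masking range for y
GAMMA2 = (q - 1) // 88           # low-bits range
ALPHA  = 2 * GAMMA2
TAU    = 39                      # bound on the challenge magnitude (stand-in for weight tau)
ETA    = 2                       # secret-key coefficient bound
BETA   = TAU * ETA

def decompose(r):
    """r mod q = r1*ALPHA + r0 with -GAMMA2 < r0 <= GAMMA2 (corner case as in the spec)."""
    r = r % q
    r0 = r % ALPHA
    if r0 > ALPHA // 2:
        r0 -= ALPHA
    if r - r0 == q - 1:
        return 0, r0 - 1
    return (r - r0) // ALPHA, r0

def challenge(msg, w1):
    h = hashlib.shake_256(msg + str(w1).encode()).digest(8)
    return int.from_bytes(h, "little") % (2 * TAU + 1) - TAU    # small scalar "challenge"

def keygen(rng):
    A  = [[rng.randrange(q) for _ in range(K)] for _ in range(K)]
    s1 = [rng.randint(-ETA, ETA) for _ in range(K)]
    s2 = [rng.randint(-ETA, ETA) for _ in range(K)]
    t  = [(sum(A[i][j] * s1[j] for j in range(K)) + s2[i]) % q for i in range(K)]
    return (A, t), (s1, s2)

def sign(sk, pk, msg, rng):
    A, _ = pk
    s1, s2 = sk
    while True:                                                 # the "with aborts" loop
        y  = [rng.randint(-GAMMA1 + 1, GAMMA1) for _ in range(K)]
        w  = [sum(A[i][j] * y[j] for j in range(K)) % q for i in range(K)]
        w1 = [decompose(wi)[0] for wi in w]
        c  = challenge(msg, w1)
        z  = [y[i] + c * s1[i] for i in range(K)]
        low = [decompose((w[i] - c * s2[i]) % q)[1] for i in range(K)]
        if max(abs(v) for v in z) >= GAMMA1 - BETA:             # would leak s1: retry
            continue
        if max(abs(v) for v in low) >= GAMMA2 - BETA:           # high bits would shift: retry
            continue
        return c, z

def verify(pk, msg, sig):
    A, t = pk
    c, z = sig
    if max(abs(v) for v in z) >= GAMMA1 - BETA:
        return False
    w1 = [decompose((sum(A[i][j] * z[j] for j in range(K)) - c * t[i]) % q)[0]
          for i in range(K)]                                    # HighBits(A z - c t) == HighBits(A y)
    return challenge(msg, w1) == c

rng = random.Random(0)                                          # fixed seed only for a repeatable demo
pk, sk = keygen(rng)
sig = sign(sk, pk, b"hello pqc", rng)
assert verify(pk, b"hello pqc", sig)
print("toy Dilithium-style sign/verify OK")
```

The two rejection checks are what make verification deterministic here: they guarantee that the high bits of $Az - ct$ match the high bits the signer hashed, so the recomputed challenge agrees.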
Dilithium Parameters and Efficiency
Dilithium offers different security levels: e.g., Dilithium2, 3, 5 corresponding to NIST security categories 2, 3, 5. The public key sizes range from 1312 bytes (Dilithium2) to 2592 bytes (Dilithium5), and signature sizes from 2420 bytes up to 4595 bytes. These are significantly larger than RSA or ECC keys/signatures, but they are among the smallest in post-quantum context for comparable security. Importantly, Dilithium uses only uniform sampling (no Gaussian sampling) for simplicity and side-channel safety, and all operations (matrix multiplies, rounding) can be made constant-time. The heavy mathematical operation is the multiplication of polynomials in $R_q$; like Kyber, Dilithium leverages NTT-based polynomial multiplication for speed. This makes Dilithium quite fast: on a modern CPU, signing takes on the order of a few million cycles and verification even less, which is in the same ballpark as RSA-2048 signing/verifying (though Dilithium’s operations are very different, involving hashing and vector math rather than big integer exponentiations). NIST notes that Dilithium has “strong security and excellent overall performance,” making it the primary recommended PQC signature. In cases where an application needs even smaller signature sizes, NIST also standardized FALCON (another lattice signature with ~666-byte signatures using NTRU lattices), but FALCON involves more complex math (floating-point Gaussian sampling) and is considered a secondary choice. Dilithium’s design prioritizes simplicity and robustness at the cost of slightly larger signatures than Falcon. For most applications (e.g. TLS certificates, code signing), a 2–4 KB signature is acceptable, so Dilithium is expected to see broad use. It’s also worth noting that Dilithium’s public keys are relatively small (e.g. ~1.3 KB) compared to some other PQ signatures, which is beneficial in certificate chains.
SPHINCS+: Stateless Hash-Based Signatures
SPHINCS+ is a stateless hash-based digital signature scheme. Unlike Kyber and Dilithium, its security does not rely on algebraic problems; it is based solely on the security of underlying cryptographic hash functions (e.g. SHA-256, SHAKE256). Hash-based signatures have been studied for decades (dating back to Lamport’s one-time signatures and Merkle trees) and are considered extremely conservative: if the hash functions behave like ideal random functions, the scheme is secure. SPHINCS+ was chosen by NIST as a backup signature scheme because it uses a very different hardness assumption (hash functions) than the lattice-based Dilithium and Falcon. This diversity is valuable in case an unexpected breakthrough occurs against lattice problems. The trade-off is that SPHINCS+ signatures are much larger and slower to generate than lattice-based signatures. However, SPHINCS+ has small keys (a few dozen bytes) and allows virtually unlimited signatures per key without maintaining any state.
High-Level Structure
SPHINCS+ combines three key techniques: (1) one-time signatures (OTS), specifically an optimized Winternitz OTS called WOTS+; (2) few-time signatures (FTS), specifically a scheme called FORS (Forest of Random Subsets) which can sign a message digest a limited number of times; and (3) a layered Merkle tree hierarchy (hypertree) to aggregate many one-time/few-time keys under one long-term public key. The design is intricate, but the idea is to leverage Merkle trees to authenticate many OTS public keys with one small root value. The stateless design means the signer need not track which OTS keys have been used – SPHINCS+ can safely reuse the same secret seed to derive fresh keys for each signature via randomization. This avoids the Achilles’ heel of earlier hash-based schemes like XMSS that required state management to ensure one-time keys weren’t reused.
Key Generation
A SPHINCS+ key pair is derived from a few master seeds. The public key is essentially the top-level Merkle tree root (plus maybe some public seed info), which is a short digest (e.g. 16, 24 or 32 bytes depending on security level). The secret key is a seed (plus optional secondary seeds) used to generate all other keys internally. For example, one seed may be used to generate the WOTS+ private keys for all leaves in all trees, and another seed is used for randomization (to randomize the message hashing). Because everything is generated pseudorandomly from seeds, the actual stored keys are very small (typically 64 bytes total for SPHINCS+ secret key, and 32 bytes for the public key). This is a stark contrast to lattice or code-based schemes where public keys are hundreds of bytes to kilobytes. SPHINCS+ keys are so small that including them in certificates or protocols is trivial – the cost lies entirely in the signature size and computation.
Signature Generation
To sign a message, SPHINCS+ performs several steps, descending through its hypertree:
1. Message Digest and Randomization: First, a hash of the message is computed. SPHINCS+ uses a randomized hashing approach (a random nonce is hashed with the message to ensure security against attackers with quantum access to the hash oracle). This produces a message digest $M_{\text{dig}}$ of a fixed length (e.g. 256 bits). The scheme then partitions this digest into parts to be signed by the FORS component.
2. FORS Few-Time Signature: The message digest $M_{\text{dig}}$ is split into $k$ pieces of $a$ bits each (with $k$ and $a$ as parameters such that $k \cdot a$ equals the digest length). Each piece is interpreted as an index into a binary hash tree of height $a$. The signer has $k$ such trees (each tree has $2^a$ leaves). The secret seeds generate one secret value for every leaf in each of these $k$ trees. To sign each $a$-bit portion $m_i$ of the digest, the signer reveals the secret value at leaf index $m_i$ in the $i$-th tree, along with the authentication path (the chain of node hashes up to the root of that tree). This is analogous to HORST/FORS in the original SPHINCS: by revealing the one leaf corresponding to the message chunk and a path, the verifier can compute the root of each of the $k$ trees. The $k$ roots (each a hash output of length $n$ bytes) are then hashed together to form a single value – essentially the FORS signature yields one combined value that is authenticated by the higher layers. The few-time property comes from the fact that if the same trees were used to sign many different digests, an attacker could gradually collect enough revealed leaves to forge; SPHINCS+ makes this unlikely because each signature pseudorandomly selects which FORS instance (i.e., which leaf of the hypertree, addressed via the ADRS domain-separation mechanism) is used, so any given FORS key set is expected to sign only a handful of distinct messages – hence “few-time”.
3. WOTS+ One-Time Signatures: The result of the FORS stage is essentially an intermediate public key (the $k$ tree roots compressed into one value). Now, SPHINCS+ signs this value using a one-time signature (WOTS+). A WOTS+ private key is a vector of $\ell$ secret strings, and the corresponding public key is obtained by hashing each string $w-1$ times, where $w$ is the Winternitz parameter controlling the time/size trade-off and $\ell$ grows with the digest length (plus a few extra chains for a checksum). To sign a hash value, the signer interprets it as a sequence of base-$w$ digits and, for each digit, reveals the corresponding chain element hashed that many times; the verifier continues hashing each revealed value up to a total of $w-1$ steps and checks that the results match the known public key, which works out if and only if the correct hash value was signed. In SPHINCS+, WOTS+ is mainly used as a building block in the Merkle tree: each WOTS+ public key is associated with a Merkle tree leaf. (A toy sketch of this chaining appears right after this list.)
4. Merkle Tree Authentication (Hypertree): The WOTS+ signature just described is used at each layer of a hypertree. SPHINCS+ organizes its many one-time key pairs into a hypertree with $d$ layers, each of height $h/d$ (where $h$ is the overall tree height, a parameter). At the bottom layer, there are $2^h$ WOTS+ key pairs, but they are grouped into $d$ layers of smaller subtrees to limit the total tree size. The FORS+WOTS signature effectively operates at the lowest layer: the FORS output is signed by a WOTS+ key which is a leaf of the lowest Merkle tree. The authentication path for that WOTS+ leaf is included, allowing the verifier to get the root of the lowest tree. That root is then signed by a WOTS+ key of the next layer’s tree, and so on, climbing up the hypertree. The signature includes one WOTS+ signature and Merkle path per layer. Finally, at the top layer, the verifier obtains the top-root of the entire hypertree, which should match the public key root stored initially. If it matches, the signature is valid. Essentially, the hypertree approach allows $2^h$ signatures per public key (where $h$ is e.g. 63 or 66 in the standardized parameter sets), by dividing into $d$ layers so that storing one giant $2^h$ tree is not needed – instead $d$ smaller subtrees are computed on the fly as needed. SPHINCS+ is stateless because the signer can derive new leaf keys for each signature by using distinct “address” inputs to the PRF that generates keys, rather than having to keep track of used ones.
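As referenced in the WOTS+ step above, the chaining idea is small enough to sketch directly. The code below is a bare Winternitz-style one-time signature, not WOTS+ proper: the per-step addresses and bitmasks that WOTS+ adds for multi-target security are omitted, and all parameter choices are illustrative.

```python
# Minimal Winternitz-style one-time signature sketch, in the spirit of WOTS+ but without
# the per-step addresses and bitmasks the real construction adds for tighter security.
# Parameters (SHA-256 digests, w = 16, i.e. 4-bit digits) are illustrative choices only.
import hashlib, os

W, N = 16, 32                    # Winternitz parameter and hash length in bytes
LEN1 = 2 * N                     # number of 4-bit message digits
LEN2 = 3                         # checksum digits: max checksum = LEN1*(W-1) = 960 < 16**3

def H(x, times=1):
    for _ in range(times):
        x = hashlib.sha256(x).digest()
    return x

def wots_digits(digest):
    d = []
    for b in digest:
        d += [b >> 4, b & 0xF]                       # base-w digits of the digest
    csum = sum(W - 1 - x for x in d)                 # checksum thwarts "raise a digit" forgeries
    d += [(csum >> s) & 0xF for s in (8, 4, 0)]      # encode checksum in LEN2 digits
    return d

def keygen():
    sk = [os.urandom(N) for _ in range(LEN1 + LEN2)]
    pk = [H(x, W - 1) for x in sk]                   # each chain hashed to its end
    return sk, pk

def sign(sk, digest):
    return [H(x, d) for x, d in zip(sk, wots_digits(digest))]       # advance each chain d steps

def verify(pk, digest, sig):
    d = wots_digits(digest)
    return [H(s, W - 1 - di) for s, di in zip(sig, d)] == pk        # finish the chains and compare

sk, pk = keygen()
digest = hashlib.sha256(b"FORS roots / subtree root to certify").digest()
assert verify(pk, digest, sign(sk, digest))
print("toy WOTS round-trip OK")
```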
Putting it together
A SPHINCS+ signature contains: the random nonce $R$, the $k$ revealed leaf nodes and auth paths from the FORS trees, the $d$ WOTS+ signatures (one per layer), and the $d$ Merkle authentication paths for those WOTS+ public keys. This yields a large but fixed-size signature. For the 128-bit security level, SPHINCS+ offers two main parameter sets: 128s (small) and 128f (fast). “Small” means smaller signature (and public key) at the cost of more computation, while “fast” means faster signing/verification but larger signatures. For example, SPHINCS+-128s produces a signature of 7,856 bytes, whereas SPHINCS+-128f produces 17,088 bytes. Similarly, at 192-bit security, signatures are ~16 KB (192s) or ~35 KB (192f), and at 256-bit security up to ~49 KB. These sizes are significantly larger than Dilithium’s few kilobytes. Verification in SPHINCS+ involves a lot of hashing (but these can be vectorized or parallelized to some extent), and signature generation involves even more hashing (to generate tree nodes on the fly). As an example, a SPHINCS+ signer might perform on the order of $10^6$ to $10^7$ hash function calls for one signature in the reference implementation. This is why SPHINCS+ signing is relatively slow (milliseconds rather than microseconds on a desktop CPU).
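Underpinning both the FORS trees and every hypertree layer is the same Merkle authentication-path check, which is simple enough to show in full. This sketch uses plain SHA-256 over concatenated children; SPHINCS+ additionally mixes a public seed and tree/node addresses into every hash call for multi-target protection, which is omitted here.

```python
# How a verifier recomputes a Merkle root from one revealed leaf plus its authentication
# path -- the mechanism used inside each FORS tree and at every layer of the hypertree.
# Plain SHA-256 over concatenated children is used here; SPHINCS+ additionally mixes in a
# public seed and tree/node addresses at every hash call for multi-target protection.
import hashlib

def node_hash(left, right):
    return hashlib.sha256(left + right).digest()

def merkle_root(leaves):
    level = list(leaves)
    while len(level) > 1:
        level = [node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def auth_path(leaves, index):
    path, level, idx = [], list(leaves), index
    while len(level) > 1:
        path.append(level[idx ^ 1])                        # sibling needed at this level
        level = [node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        idx //= 2
    return path

def root_from_path(leaf, index, path):
    node = leaf
    for sibling in path:                                   # climb from the leaf to the root
        node = node_hash(node, sibling) if index % 2 == 0 else node_hash(sibling, node)
        index //= 2
    return node

leaves = [hashlib.sha256(bytes([i])).digest() for i in range(8)]    # a height-3 toy tree
root = merkle_root(leaves)
for i in range(8):
    assert root_from_path(leaves[i], i, auth_path(leaves, i)) == root
print("toy Merkle auth-path check OK")
```

The signer only ever transmits one leaf value and a logarithmic number of sibling hashes per tree, which is why a single 32-byte root can authenticate an enormous number of one-time keys.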
Security
The security of SPHINCS+ reduces to the second-preimage resistance and few-target collision resistance of the hash functions it uses. Even a quantum computer provides only a quadratic speedup (Grover’s algorithm) for searching preimages, which can be countered by doubling output lengths. Indeed, SPHINCS+ chooses parameters so that even with quantum attacks, its security level holds (e.g. using 256-bit hashes to target 128-bit post-quantum security, since Grover’s algorithm would reduce 256-bit security to 128-bit). One advantage of hash-based signatures is that they are backed by the very well-studied properties of hash functions – there are “no known structural attacks” that drastically weaken good hash functions in the way that quantum algorithms weaken RSA/ECC. The scheme’s soundness was proven in the random oracle model, and additional research has provided tight security proofs under certain assumptions. Another upside is that SPHINCS+ is stateless, eliminating the risk of accidental state reuse that plagued earlier hash-based schemes (where reusing a one-time key twice could be catastrophic). The cost, as noted, is large signatures and slower performance, which is why SPHINCS+ is recommended mainly as a niche (for high-assurance applications) or as a hedge against unforeseen quantum advances. Its small public key (just a 256-bit hash) is a positive in scenarios like certificate chains or identity-based schemes where key size matters more than signature size. NIST selected SPHINCS+ to “provide diversity” in PQC, citing the high confidence in hash-based security despite the scheme being “somewhat larger and slower” than lattice-based ones.
Comparison with Classical Cryptography (RSA, ECC)
Security Basis
The fundamental difference between these PQC schemes and classical ones is the underlying hard problem. Classical public-key cryptography relies on problems like integer factorization (RSA) or elliptic curve discrete logarithms (ECC) which are vulnerable to Shor’s quantum algorithm – a sufficiently large quantum computer can solve those in polynomial time, breaking RSA/ECC completely. By contrast, Kyber and Dilithium rely on lattice problems (LWE, SIS) and SPHINCS+ on hash functions; no efficient quantum algorithm is known for these problems. Shor’s algorithm does not apply to lattices or hash functions, and while Grover’s algorithm can brute-force search, it only gives a quadratic speedup, which is mitigated by choosing larger key/output sizes (e.g. using 256-bit keys/hashes to target 128-bit post-quantum security). In short, RSA/ECC would be rendered insecure by a quantum computer, whereas Kyber, Dilithium, and SPHINCS+ are designed to remain secure. Classical algorithms are built around highly structured number-theoretic problems with compact secrets (RSA’s prime factors, ECC’s discrete log), whereas post-quantum algorithms rely on the average-case hardness of random lattice instances or on symmetric primitives with no trapdoor at all.
Key Sizes and Structures
Classical schemes produce very compact signatures and moderately small public keys. For example, an Ed25519 signature (on Curve25519) is 64 bytes, and the public key is 32 bytes. RSA-2048 signatures are 256 bytes, with a ~256-byte public key. In contrast, lattice-based schemes use high-dimensional objects (matrices or polynomials) for security, leading to larger keys: Dilithium’s public key is ~1.3–2.6 KB and signatures 2.4–4.5 KB. Kyber KEM public keys are ~800–1568 bytes and ciphertexts 768–1568 bytes depending on parameter set. These are larger than RSA/ECC, but not prohibitively so (on the order of a few kilobytes). SPHINCS+ is the outlier with tiny public keys (~32 bytes) but huge signature sizes (tens of KB). In scenarios like TLS, larger public keys and signatures increase handshake sizes. For example, replacing a 64-byte ECDSA (or 256-byte RSA) signature with a ~2.4 KB Dilithium signature in a certificate will increase the TLS handshake by a couple of kilobytes, which is usually acceptable. But a 17 KB SPHINCS+ signature would bloat it significantly, so SPHINCS+ might not be used for high-volume protocols unless bandwidth is not an issue. Another structural difference is that classical schemes have simple verification equations (e.g. checking $s^e \equiv m \pmod N$ for RSA, or a point-multiplication check for ECC), whereas PQC schemes involve more complex algorithms (hash tree traversal for SPHINCS+, polynomial arithmetic and rounding for lattice schemes).
Efficiency (Speed)
In terms of computational efficiency, lattice-based cryptography can actually be very competitive with, or even faster than, classical cryptography. Kyber encryption and Dilithium signing mostly use fast polynomial multiplication and hashing, which modern CPUs can do quickly (often benefiting from SIMD instructions). In practice, Kyber key exchange can be faster than ECDH (X25519) because it avoids expensive scalar-multiplication arithmetic. Dilithium signing is also quite fast – its throughput in software can be thousands of signatures per second on a single core, similar to RSA or ECC signature rates (and verification is faster than signing for Dilithium, as with RSA but unlike ECDSA). One important consideration is memory and CPU features: lattice algorithms benefit from cache-friendly FFT/NTT operations and sometimes use more RAM (for storing matrices or precomputed tables) than elliptic-curve operations. Classical RSA/ECC also benefit from very mature, often hardware-assisted implementations (e.g. big-integer and carry-less multiply instructions), whereas PQC code is still being optimized. That said, early benchmarks show PQC schemes are practical: reported figures put Dilithium2 signing and verification in the range of roughly 50–160 microseconds on a modern laptop, versus roughly 70–210 microseconds for ECDSA P-256 – the same order of magnitude. Kyber encapsulation can be as low as ~50 microseconds (vs. perhaps ~100 microseconds for an ECDH key agreement). Thus, performance is not a major barrier. Energy consumption studies have even found that lattice schemes can be more energy-efficient than RSA/ECC for IoT devices, since they complete faster and use simpler operations.
Ciphertext/Signature Overhead
Classical encryption (RSA) typically outputs ciphertext the same size as the modulus (e.g. 256 bytes for RSA-2048), whereas Kyber’s ciphertext is around 800–1500 bytes. For a given plaintext/key to send, Kyber’s overhead is larger but still small in absolute terms (<2 KB). In digital signatures, the difference is more pronounced: sending a Dilithium signature (≈2–3 KB) instead of an ECDSA signature (64 bytes) is roughly a 40x increase. This might impact use cases like blockchain or low-bandwidth channels. However, it’s worth noting that signature and key sizes scale differently with security. To get 128-bit classical security, RSA would need ~3072-bit keys (384-byte moduli) and 384-byte signatures – already 6x larger than an ECDSA P-256 signature. To get 256-bit security, RSA keys would need to be enormous (~15k bits), something hardly anyone uses. Dilithium at level 5 targets security comparable to AES-256 with a 4.5 KB signature and 2.6 KB public key, which is still smaller than what RSA-15360 would require. So beyond the 128-bit security level, classical schemes become totally impractical, whereas lattice schemes scale more gently. ECC scales better than RSA, but large ECC (e.g. a 521-bit curve for ~256-bit security) still has 132-byte signatures and similarly sized public keys, which is much smaller than any PQ alternative at that security strength. In summary, for 128-bit security, expect PQC signatures to be 1–2 orders of magnitude larger than ECC and PQC public keys to be up to an order of magnitude larger, except SPHINCS+ which is even larger in signature size.
Certificates and Protocol Integration
A big practical consideration is how these size differences play out in protocols. In a TLS handshake, we typically transmit a server’s certificate chain (which contains a public key and signature from an issuer). Replacing those with PQC algorithms means each certificate’s signature and subject public key might grow. A Dilithium-signed certificate with a Dilithium public key might be on the order of 5–6 KB in size (versus ~1 KB today with ECC certs). This is manageable on modern networks (TLS handshake packets of a few kilobytes are fine), but it might slightly increase handshake latency, especially on high-latency links or if multiple round trips are needed for very large cert chains. SPHINCS+ would be problematic here because a single signature of 17 KB could make the certificate 10× bigger – hence SPHINCS+ is less likely to be used in standard TLS certificates; it might find use in specialized contexts (like secure software updates, where bandwidth is less constrained but long-term security is paramount).
Summary
Classical crypto wins on compactness and decades of optimization, but is doomed against quantum attacks. PQC (Kyber, Dilithium) brings quantum safety at the cost of moderately larger sizes and algorithmic complexity that implementers must carefully handle. Already, experiments show a hybrid TLS handshake (ECDHE + Kyber) only adds a “very small” latency overhead (~1-2%) compared to classical alone. From a structural viewpoint, deploying PQC will change some assumptions – for instance, TLS implementers must handle larger messages, and hardware designers are exploring accelerators for polynomial math instead of just big-integers. But overall, the security gain (resistance to future quantum threats) is achieved with an acceptable trade-off in performance and size for Kyber and Dilithium. SPHINCS+ is more of a specialist tool, showcasing that if one is willing to handle large signatures, one can rely only on hash functions and get extremely robust security guarantees (even against hypothetical future math advances).
Comparison with Other Post-Quantum Candidates (not selected by NIST)
NIST’s PQC competition saw many submissions. Kyber, Dilithium, and SPHINCS+ were ultimately chosen, but other finalists and candidates are worth noting, either for their alternate approaches or lessons learned from their cryptanalysis. Below we compare our chosen algorithms with some notable not-selected schemes:
NTRU and NTRU Prime (Lattice – NTRU Family)
NTRU is a family of lattice-based encryption algorithms older than most – first proposed in 1996. An NTRU public key is a polynomial with certain properties in $\mathbb{Z}_q[x]/(x^n-1)$; encryption uses polynomial multiplication, while key generation requires computing polynomial inverses modulo the scheme’s small and large moduli. Variants like NTRU-HRSS and NTRU Prime were in the competition. They are similar in spirit to Kyber (both are lattice-based KEMs) but with some differences: NTRU works directly in a polynomial ring with a convolution product, rather than using LWE-style matrices. NTRU also has no explicit “error” term – the hardness comes from finding a short polynomial that is a quotient of two public polynomials.
Performance-wise, NTRU encryption and decryption are fast and key sizes are on the same order as Kyber (a bit larger for comparable security, but in the same ~1–2 KB ballpark). In fact, OpenSSH already implements a hybrid of Streamlined NTRU Prime with X25519 for its key exchange (enabled by default in recent releases).
NIST did not select NTRU for the first set of standards, likely because Kyber and Saber showed slightly better efficiency and had simpler constructions. However, NTRU is very well studied and has survived cryptanalysis for over two decades. It remains a strong alternate candidate and may yet be standardized in the future (NTRU Prime was an alternate candidate in the third round). One difference: Falcon, the other lattice signature standardized, is actually built on NTRU lattices – it uses an NTRU trapdoor for signatures. That shows NTRU’s versatility. In summary, NTRU encryption offers similar security to Kyber but was edged out by Kyber’s more straightforward module-LWE foundation and its excellent combination of performance and key size.
Saber (Lattice – Module-LWR)
Saber was another lattice KEM finalist (very close in design to Kyber). It is based on the Learning-with-Rounding (LWR) problem, which is like LWE but instead of adding an error term, the ciphertext is “rounded” to introduce noise. Saber used power-of-two moduli and had the advantage of not requiring a number-theoretic transform (making it easier to implement on certain platforms, e.g., no need for NTT primes).
Performance and sizes of Saber were comparable to Kyber (Saber’s keys a bit larger, but ciphertexts a bit smaller; overall within 10-20%). NIST noted Saber’s performance was also excellent. The reason only Kyber was picked could be to avoid standardizing two so-similar lattice KEMs – Kyber and Saber both were strong, but Kyber had a slight edge in consolidated advantages (and the community might rally around one standard for interoperability). Saber remains a viable algorithm and could be standardized later if needed for diversity (it uses a different modulus and no Gaussian sampling either).
BIKE and HQC (Code-Based KEMs)
BIKE (Bit Flipping Key Encapsulation) and HQC (Hamming Quasi-Cyclic) were code-based encryption schemes (alternate KEM candidates that advanced to NIST’s fourth round). Their security is based on the difficulty of decoding random linear codes (related to the NP-hard syndrome decoding problem). BIKE in particular is built on QC-MDPC (quasi-cyclic moderate-density parity-check) codes. It uses a bit-flipping decoding algorithm with some probability of failure.
Advantages: Code-based schemes like BIKE use problems that have been studied since the 1970s (e.g., McEliece’s classic code-based cryptosystem). They are believed to be very secure (no quantum algorithm much better than classical is known for general decoding).
Disadvantages: The public keys tend to be large (for BIKE and HQC on the order of tens of thousands of bits, or a few KB, and Classic McEliece even uses public keys around a megabyte for top security levels). Also, BIKE and HQC had some efficiency issues: BIKE’s decoder can be slow and had potential vulnerabilities to side channels (because a decoding failure or certain weight patterns could leak information). In fact, during the process, cryptanalysts found structural attacks on some code-based proposals (not completely breaking them, but enough that parameters had to be tweaked).
BIKE made it to the 4th round as a candidate for future standardization, but NIST didn’t choose it in the first batch. The rationale given was that lattice-based schemes offered much better performance and smaller keys, while code-based schemes offer longer-term confidence but at a cost. BIKE’s private keys are small (essentially the description of a sparse polynomial), similar to lattice schemes, but its public keys and ciphertexts are larger and its key generation/decapsulation slower. For example, BIKE level-1 had a public key of ~1270 bytes (still not bad) and a ciphertext of similar size (a bit over a kilobyte), and it needed iterative decoding.
Classic McEliece (another code finalist) is extremely conservative (unbroken since 1978) but uses a 1 MB public key, which is impractical for many uses despite very fast encryption/decryption. In summary, code-based KEMs like BIKE and HQC were not selected immediately, but NIST is “continuing into another round to evaluate efficient code-based algorithms” for potential standardization. If lattice schemes ever encounter problems, these code-based alternatives will be the next line of defense, trading off bandwidth for proven security.
Rainbow (Multivariate Signature)
Rainbow was a finalist signature scheme based on the multivariate quadratic equations problem (MQ). Its public key is a set of quadratic polynomials for which the signer knows a trapdoor that allows solving them for given outputs. Historically, multivariate schemes like Rainbow promised small signatures and fast signing, but somewhat large public keys. Unfortunately, Rainbow was catastrophically broken in 2022: cryptanalysts found a key recovery attack that could recover a Rainbow private key in about 2 days on a single core (for the lowest security level). Specifically, Ward Beullens’ attack used a mix of algebraic techniques (like MinRank and differential analysis) to crack Rainbow’s structure. This attack showed that the security of Rainbow was overestimated – the “weekend on a laptop” break of Rainbow essentially ended its viability. Rainbow’s failure reaffirmed the cautious approach: even though multivariate cryptography had some efficient designs, they often proved brittle under cryptanalysis.
Another multivariate candidate, GeMSS, also had weaknesses and did not advance. As a result, NIST did not select any multivariate scheme. The lesson is that MQ-based systems, while enticing for their performance, have a history of surprise breaks, so none made the cut. Dilithium and Falcon, by contrast, withstood all cryptanalysis and had more confidence.
Picnic (Symmetric-based Signature)
Picnic is an example of a “zero-knowledge proof” signature, using symmetric ciphers and MPC (Multi-Party Computation) in zero-knowledge to sign messages (it’s based on the security of block ciphers and hash functions). Picnic had relatively large signatures (several tens of KB) and slow performance compared to Dilithium or Falcon. Its approach was very different (involving producing a transcript proving you know a secret key of a block cipher without revealing it). NIST did not select Picnic, probably because SPHINCS+ covers the “symmetric security” niche with arguably more straightforward design (hash trees instead of complex ZK proofs) and because Picnic signatures were quite large and slower to verify. There were also newer variants like Picnic2/3 and improvements, but ultimately, SPHINCS+ was seen as the more conservative symmetric-based option for standardization.
SIKE (Isogeny-based KEM)
SIKE (Supersingular Isogeny Key Encapsulation) was an elegant candidate based on the algebraic structure of elliptic-curve isogenies. It had extraordinarily small keys (public keys of only a few hundred bytes, the smallest of all PQC KEM candidates) and moderately sized ciphertexts, making it attractive for protocols with tight bandwidth.
It withstood cryptanalysis until late in the competition, when in 2022 a breakthrough classical attack by Castryck and Decru exploited a mathematical weakness in SIKE’s instantiation, breaking SIKE’s security in under an hour on a single core. This was a shock, as SIKE’s security was believed to rely on a hard problem related to supersingular isogeny graphs. The attack showed a flaw in the specific parameters (it reduced the problem to a hidden isogeny problem that could be solved via genus 2 curves).
With SIKE broken and multivariate schemes broken, the pool of diverse approaches shrank. NIST’s analysis in 2022 explicitly mentions that “multivariate and isogeny-based algorithms have been recently shown vulnerable to powerful classical attacks”, reinforcing their decision to rely on lattice and hash-based schemes for now.
In summary, Kyber vs NTRU/Saber: all lattice KEMs, similar security – Kyber was chosen for slightly better all-around performance and simplicity; Dilithium vs Rainbow/Picnic: Dilithium’s lattice basis proved solid while Rainbow fell to algebraic attacks and Picnic was less efficient; SPHINCS+ vs Others: SPHINCS+ was chosen over Picnic as the stateless hash-based representative for its simpler design (though both have large signatures). Code-based schemes (BIKE, HQC, McEliece) remain contenders for future standardization, especially to hedge against any unforeseen advances against lattice schemes. For now, NIST’s first standards cover a lattice KEM (Kyber), a lattice signature (Dilithium, plus Falcon as alternate), and a hash-based signature (SPHINCS+), as these offered the best mix of security and practicality among the candidates.
Implementation Considerations and Real-World Deployment
Transitioning to post-quantum algorithms raises various practical considerations: performance on different platforms, integration into existing protocols, and ease of adoption.
Software Performance and Benchmarks
The reference implementations of Kyber and Dilithium are in C and already run efficiently on standard processors. Optimized versions use vector instructions (AVX2) to speed up the NTT and hashing. For example, on an Intel Haswell CPU, Kyber-512 key generation/encryption/decryption can be done in ~0.03/0.045/0.06 million cycles respectively with AVX2 optimizations – that’s only a few tens of microseconds. Dilithium2 signing/verification might take a few hundred thousand cycles each, which is also on the order of microseconds. These numbers indicate that on desktops and servers, PQC will not be a bottleneck; the cost per operation is similar to or less than the cost of a TLS handshake’s other operations (like network latency or symmetric cipher overhead).
On smaller devices (IoT microcontrollers), the picture is more mixed: lattice operations require more memory (e.g. to store the matrix $A$ or big arrays for NTT). However, experiments on ARM Cortex-M4 (a common microcontroller) show Kyber-512 keygen takes ~655k cycles and encapsulation ~865k cycles (in a clean implementation), which at 24 MHz is about 27 ms and 36 ms, respectively – possibly acceptable for low-rate operations like a key agreement.
There are also optimized implementations that can significantly cut these numbers. Hardware acceleration is a promising area: Because lattice cryptography is basically matrix math and hashing, hardware like FPGAs or ASICs can accelerate them using parallelism. Studies have designed shared hardware IP cores for NTT that can be used by both Kyber and Dilithium, since both need polynomial multiplication. An FPGA implementation of Dilithium was demonstrated with very low latency (achieving thousands of signatures per second on modest hardware).
Similarly, integrating PQC into cryptographic libraries (like OpenSSL, wolfSSL) is underway – projects like Open Quantum Safe (liboqs) provide ready-to-use implementations to experiment with PQC in TLS, SSH, etc.
Protocol Integration (TLS, IPsec, etc.)
A major aspect of adoption is incorporating these algorithms into existing security protocols. The good news is that schemes like Kyber and Dilithium were designed to be drop-in replacements for current public-key algorithms. For instance, TLS 1.3 can use Kyber in place of (or in addition to) X25519 for key exchange. Draft RFCs exist for hybrid key exchanges in TLS that combine X25519 + Kyber (as a transitional approach). In such a hybrid, the client and server perform both an ECDH and a Kyber encapsulation and combine the results to derive the shared secret – this way, even if one algorithm is broken, the session remains secure if the other holds.
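A hedged sketch of the combination step: the function below simply hashes both shared secrets together with a transcript value, standing in for the HKDF/TLS-key-schedule step that real hybrid drafts specify; the `x25519_*` and `kyber_*` names in the commented flow are hypothetical placeholders, not a real library API.

```python
# How a hybrid handshake combines a classical ECDH secret with a Kyber encapsulation:
# both shared secrets are concatenated (in an order fixed by the spec) and fed through a
# KDF, so the session key stays safe as long as at least one component remains unbroken.
# The x25519_*/kyber_* calls in the commented flow are hypothetical stand-ins for a real
# curve and PQC library; real hybrid drafts feed the concatenation into HKDF / the TLS 1.3
# key schedule rather than a bare hash.
import hashlib

def hybrid_secret(ss_classical: bytes, ss_pq: bytes, transcript: bytes) -> bytes:
    """Derive one session key from both shared secrets plus a handshake transcript hash."""
    return hashlib.sha256(b"hybrid-kex-v1" + ss_classical + ss_pq + transcript).digest()

# Client side, conceptually:
#   eph_priv, eph_pub = x25519_keygen()                # classical share
#   ct, ss_pq         = kyber_encaps(server_kem_pk)    # post-quantum share
#   send(eph_pub, ct)
#   key = hybrid_secret(x25519_dh(eph_priv, server_pub), ss_pq, transcript_hash)
# The server mirrors this with kyber_decaps(server_kem_sk, ct) and its own DH computation;
# an attacker must break BOTH X25519 and Kyber to recover `key`.
```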
Cloudflare and Google conducted experiments where browsers and servers successfully negotiated TLS 1.3 with Kyber; they reported nearly 2% of Cloudflare’s TLS connections were using a post-quantum hybrid in late 2022. For VPN protocols like IPsec/IKE or WireGuard, similar integration is possible: one can define new cipher suites that specify a PQ KEM for key establishment. The main challenge is handling the bigger message sizes. In TLS, handshake messages might grow beyond one packet, but TLS has fragmentation and can handle that.
Another consideration is certificate signatures in protocols: a TLS server certificate signed with Dilithium will need the client to support verifying Dilithium. This means TLS implementations must include Dilithium verification code and certificate authorities may need to begin issuing PQC-signed certificates (or dual signatures).
The industry is already preparing: some CAs have tested issuing certificates with hybrid ECDSA+Dilithium signatures to maintain compatibility. Standards bodies like the IETF have drafts for encoding PQC keys and signatures in protocols (e.g., for COSE/JOSE in JSON or CBOR web tokens, etc.). IPsec can accommodate PQC in its IKE key exchange (there’s interest in using KEMs in IKEv2). In SSH, OpenSSH added support for Streamlined NTRU Prime (sntrup761) combined with X25519 for key exchange (it has been the default in recent releases), paving the way for adding Kyber as well. So, compatibility is largely a software update issue – the protocols were designed to be crypto-agile and can carry larger keys as needed.
Challenges in Real-World Adoption
One challenge is standardization and interoperability. Until formal standards (FIPS, RFCs) are published, widespread adoption is slow. NIST has published draft standards for these algorithms (FIPS 203 for Kyber, FIPS 204 for Dilithium, and FIPS 205 for SPHINCS+), and once the final versions are out, organizations can begin requiring compliance.
Another challenge is trust and testing: PQC algorithms are new, so developers might worry about side-channel attacks or implementation bugs. For instance, lattice algorithms involve a lot of arithmetic that must be constant-time to avoid leaking secrets via timing or cache patterns. Ensuring all implementations are side-channel resistant (against timing, power analysis, fault injection) is an ongoing effort. Some research has already looked at things like using bit-slicing or arithmetic blinding to protect implementations. Because Kyber and Dilithium have had several years of public scrutiny, by the time of standardization they’ll have hardened implementations (in C and assembly) and perhaps formal verification of certain properties.
Another adoption issue is ecosystem inertia: systems like smart cards, TPM chips, etc., which currently have ECC/RSA hardware, will need new firmware or hardware to support PQC. There’s a cost associated with upgrading those. In some cases, hybrid solutions are proposed (e.g. use classical crypto as a backup in case a client doesn’t support PQC). The US government has directives (like a recent White House memo) pushing agencies to inventory their cryptographic usage and prepare to roll out quantum-safe solutions in the next few years. This top-down pressure will drive vendor support.
Benchmarks on Different Platforms
Preliminary benchmarks show Kyber and Dilithium performing well not just on Intel/AMD CPUs but also on ARM (smartphones) and even small Cortex-M microcontrollers. For instance, on an ARM Cortex-M4 @ 24 MHz, Dilithium2 can sign in about 75 ms and verify in 13 ms (using 16 KB of RAM) – which might be acceptable for some IoT devices. Kyber-512 key exchange on the same platform might take ~35 ms for encapsulation and ~45 ms for decapsulation. These numbers can likely be improved, and on faster embedded cores (say a 100 MHz Cortex-M7) those times shrink several-fold. There’s active work on optimizing PQC for embedded targets (including using DSP instructions for the NTT, etc.). Hardware implementations (ASICs) could potentially perform a Kyber encapsulation in under a microsecond in the future, which would far outperform software. This opens the door to PQC in high-throughput environments (like securing 5G links or data center traffic) with minimal overhead.
Memory and Network Considerations
While CPU overhead is manageable, the increased sizes affect memory (for storing keys) and network (for transmitting them). On memory: a device that stored thousands of ECC keys might need to allocate more space if those become Dilithium keys (which are ~2KB each). On network: protocols with very tight MTU (max transmission unit) limits might need to adjust. For example, if a handshake message grows beyond 1500 bytes, it could cause IP fragmentation on networks with small MTUs. This is rarely a problem on modern networks for the sizes we talk about (2–3 KB), but it’s something to test. Satellite links or constrained radio networks might care about every byte; in those cases, schemes like Falcon (small sig) or even Classic McEliece (small ciphertext) could be considered, but each has trade-offs.
Backward Compatibility
During a transition, not everyone will support PQC at once. The likely strategy is a hybrid mode: for example, a TLS server could present both an RSA and a Dilithium certificate; a client that is PQC-aware will verify Dilithium (and ignore RSA or use it redundantly), while a legacy client will just use RSA and still connect (though without quantum-safe security).
Similarly, in key exchange, sending an extra PQC key share alongside the classical one ensures security for PQ-aware peers and compatibility for others. This dual operation naturally has overhead (performing two handshakes worth of crypto), but since these operations are fairly fast, it’s acceptable for a transition period.
In the long run (perhaps a decade from now), the classical algorithms can be phased out and only PQC kept.
Regulatory and Compliance
Organizations often wait for standards (FIPS validation, Common Criteria, etc.) before deploying new crypto. FIPS 140-3 modules will need to incorporate these algorithms and get certified. Beyond the algorithm FIPS themselves, NIST will likely publish companion Special Publications covering key establishment and migration guidance. Once those are out, government and industry can include PQC in procurement requirements. On the national-security side, NSA’s older Suite B (which was ECC-based) has already given way to the CNSA suite, and CNSA 2.0 names Kyber and Dilithium for national security systems. This top-level blessing will spur adoption.
Summary
In summary, implementing Kyber, Dilithium, SPHINCS+ is feasible with today’s technology. They generally require only standard operations (integer arithmetic, hashing) which are ubiquitous. Experimentation by companies like Cloudflare has shown that websites can enable PQC key exchange with minimal impact. The real-world challenges are primarily engineering issues: updating code, handling larger data, and ensuring no security regressions (like side-channel leaks or integration bugs).
We are still in the early phase of deployment – testing in parallel environments. Over the next few years, expect to see these algorithms integrated into SSL/TLS libraries, VPN clients, secure email tools, etc., initially as options and later as defaults.
One particular challenge is long-term signature verification: for example, code signing and digital signatures that must remain valid for decades might consider using SPHINCS+ despite its size, for its extremely robust security margin. Balancing such use-cases (maybe using dual-signatures: one lattice-based, one hash-based) could be a strategy for critical systems.
The community is actively working on reference integrations (IETF drafts for use in TLS, making sure protocols can negotiate these algorithms, etc.), so the path to adoption is steadily progressing.
Security Assumptions and Known Attack Vectors
Finally, we consider the security foundations of these algorithms and the known avenues for attacks (quantum or classical), along with how these are mitigated.
Resistance to Quantum Attacks
The primary design goal is to resist cryptanalytic attacks from quantum computers, notably Shor’s algorithm and Grover’s algorithm. Shor’s algorithm factors integers and computes discrete logs in polynomial time, which breaks RSA, DSA, DH, ECC, etc. However, Shor’s approach relies on the structure of finite fields or elliptic curve groups; it does not apply to the lattice problems or hash functions underlying our PQC algorithms. There is no known analogous quantum algorithm that can solve lattice problems (like finding short vectors or solving LWE) significantly faster than known classical algorithms – at best, Grover’s generic quantum search could quadratically speed up brute-force searches for secrets.
For hash-based schemes (SPHINCS+), a quantum adversary can attempt a preimage attack with Grover’s algorithm, which would reduce an $N$-bit hash’s security to $N/2$ bits. SPHINCS+ anticipates this by using 256-bit hashes to target ~128-bit post-quantum security (since Grover would make it about 128 bits).
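In back-of-the-envelope terms (an illustrative restatement of the Grover trade-off, not a formal security bound):

$$ O\!\left(2^{n}\right)\ \text{(classical preimage search)} \;\longrightarrow\; O\!\left(2^{n/2}\right)\ \text{(Grover)}, \qquad n = 256 \;\Rightarrow\; 2^{256/2} = 2^{128}. $$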
Lattice schemes also consider quantum attackers in their parameter choices: when we say Kyber-512 targets ~AES-128 security, that already factors in the best known quantum attacks (which are basically the same lattice reduction algorithms with maybe minor polynomial overhead). In fact, lattice reduction (like BKZ algorithm) might see only modest speedups with quantum techniques (some research suggests a small polynomial speedup), so Kyber’s security estimates (around 110-120 bits for Kyber-512) remain valid.
NIST defined its security categories in terms of symmetric primitives: Level 1 requires that any attack be at least as costly as a key search on AES-128 (classical or quantum), with higher levels pegged to AES-192, AES-256, and collision search on SHA-2. All three schemes meet or exceed their claimed levels by a comfortable margin. Therefore, no known quantum attack endangers Kyber, Dilithium, or SPHINCS+ – breaking them appears to require solving problems like finding a short lattice vector or a hash preimage, which current algorithms, quantum or not, can only do in exponential time.
Classical Cryptanalysis
While quantum computers are the motivation, the schemes must also resist all classical attacks. For lattice schemes, the main attacks are lattice reduction algorithms such as BKZ, enumeration, sieving, etc., which aim to solve the underlying lattice problems (like finding the secret vector $s_1$ from the public $A, t$ in Dilithium, or solving LWE for Kyber). The parameters (dimension, modulus, noise) of Kyber and Dilithium were chosen after extensive analysis to ensure these attacks would require an astronomically large amount of computation (estimates > $2^{140}$ operations for security level 1, for example).
As a concrete measure, the best known attacks against Kyber-512's public key yield core-SVP estimates of roughly 118 bits, and attacks against a ciphertext about 112 bits; those raw figures sit below 128, but once the real cost of the attacks (memory access and other sieving overheads) is taken into account, Kyber-512 is judged to meet the Category 1 (AES-128) target. Dilithium's security relies on two related problems (MLWE and MSIS); its proofs in the random oracle model are not tight, but the team selected parameters conservatively to account for that and for advances in lattice algorithms.
There have been cryptanalysis efforts exploring things like hybrid attacks (combining meet-in-the-middle with lattice reduction) or exploiting the structure (like the module structure) – but so far no attack substantially better than general lattice reduction has been found. One must remain vigilant: lattice cryptanalysis is an active field, and as we push these schemes to wider use, new optimizations might emerge (for example, an improved BKZ variant or a clever trick to shave off a few bits of security). This is why NIST required a large security margin. It’s also why multiple lattice schemes were considered – diversity even within lattices (Module-LWE vs NTRU vs LWR) is useful in case one structure has a weakness.
For hash-based SPHINCS+, classical cryptanalysis would mean finding collisions or (second) preimages for the hash functions used. SHA-256 and SHAKE256 are very well studied; collisions have been found for SHA-1, but none are known for the SHA-2 or SHA-3 families. Even if a classical attack on, say, SHA-256 did emerge (which would be shocking), SPHINCS+ could be instantiated with another hash – the design is hash-agnostic and even offers a variant built on Haraka, a specialized short-input hash. The security reduction for SPHINCS+ essentially says that any successful forger must have found a preimage for one of the one-time hashes or broken a WOTS+ or FORS structure, which ultimately reduces to breaking the hash's preimage or collision resistance. No structural weaknesses in SPHINCS+ have been found – its security has been scrutinized and even formally verified in a proof assistant for certain instantiations.
In summary, classical threats boil down to brute force or improvements on brute force: lattice attacks for Kyber/Dilithium, hash cryptanalysis for SPHINCS+. As of now, the chosen parameters defeat all known methods with a large safety margin. NIST's confidence was reflected in statements that these schemes “have been continuously analyzed by experts for several years” and that no effective attacks were found.
Side-Channel Attacks
One important class of attacks targets not the math but the implementation: side-channel attacks (timing, power analysis, fault injection). With RSA/ECC, years of experience showed how many defenses are needed (constant-time multipliers, blinding, etc.), and the new schemes likewise need care.
Timing attacks: because lattice operations involve array indexing and conditional reductions, an unprotected implementation might leak information via timing or cache-access patterns. For example, a straightforward NTT implementation might have data-dependent memory accesses (bit-reversal patterns) that could be exploited. The reference implementations of Kyber and Dilithium are written to be constant-time – no secret-dependent branches or array indices – and this property is tracked by tooling (the liboqs project, for instance, records which implementations avoid secret-dependent branching). Early versions of some lattice schemes had potential cache side channels, but those have been mitigated with techniques such as constant-time butterflies in the NTT and avoiding secret-dependent rejection-sampling patterns.
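Two toy Python helpers illustrate the constant-time coding style. Python itself offers no real timing guarantees, so these only sketch the pattern that production C or assembly implementations follow, and the helper names are mine.

```python
import hmac

def ct_equal(a: bytes, b: bytes) -> bool:
    """Compare byte strings without an early exit - e.g. the re-encryption
    check inside a KEM decapsulation must not branch on secret data."""
    return hmac.compare_digest(a, b)

def ct_select(bit: int, a: bytes, b: bytes) -> bytes:
    """Branch-free selection (a if bit == 1 else b) via masking, instead of
    an if/else on a secret-dependent condition. Assumes len(a) == len(b)."""
    mask = -bit & 0xFF                      # 0xFF when bit == 1, 0x00 when 0
    return bytes((x & mask) | (y & (mask ^ 0xFF)) for x, y in zip(a, b))
```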
Power analysis: side-channel researchers have started analyzing these PQC schemes. One known attack on a different LWE-based KEM (NewHope) recovered the key from a single decapsulation trace by analyzing noise-related leakage – but Kyber's design (the FO transform with deterministic re-encryption rather than a reconciliation step) is somewhat different and more robust. Still, implementing these on small devices will require countermeasures such as masking (randomizing the representation of secret polynomials) if adversaries can measure power or EM emissions. Fault attacks must also be considered (e.g. flipping a bit during decapsulation to see whether a check passes or fails, which might reveal information). The FO transform actually helps here: even if a fault causes decapsulation to produce the wrong key, the attacker learns nothing secret unless they can observe some comparison or output difference. Differential fault analysis of lattice algorithms is an active research area, but it is still early.
For SPHINCS+, side channels are less of an issue – it is mostly hash computations, which are easy to make constant-time, and the long-term secret is essentially a seed that gets expanded internally, with each signature using fresh one-time keys. A timing attack on a hash function is unlikely to leak key bits. The main implementation requirement is never to reuse seeds or counters in a way that could cause two signatures to use the same one-time key. Because SPHINCS+ is stateless by design, it includes a randomizer input for each signature; as long as the implementation draws that from a good randomness source, or derives it with a proper PRF from the secret seed and the message, it should be fine.
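A rough sketch of that derivation follows; SPHINCS+ calls this step PRF_msg, but the exact input encoding here is simplified and should not be read as the standardized formula.

```python
import hashlib
import os

def message_randomizer(sk_prf: bytes, msg: bytes, opt_rand: bytes = b"") -> bytes:
    """Derive a fresh per-signature randomizer from a secret PRF seed,
    optional external randomness, and the message itself. Even with
    opt_rand fixed (fully deterministic signing), distinct messages yield
    distinct randomizers, so one-time keys are never reused."""
    if not opt_rand:
        opt_rand = os.urandom(32)   # external randomness is optional
    return hashlib.shake_256(sk_prf + opt_rand + msg).digest(32)
```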
Decryption Failure Attacks
Lattice KEMs sometimes have a non-zero probability of decryption failure (when the added noise plus message causes a wrong rounding). If these failures can be detected by an attacker (e.g., by trying many ciphertexts and seeing which fail to decrypt correctly), it could leak info about the secret.
Kyber was designed with an extremely low failure rate (about $2^{-164}$ for Kyber-768), so in practice failures should never occur during normal operation. Moreover, Kyber's decapsulation hashes a secret seed together with the ciphertext to derive a backup key when the re-encryption check fails, so it never explicitly signals a failure – this strategy (implicit rejection, part of the FO transform) means an attacker cannot directly distinguish a failed decryption from a success: both yield a key, just that one is “correct” and the other is pseudorandom. This prevents an attacker from gaining an oracle that answers “did decryption succeed?”, closing a potential side channel. Thus, chosen-ciphertext attacks that rely on inducing failures (like some proposed against earlier NIST candidates) are mitigated by design in Kyber.
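The decapsulation flow can be sketched as follows, with `cpa_encrypt` and `cpa_decrypt` standing in for the underlying IND-CPA lattice encryption (hypothetical callables); the hashing and formatting are simplified relative to the actual Kyber specification.

```python
import hashlib
import hmac
from typing import Callable

def shake(data: bytes, n: int) -> bytes:
    return hashlib.shake_256(data).digest(n)

def fo_decapsulate(sk: bytes, z: bytes, pk: bytes, ct: bytes,
                   cpa_decrypt: Callable[[bytes, bytes], bytes],
                   cpa_encrypt: Callable[[bytes, bytes, bytes], bytes]) -> bytes:
    """FO-style decapsulation with implicit rejection. z is a secret
    rejection seed stored alongside the secret key."""
    m = cpa_decrypt(sk, ct)                    # tentative message
    kr = shake(m + shake(pk, 32), 64)          # re-derive key material + coins
    k_bar, coins = kr[:32], kr[32:]
    ct_check = cpa_encrypt(pk, m, coins)       # deterministic re-encryption
    if hmac.compare_digest(ct, ct_check):      # constant-time comparison
        return shake(k_bar + shake(ct, 32), 32)    # "real" session key
    # Implicit rejection: output a pseudorandom key rather than an error,
    # so the attacker cannot tell a failed decryption from a valid one.
    return shake(z + ct, 32)
```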
Known Weaknesses or Open Questions
As of their selection, none of Kyber, Dilithium, or SPHINCS+ have known structural weaknesses. However, cryptographers remain cautious. For lattices, a big open question is: How secure are structured lattices (like module lattices) in the long term? We have confidence based on reduction proofs (module-LWE reduces to standard worst-case lattice problems) and extensive cryptanalysis, but these problems haven’t been around for centuries like factoring. Perhaps new math insights (e.g. a better way to solve the SIS problem) could emerge, though it’s unlikely to be a complete break – more likely an incremental improvement requiring parameter tweaks. That’s why NIST chose somewhat conservative parameters (e.g., targeted ≥ 256-bit classical security for highest level instead of just 128-bit).
For hash-based signatures, one open research topic is optimizing them – SPHINCS+ chose not to require state, but stateful hash-based signatures like XMSS are already standardized in RFCs. Those have smaller signatures (around 2-3KB) but require keeping track of usage. In specialized contexts (like certificate authorities signing CRLs or OS updates), a stateful scheme might be acceptable and more efficient. SPHINCS+ basically trades larger signatures to eliminate the need for state. As computing and storage improve, a 10KB signature might be trivial to handle, so stateless schemes could dominate. If someone found a slight weakness in, say, the way SPHINCS+ does its few-time signatures (FORS) or an issue with pseudorandom key generation (some bug in instantiation), that would be a problem – but so far nothing like that has turned up in extensive analysis.
Another area of active research is combined attacks – for instance, what if an attacker has a small quantum computer and a large classical computer? Could they do a meet-in-the-middle where part of the work is done quantumly (to gain a square-root factor) and the rest classically? This is being studied for lattice schemes, and the security categories already assume close to worst-case quantum speedups. There is also the question of multi-target attacks: if an attacker targets many keys at once (say a million public keys), do the schemes lose security? Hash-based signatures do have to consider multi-target effects (finding a preimage for any one of many hash outputs), and the SPHINCS+ parameters assume an adversary may attack many signatures at once. Lattice schemes are usually analyzed one key at a time; if an attacker can amortize work over many public keys, that can in some cases give a small advantage, but known lattice attacks (such as dual-lattice attacks) do not gain much from multi-target scenarios without significant trade-offs.
Post-Quantum Safety of Symmetric Primitives
All these schemes also rely on traditional symmetric crypto (SHA-3, AES, etc.) for their internal hash and pseudorandom functions, and those are assumed to be secure against both classical and quantum attackers. If a weakness in SHAKE256 or SHA-256 were discovered, it could affect SPHINCS+ or Dilithium (which uses SHAKE256 extensively). That is considered very unlikely given the scrutiny of SHA-2/3, but cryptography is never 100% certain. A belt-and-suspenders approach is sometimes suggested: Dilithium, for example, offers an alternative “-AES” variant that swaps SHAKE for AES-256 in counter mode to diversify implementations. SPHINCS+ offers “simple” vs. “robust” variants; the robust variant mixes in derived bitmasks to protect against any unexpected structure in the hash function output, at a modest cost in speed (NIST's draft standard ultimately carries forward the simple variants).
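The difference between the two tweakable-hash instantiations can be illustrated roughly as follows (simplified; the real construction also fixes input encodings and address formats).

```python
import hashlib

def thash_simple(pk_seed: bytes, adrs: bytes, msg: bytes) -> bytes:
    """'Simple' variant: hash the public seed, address, and input directly."""
    return hashlib.shake_256(pk_seed + adrs + msg).digest(32)

def thash_robust(pk_seed: bytes, adrs: bytes, msg: bytes) -> bytes:
    """'Robust' variant: first XOR the input with a bitmask derived from the
    seed and address, so the proof can rest on weaker hash assumptions."""
    mask = hashlib.shake_256(pk_seed + adrs).digest(len(msg))
    masked = bytes(a ^ b for a, b in zip(msg, mask))
    return hashlib.shake_256(pk_seed + adrs + masked).digest(32)
```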
Ongoing Analysis
The PQC algorithms will continue to be analyzed publicly. Already, since the announcement, researchers worldwide have been hammering on Kyber and Dilithium, either tightening their security reductions or hunting for weaknesses. So far this has increased confidence – for example, recent work provided a tighter security proof for SPHINCS+ (repairing minor flaws in an earlier proof), and other work examined Dilithium in the quantum random oracle model (Fiat-Shamir with aborts needed careful QROM analysis, and the results show it remains secure). Academic side-channel researchers have begun publishing prototype attacks and countermeasures. By the time these algorithms are deployed at scale, they will have gone through another round or two of hardening.
In conclusion, Kyber, Dilithium, and SPHINCS+ currently have no known feasible attack vectors – neither quantum algorithms nor classical cryptanalytic techniques can break them with realistic resources. Their security rests on well-studied problems: lattice problems that have withstood decades of research, and hash functions that are standard and trusted. The biggest “attacks” so far have been implementation-specific or require unrealistic capabilities. For instance, an attacker with a large quantum computer might speed up brute force, but the parameters already account for that (hence larger key sizes than one might use classically). If a breakthrough did occur (say someone found a sub-exponential algorithm for lattice problems), there are fallback options: larger parameters or other PQ families (code-based, etc.). NIST's plan is to diversify and standardize multiple algorithms to hedge its bets. That is why having SPHINCS+ (hash-based) matters – even if, in an extreme scenario, all the lattice-based and other structured schemes fell, SPHINCS+ would remain as a safety net (at the cost of large signatures).
The weaknesses that eliminated other candidates (Rainbow's break, SIKE's collapse) do not carry over to the NIST selections: the chosen schemes do not have the kind of exploitable algebraic structure those attacks targeted, and they have received far more scrutiny. NIST's selections were praised by experts as the “low-risk” choices, with Kyber and Dilithium in particular enjoying very high confidence. Of course, prudence dictates continued evaluation. Areas like quantum cryptanalysis (actually implementing and testing quantum algorithms on smaller instances) will keep progressing, providing further evidence that no unforeseen quantum attack exists. Over time, barring big surprises, these algorithms will become as trusted as RSA/ECC are today. The migration is already in motion, marking a historic shift in cryptography toward a post-quantum future.