The Shannon Hustle: How Vendors Abuse Perfect Secrecy to Sell You Less Than You Think

Marin Ivezic June 28, 2026

I have seen variations of this pitch at least a dozen times in the past two years. A vendor presents an encryption product. Somewhere between slides seven and twelve, the word “Shannon” appears. The claim takes different forms (“achieves Shannon Perfect Secrecy,” “extends Shannon’s theorem to reusable keys,” “provides information-theoretic security equivalent to the one-time pad”), but the structure is identical every time. The product uses keys far shorter than the messages it encrypts. Shannon’s 1949 theorem says that is impossible under his definition of perfect secrecy. One of these two things is wrong.

I have examined the mathematics behind every such claim I have encountered. In every case, the answer is the same: the vendor has either quietly changed Shannon’s definition, confused the size of the key space with the entropy Shannon’s theorem requires per encryption operation, or proved a security property in a restricted formal model and marketed it as if it holds in general deployment. The resulting product may or may not provide good computational security. But it does not provide what the word “Shannon” is meant to imply, and the gap between the implication and the reality is where the hustle lives.

This article lays out the mathematics of what Shannon proved, catalogs the five most common patterns of Shannon abuse I have identified in vendor marketing and the associated academic literature, and provides a five-question evaluation framework any procurement officer can apply before signing. The four entries in my Quantum Snake Oil Dictionary on information-theoretic security, perfect secrecy, unconditional security, and operational perfect secrecy define the individual terms. This article covers the pattern.

What Shannon Actually Proved

Claude Shannon’s 1949 paper “Communication Theory of Secrecy Systems,” published in the Bell System Technical Journal, is 60 pages long. The marketing claims that invoke his name typically depend on the reader not having read any of them. The relevant result fits in a few paragraphs, and understanding it requires nothing beyond probability at an undergraduate level.

Shannon defined a cipher as perfectly secret if observing the ciphertext gives the adversary exactly zero information about the plaintext. Formally, for every possible plaintext $$m$$ and every possible ciphertext $$c$$:

$$$Pr[M = m \mid C = c] = Pr[M = m]$$$

The probability that the message is $$m$$, given that you have observed ciphertext $$c$$, equals the probability that the message is $$m$$ before you observed anything. The ciphertext is statistically independent of the plaintext. Intercepting it is the same as not intercepting it.

In information-theoretic terms, this is equivalent to saying the conditional entropy of the message given the ciphertext equals the entropy of the message itself:

$$$H(M \mid C) = H(M)$$$

Or equivalently, the mutual information between message and ciphertext is zero:

$$$I(M; C) = 0$$$

These three formulations say the same thing in different notation. The adversary’s uncertainty about the message does not decrease by any amount, no matter what ciphertext they observe, no matter how much computation they perform, no matter how much time they have. The guarantee is unconditional. It holds against adversaries with infinite computing power, including quantum computers, including anything that might be invented in the future. That is what makes information-theoretic security worth invoking. And that is what makes the conditions for achieving it so severe.
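For readers who want the equivalence spelled out, here is a short sketch that uses only the standard textbook definitions of conditional entropy and mutual information (nothing specific to Shannon's paper is assumed). Mutual information is defined as the reduction in uncertainty:

$$$I(M; C) = H(M) - H(M \mid C)$$$

so $$I(M; C) = 0$$ holds exactly when $$H(M \mid C) = H(M)$$. Written out as a sum over message and ciphertext pairs:

$$$I(M; C) = \sum_{m, c} Pr[M = m, C = c] \, \log_2 \frac{Pr[M = m \mid C = c]}{Pr[M = m]}$$$

This quantity is never negative, and it equals zero exactly when $$M$$ and $$C$$ are independent, that is, when $$Pr[M = m \mid C = c] = Pr[M = m]$$ for every message $$m$$ and every ciphertext $$c$$ that occurs with positive probability. That is the first formulation.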

Shannon then proved the following theorem. Any cipher achieving perfect secrecy must satisfy:

$$$|K| \geq |M|$$$

where $$|K|$$ is the number of possible keys and $$|M|$$ is the number of possible messages. In entropy terms: $$H(K) \geq H(M)$$. The key must carry at least as much entropy as the message.

The proof is clean and worth walking through, because understanding why the bound holds is the best defense against vendors who claim to have beaten it.

Suppose, for contradiction, that $$|K| < |M|$$. Pick any ciphertext $$c$$ that occurs with positive probability. The set of messages that could have produced $$c$$ is contained in $$\{D_k(c) : k \in K\}$$, where $$D_k$$ is the decryption function under key $$k$$. This set has at most $$|K|$$ elements. Since $$|K| < |M|$$, there exists at least one message $$m^\ast$$ that is not in this set: no key maps $$m^\ast$$ to $$c$$. For this message, $$Pr[M = m^\ast \mid C = c] = 0$$. But if $$m^\ast$$ has any nonzero prior probability, then $$Pr[M = m^\ast] > 0 \neq Pr[M = m^\ast \mid C = c]$$. The ciphertext told the adversary something: the message is definitely not $$m^\ast$$. Perfect secrecy is violated.

A concrete example makes this tangible. Suppose messages come from an alphabet of four symbols: A, B, C, D. Suppose the key space has only two keys: $$k_1$$ and $$k_2$$. For any ciphertext the adversary intercepts, at most two of the four messages could have produced it. The adversary immediately eliminates at least two candidates. If the messages were equally likely before interception (each with probability 1/4) and both keys are equally likely, then in the best case for the defender two candidates remain, each with probability 1/2. The adversary’s uncertainty dropped from 2 bits to at most 1 bit. That bit of information leakage is the entire difference between Shannon’s definition and everything the hustle sells.
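The four-symbol example can be checked by brute force. The sketch below is illustrative only: the shift-by-key cipher, the integer key values 0 and 1, and the uniform choice of key are assumptions introduced here, not part of any real product. It computes the adversary's posterior exactly with Bayes' rule and measures the remaining uncertainty.

```python
from fractions import Fraction
from math import log2

MESSAGES = ["A", "B", "C", "D"]
KEYS = [0, 1]  # hypothetical two-key space, each key chosen with probability 1/2

def encrypt(m_index: int, k: int) -> int:
    """Toy cipher (assumed for illustration): shift the symbol index by the key, mod 4."""
    return (m_index + k) % len(MESSAGES)

def posterior(c: int) -> dict[str, Fraction]:
    """Pr[M = m | C = c] under a uniform prior over messages and keys."""
    prior = Fraction(1, len(MESSAGES))
    key_prob = Fraction(1, len(KEYS))
    joint = {
        m: sum(prior * key_prob for k in KEYS if encrypt(i, k) == c)
        for i, m in enumerate(MESSAGES)
    }
    total = sum(joint.values())
    return {m: p / total for m, p in joint.items()}

def entropy_bits(dist: dict[str, Fraction]) -> float:
    """Shannon entropy of a distribution, in bits."""
    return -sum(float(p) * log2(float(p)) for p in dist.values() if p > 0)

post = posterior(0)           # adversary intercepts ciphertext 0
remaining = entropy_bits(post)
print(post, remaining)
```

Two of the four messages end up with posterior probability zero, which is exactly the exclusion the proof relies on. Giving the toy cipher four equally likely keys instead of two restores a posterior equal to the prior.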

Shannon also proved that the one-time pad (OTP) achieves perfect secrecy when three conditions are met: the key is as long as the message, the key is generated by a truly random process, and each key is used exactly once. These conditions are not engineering preferences. They are mathematical consequences of the definition. The next section examines what happens when each one is relaxed.
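As a sanity check on the positive result, the following sketch verifies perfect secrecy exhaustively for a tiny one-time pad: 2-bit messages, 2-bit uniformly random keys, XOR encryption. The skewed prior is a made-up distribution chosen to show that the property does not depend on the messages being equally likely.

```python
from fractions import Fraction

N_BITS = 2
VALUES = range(2 ** N_BITS)

# Deliberately non-uniform prior over messages (hypothetical numbers).
PRIOR = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}
KEY_PROB = Fraction(1, len(VALUES))  # key is uniform and as long as the message

def posterior(c: int) -> dict[int, Fraction]:
    """Pr[M = m | C = c] for the XOR one-time pad, computed exactly."""
    joint = {
        m: sum(PRIOR[m] * KEY_PROB for k in VALUES if m ^ k == c)
        for m in VALUES
    }
    total = sum(joint.values())
    return {m: p / total for m, p in joint.items()}

# Perfect secrecy: the posterior equals the prior for every ciphertext.
perfect = all(posterior(c) == PRIOR for c in VALUES)
print(perfect)
```

The check passes because, for every message and ciphertext pair, exactly one key maps one to the other. Shrinking the key range or skewing the key distribution breaks the equality.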

The Three Conditions and What Relaxing Each One Costs

Condition 1: Key Length ≥ Message Length

This is the condition vendors most want to escape, because it makes practical deployment expensive. Distributing and managing keys as long as every message you will ever send is an operational burden most organizations cannot bear, which is exactly why the one-time pad, despite being the only provably unbreakable cipher, is used almost nowhere outside of a few intelligence applications.

The proof above shows why this condition cannot be negotiated away. If the key is shorter than the message, the decryption function cannot map each ciphertext back to every possible message, and the ciphertext necessarily excludes some plaintexts. The adversary gains information. How much information depends on the ratio between key length and message length, but the gain is nonzero for any ratio less than 1:1.

Vendors sometimes argue that their key material, while shorter than the message, generates a very large key space (millions or billions of possible keys) and that this combinatorial explosion provides security equivalent to a longer key. This confuses two distinct concepts. The size of the key space determines how hard it is to guess the key by brute force, which is a computational security property. The entropy of the key per encryption operation determines whether Shannon’s theorem applies. A cipher might select from a space of $$10^{506}$$ possible permutations (an enormous combinatorial object), but if the operational key material communicated between parties and consumed per encryption is 256 bits while the message is a megabyte, the scheme is a symmetric cipher with a 256-bit key. It may be a perfectly good symmetric cipher. But invoking Shannon to describe it is like invoking Pythagoras to describe a curve.

Condition 2: True Randomness

Shannon’s proof assumes the key is drawn from a distribution with full entropy. In practice, many schemes generate a short random seed and then expand it using a pseudorandom number generator (PRNG) or a key derivation function to produce a keystream as long as the message. This is standard cipher engineering: it is how AES in counter mode works, how ChaCha20 works, and how most stream ciphers operate.

The expansion does not change the entropy. If the seed is 256 bits, the expanded keystream carries at most 256 bits of entropy regardless of its length. An adversary with unlimited computing power can enumerate all $$2^{256}$$ seeds, expand each one, and determine which one matches the observed ciphertext. The scheme’s security rests entirely on the assumption that this enumeration is computationally infeasible. That is computational security, and it can be excellent: 256-bit security against brute force is beyond anything currently attackable, classically or quantumly (Grover’s algorithm halves the effective key length, giving 128-bit quantum security, which is still insurmountable).
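The enumeration argument can be demonstrated at toy scale. In the sketch below the seed is only 16 bits so the search finishes instantly. The SHA-256 counter construction, the sample plaintext, and the adversary's knowledge of the first few plaintext bytes are all assumptions made for illustration and are not any vendor's design. The point is that the work grows with the seed length, not with the keystream length.

```python
import hashlib

SEED_BITS = 16  # toy size; a real scheme would use 128 or 256 bits

def expand(seed: int, length: int) -> bytes:
    """Stretch a short seed into a keystream by hashing seed || counter (illustrative only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = seed.to_bytes(2, "big") + counter.to_bytes(4, "big")
        out += hashlib.sha256(block).digest()
        counter += 1
    return bytes(out[:length])

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def brute_force(ciphertext: bytes, known_prefix: bytes) -> list[int]:
    """Try every seed; keep those whose keystream decrypts to the known prefix."""
    n = len(known_prefix)
    return [
        s for s in range(2 ** SEED_BITS)
        if xor(ciphertext[:n], expand(s, n)) == known_prefix
    ]

secret_seed = 0xBEEF
plaintext = b"WIRE TRANSFER: 1,000,000 EUR to account 12345678"
ciphertext = xor(plaintext, expand(secret_seed, len(plaintext)))

candidates = brute_force(ciphertext, b"WIRE TRANSFER")
print(candidates)
```

Making the plaintext a thousand times longer would not make this search any harder. Only lengthening the seed does, and that is a statement about computation, not about information.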

But it is not information-theoretic security. The moment a PRNG enters the picture, the scheme’s security guarantee is bounded by the hardness of the PRNG, not by Shannon’s theorem. Calling the result “Shannon Perfect Secrecy” is technically incorrect regardless of how strong the PRNG is.

Condition 3: Single Use

Each key must be used exactly once. Reusing a one-time pad key against two messages $$m_1$$ and $$m_2$$ produces ciphertexts $$c_1 = m_1 \oplus k$$ and $$c_2 = m_2 \oplus k$$. XORing the ciphertexts yields $$c_1 \oplus c_2 = m_1 \oplus m_2$$. The key cancels. The adversary now has the XOR of the two plaintexts, which, for natural-language messages, is enough to recover both using standard crib-dragging techniques. This attack is elementary, well-documented, and has been successfully exploited against real systems (the Venona project decrypted Soviet diplomatic traffic encrypted with reused one-time pads).
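The cancellation is easy to reproduce. The sketch below uses two invented messages and a fresh random pad. It shows that XORing the two ciphertexts equals XORing the two plaintexts, and that guessing a fragment of one message (a crib) immediately reveals the matching fragment of the other.

```python
import secrets

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Two hypothetical plaintexts of equal length, encrypted under the SAME pad.
m1 = b"ATTACK AT DAWN ON THE EAST GATE"
m2 = b"RETREAT TO THE RIVER BY SUNSET!"
key = secrets.token_bytes(len(m1))

c1 = xor(m1, key)
c2 = xor(m2, key)

# The adversary sees only c1 and c2; the key drops out of their XOR.
leak = xor(c1, c2)

# Crib-dragging: a correct guess at the start of m1 exposes the start of m2.
crib = b"ATTACK AT DAWN"
recovered = xor(leak[:len(crib)], crib)
print(recovered)
```

In practice the adversary does not know where a crib belongs and slides common words along `leak`, keeping positions where the output looks like readable text.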

Any scheme that reuses key material across encryptions must derive its security from something other than Shannon’s theorem. That something is typically a computational hardness assumption, the same foundation that underpins every standard symmetric and asymmetric cipher in use today.

The combined takeaway: relaxing any one of Shannon’s three conditions converts the security guarantee from information-theoretic to computational. This is not a minor downgrade in academic taxonomy. It means the scheme’s security depends on assumptions about what an adversary cannot compute, rather than on what an adversary cannot know. The assumptions might be strong. The scheme might be good. But it occupies a different category than the one Shannon’s name implies.

A Taxonomy of Shannon Abuse

Over the past several years, I have encountered five distinct patterns in which the academic literature and vendor marketing invoke Shannon’s name for schemes that do not meet Shannon’s conditions. Each pattern employs a different mechanism to bridge the gap between the claim and the math.

Pattern 1: The Qualifier Insertion

The most common pattern is to add a qualifying word before “perfect secrecy” and hope the reader does not notice that the qualifier changes the definition. “Operational perfect secrecy.” “Extended Shannon secrecy.” “Practical perfect secrecy.” “Quantum perfect secrecy.” Each of these compounds borrows Shannon’s authority while weakening his guarantee.

“Operational perfect secrecy” is the cleanest example. The term does not appear in Shannon’s paper or in any subsequent textbook on information theory that I have been able to find. It has been coined by vendors or vendor-affiliated researchers to describe schemes that achieve something like perfect secrecy under certain operational conditions: typically conditions that restrict the adversary to polynomial-time computation, or that assume the adversary cannot observe certain side channels, or that bound the number of messages encrypted under one key.

Any of those restrictions may be reasonable engineering assumptions. But they convert the guarantee from information-theoretic to computational or conditional. “Operational perfect secrecy” is to “perfect secrecy” what “operational profit” is to “profit”: a narrower claim that sounds like the broader one. I covered this in my Quantum Snake Oil Dictionary entry on the term.

Pattern 2: The Entropy Confusion

This is the most technically interesting pattern and the one most likely to confuse a sophisticated reader. It works by conflating the entropy of the key selection space with the entropy Shannon’s theorem requires per encryption operation.

Consider a scheme that selects a “pad” from a large combinatorial group. A published example: selecting 16 permutation matrices over the 256 possible values of an 8-bit byte. The number of such permutations is $$256!$$, a figure on the order of $$10^{506}$$, and selecting 16 of them independently draws from a space carrying roughly 27,000 bits of equivalent entropy ($$16 \times \log_2(256!) \approx 26{,}944$$ bits). Papers describing such schemes sometimes state this entropy figure prominently, then claim that the scheme “holds” this many bits of “Shannon entropy.”

The entropy of the selection space is a real and meaningful quantity. But it answers a different question than the one Shannon’s theorem asks. Shannon requires that the key entropy per encryption operation be at least as great as the message entropy. If the selected pad is applied to a 1-megabyte message, the question is whether the pad carries 8 million bits of independent entropy applied to those 8 million bits of message, not whether the pad was chosen from a space with 27,000 bits of combinatorial entropy. The pad might provide excellent computational security (an adversary trying to guess which permutation matrices were selected faces an enormous search space), but the information-theoretic guarantee is bounded by the operational key entropy, not the selection entropy.
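The two quantities can be put side by side with a few lines of arithmetic. This sketch recomputes the selection entropy quoted above and compares it with the bit length of a 1-megabyte message, taking a megabyte as $$10^6$$ bytes to match the 8 million bits in the text. The function names are introduced here for illustration.

```python
from math import factorial, log2

def selection_entropy_bits(num_permutations: int = 16, alphabet: int = 256) -> float:
    """Entropy of choosing `num_permutations` independent uniform permutations of the alphabet."""
    return num_permutations * log2(factorial(alphabet))

def message_bits(num_bytes: int) -> int:
    return 8 * num_bytes

sel = selection_entropy_bits()    # roughly 26,944 bits
msg = message_bits(1_000_000)     # 8,000,000 bits
coverage = sel / msg              # fraction of the message the key entropy could cover

print(f"selection entropy: {sel:,.0f} bits")
print(f"message length:    {msg:,} bits")
print(f"coverage:          {coverage:.2%}")
```

Even if every bit of selection entropy counted as operational key entropy, it would cover well under one percent of the message. The rest of the protection has to come from a computational assumption.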

The confusion is often genuine rather than deliberate. Researchers working in combinatorics or quantum information theory may use “Shannon entropy” in its information-theoretic sense (the entropy of a random variable) and apply it correctly to the random variable representing the key selection. The marketing team then interprets “Shannon entropy” as a reference to Shannon’s secrecy theorem, and the claim migrates from a correct statement about combinatorial complexity to an incorrect statement about information-theoretic security.

Pattern 3: The Domain Switch

A third pattern proves a legitimate security property in one formal model and then markets the result as if it holds in the model where customers actually deploy encryption. The most prominent examples involve quantum ciphertext.

Several published papers prove that if the ciphertext is transmitted as a quantum state rather than a classical bitstring, certain security properties become achievable with shorter keys. This is correct. Shannon’s theorem is a theorem about classical communication: it assumes the adversary receives a classical ciphertext and can store and analyze it indefinitely. Quantum ciphertexts behave differently: the no-cloning theorem prevents the adversary from copying them, measurement disturbs them, and the adversary must choose what to measure without the option of trying again. These physical constraints change the information-theoretic calculus.

One recent paper on arXiv (2408.09088) explicitly claims to “overcome Shannon’s theorem to achieve perfect secrecy with reusable keys.” Read carefully, the claim is technically scoped: the scheme requires a quantum communication channel, the ciphertext is a quantum state, and the security holds under quantum-mechanical assumptions about what the adversary can do with that state. These are legitimate results in quantum information theory.

The marketing problem arises when the scheme is presented to enterprise customers who use TCP/IP, TLS, standard networking equipment, and classical storage. Every real-world deployment I have seen described in vendor literature transmits ciphertexts as classical bits over classical networks. In that setting, Shannon’s classical theorem applies in full force, and the quantum ciphertext results do not rescue the claim. The security property proved in the quantum model does not transfer to the classical deployment model. The vendor is citing a theorem that applies in a world different from the one the customer inhabits.

Pattern 4: The “Overcomes Shannon” Claim

Some papers and vendor materials go further than the domain switch and directly claim to have “overcome,” “extended,” or “gone beyond” Shannon’s theorem. I flag this as a distinct pattern because the framing implies that Shannon was wrong, or that his result has been superseded, rather than that the scheme operates outside his theorem’s scope.

Shannon’s theorem has not been overturned. It remains true in the classical model under the original definition, and it has been true for 77 years without a single counterexample. Results that achieve shorter keys do so by changing the model (quantum ciphertext), changing the definition of security (entropic security, discussed below), or changing the adversary’s capabilities (computational bounds). Each of these is a valid research direction. None of them “overcomes” Shannon, any more than building an airplane “overcomes” gravity. The airplane operates by engaging with different physical principles (lift, thrust); gravity is still there. Shannon’s theorem is still there. The vendor who tells you they have overcome it is either confused about the scope of their own result or hoping you will be.

Pattern 5: The Entropic Security Bait-and-Switch

Entropic security, introduced by Russell and Wang (2002) and developed by Dodis and Smith (2005), is a legitimate and well-studied relaxation of Shannon’s definition. The core idea: if the plaintext distribution has sufficiently high min-entropy from the adversary’s perspective, then an encryption scheme with keys shorter than the message can still provide a meaningful information-theoretic security guarantee — not perfect secrecy, but a guarantee that the ciphertext does not help the adversary predict any function of the plaintext.

This is a real result, published in top venues, with rigorous proofs. It is also narrower than it sounds. The key requirement is the min-entropy assumption: the adversary must already be highly uncertain about the plaintext before seeing the ciphertext. Specifically, the key length must satisfy $$\ell_k \geq n - t + 2\log(1/\epsilon)$$ where $$n$$ is the message length in bits, $$t$$ is the min-entropy of the plaintext from the adversary’s viewpoint, and $$\epsilon$$ is the security parameter. If the plaintext is structured data (database records, financial transactions, natural-language text, source code, configuration files, most of what enterprises actually encrypt), the min-entropy is low, $$t$$ is small, and the key length required approaches the message length. The saving is real only when the plaintext is already nearly random.
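Plugging numbers into the bound shows how quickly the saving disappears. The sketch below evaluates the key-length inequality, assuming the logarithm is base 2 so that everything is measured in bits. The 8,192-bit message, the min-entropy values, and $$\epsilon = 2^{-64}$$ are illustrative choices, not figures from any paper or product.

```python
from math import ceil, log2

def entropic_key_bits(n: int, t: float, eps: float) -> int:
    """Smallest integer key length satisfying l_k >= n - t + 2*log2(1/eps)."""
    return ceil(n - t + 2 * log2(1 / eps))

N = 8192           # message length in bits (hypothetical 1 KiB record)
EPS = 2.0 ** -64   # illustrative security parameter

nearly_random = entropic_key_bits(N, t=8192, eps=EPS)  # plaintext is a random key or nonce
structured = entropic_key_bits(N, t=800, eps=EPS)      # plaintext is mostly predictable

print(nearly_random, structured)
```

For the fully random plaintext the bound asks for a 128-bit key. For the structured plaintext it asks for 7,520 bits, more than 90 percent of the message length, at which point the scheme is close to a one-time pad in cost without being one in guarantee.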

The bait-and-switch works by marketing entropic security without the min-entropy caveat. A vendor describes their scheme as “information-theoretically secure with short keys” and cites the academic literature correctly. The missing sentence is: “…provided the plaintext has min-entropy close to its bit length.” That sentence eliminates most enterprise use cases. Encrypted database fields, log entries, API payloads, email bodies, and document files all have extensive structure that reduces their min-entropy far below their bit length. For these data types, the entropic security guarantee weakens proportionally, and the scheme falls back to relying on whatever computational hardness assumption underlies its key expansion or permutation mechanism.

Entropic security is a useful concept: for encrypting random keys, nonces, or high-entropy seeds, it can provide real information-theoretic guarantees with practical key sizes. For encrypting structured data, which is the majority of enterprise encryption, the caveat is the entire story, and omitting it from the sales pitch is the entire hustle.

The Shannon Vendor Evaluation Framework

When a vendor claims Shannon-level, information-theoretic, or unconditionally secure encryption, the following five questions will determine whether the claim holds. They are ordered as a decision tree: a “no” to any of the first three means the scheme is computationally secure (which can still be strong) and the vendor’s use of “Shannon” or “information-theoretic” is incorrect; the last two test how well the remaining claim is supported.

Question 1: The Length Test

Is the key material consumed per encryption operation at least as long as the plaintext?

Ask for the key size in bits and the maximum plaintext size in bits for a single encryption operation. If the plaintext can be longer than the key (or if the scheme encrypts multiple messages under the same key material), Shannon’s theorem prohibits perfect secrecy.

Do not accept the size of the key selection space or the combinatorial entropy of the key generation process as a substitute for the per-operation key length. The question is about the operational key, not the space it was drawn from.

A “no” here means the scheme provides computational security. Proceed to evaluate it on those terms: key length, algorithm design, implementation quality, side-channel resistance, as you would any symmetric cipher.

Question 2: The Randomness Test

Is the key generated by a source with full entropy, or is it expanded from a shorter seed?

If the key is expanded from a shorter seed using a PRNG, stream cipher, key derivation function, or any deterministic process, the scheme’s entropy is bounded by the seed length. The expansion introduces a computational assumption (that the expansion function is indistinguishable from random to a bounded adversary). This is fine engineering (it is how AES-CTR and ChaCha20 work), but it is computational security, not information-theoretic security.

Ask specifically: “If I trace the key material back to its source of randomness, how many truly random bits does the scheme consume per encryption operation?” If the answer is less than the plaintext length, the scheme does not achieve perfect secrecy.

Question 3: The Reuse Test

Is each key used for exactly one encryption and then destroyed?

If key material is reused across multiple messages, the scheme is not a one-time pad and cannot claim perfect secrecy. A “reusable” key is a standard symmetric key, and the scheme should be evaluated as a standard symmetric cipher.

Some schemes describe themselves as using “reusable keys” while claiming perfect secrecy. This is a contradiction under Shannon’s definition. If the vendor cites a result that achieves perfect secrecy with reusable keys, check whether the result requires quantum ciphertext (Pattern 3) or a restricted adversary model, and whether the deployment uses classical networks and faces unrestricted adversaries.

Question 4: The Theorem Test

If the scheme does not meet Shannon’s three conditions, which specific published theorem provides the security guarantee, in which formal model, and with what assumptions about the adversary and the plaintext distribution?

This is the question that separates honest innovation from name-dropping. Legitimate schemes operating outside Shannon’s conditions have formal security theorems with stated assumptions. Entropic security results require a min-entropy bound on the plaintext. Quantum ciphertext results require a quantum communication channel. Computational security results require a hardness assumption (e.g., the security of AES, the hardness of LWE).

Ask the vendor to identify the specific theorem, the specific paper, and the specific assumptions. If the answer is “our internal analysis shows…” or “the combinatorial complexity of our key space provides…”, the scheme has not been proven secure by anyone outside the organization that sells it. Schneier’s rule applies: anyone can invent a cipher they themselves cannot break. That tells you something about the inventor, not about the cipher.

Question 5: The Cryptanalysis Test

Who outside the vendor’s organization has attempted to break this scheme, and where are the results published?

This is not a Shannon-specific test; it is the standard due-diligence test for any cryptographic product, and I have written about it extensively in my coverage of proprietary PQC algorithms and in Chapter 4 of Quantum Ready. But it is especially important when a scheme claims to exceed well-known theoretical bounds, because the claim’s extraordinary nature demands extraordinary evidence.

The baseline for comparison is what NIST’s standardized algorithms endured: five to eight years of public cryptanalysis by the world’s leading cryptanalysts, with roughly a third of initial candidates broken and several finalists eliminated in late rounds. ML-KEM (formerly CRYSTALS-Kyber), ML-DSA (formerly CRYSTALS-Dilithium), and SLH-DSA (formerly SPHINCS+) survived this process. That survival is the security evidence. Publication in a journal, even a good one, is not the same as surviving years of adversarial cryptanalytic attention. Journal reviewers check that the proofs are correct; they do not spend weeks trying to find attacks.

If the scheme has not been submitted to a standardization process, has not been published in an IACR-affiliated venue, and has not been the subject of independent cryptanalytic papers, the question is not “is it secure?” but “has anyone looked?”

When You Hear the Deflections

Vendors whose claims do not survive these five questions deploy a predictable set of responses, which I have cataloged in the Quantum Snake Oil Dictionary’s companion guide on vendor deflection tactics. The short version:

“The cryptographic establishment is hostile to new ideas”: the NIST process accepted submissions from anyone on Earth. Publication is open. The establishment rewards breaks, not orthodoxy.

“Our approach is too new for NIST”: NIST’s post-quantum process ran from 2017 to 2024. Its additional digital signature call in 2023 accepted new submissions. The Y00 quantum stream cipher literature goes back to the early 2000s. Permutation-based symmetric encryption predates most of these vendors. Newness is not the reason these schemes were not submitted.

“We have patents”: patents attest to novelty, not to security. Many broken ciphers were patented.

“A major bank / government / defense agency is already a customer”: unverifiable, and other buyers’ due-diligence failures are not your due diligence.

What Honest Short-Key Security Looks Like

The entire edifice of modern cryptography operates with short keys. AES-256 encrypts arbitrary-length messages with a 256-bit key. ML-KEM-768 establishes a shared secret with a combined public key and ciphertext of a few kilobytes. None of these schemes claims information-theoretic security. All of them claim computational security: breaking the scheme requires solving a mathematical problem believed to be intractable for any adversary operating within the laws of physics as currently understood.

This is honest, and it is strong. AES-256 has withstood over 25 years of public cryptanalysis by every significant research group in the field. The lattice problems underlying ML-KEM have been studied since the 1990s. These schemes do not invoke Shannon because they do not need to. Their security evidence is empirical (decades of failed attacks) and structural (reduction proofs relating the scheme’s security to well-studied hard problems).

The irony of the Shannon hustle is that schemes claiming the stronger guarantee (information-theoretic security) typically rest on weaker evidence (limited or no independent cryptanalysis, no standardization, no years of adversarial scrutiny). A CISO evaluating two products — one claiming computational security backed by NIST standardization and decades of cryptanalysis, the other claiming Shannon-level security backed by the vendor’s own papers — should find the choice obvious. The stronger claim with weaker evidence is the riskier bet.

For organizations that genuinely need information-theoretic security and can bear the operational cost, the answer is the one-time pad with proper key management, or QKD for key distribution (with all its deployment constraints and trust boundaries that I have covered at length). These solutions are expensive, operationally demanding, and limited in applicability. That is not a market failure. It is a reflection of the mathematical reality Shannon established in 1949: perfect secrecy costs a key as long as the message, truly random, used once. There is no shortcut. Anyone selling you one is selling you something other than what the name implies.

The Five-Question Summary

For reference, the complete decision framework:

Length Test: Key consumed per operation ≥ plaintext? If not → computational security.
Randomness Test: Key from full-entropy source, not expanded from a short seed? If not → computational security.
Reuse Test: Each key used exactly once and destroyed? If not → computational security.
Theorem Test: Which published theorem, in which model, with which assumptions, provides the guarantee? If “internal analysis” → unverified.
Cryptanalysis Test: Who outside the vendor tried to break it, and where are the results? If nobody → untested.

A scheme that answers “no” to any of the first three questions provides computational security. That is not a condemnation; it is a classification. Evaluate the scheme on its computational merits: key length, algorithm design, implementation quality, published cryptanalysis, standardization status. A name borrowed from a theorem is not evidence that the theorem applies.
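For teams that keep procurement checklists in code, the framework can be written down as a small function. This is only a sketch: the field names, the return shape, and the example answers are hypothetical, and the hard part remains getting honest answers from the vendor.

```python
from dataclasses import dataclass

@dataclass
class VendorAnswers:
    key_bits_per_operation: int         # Question 1
    max_plaintext_bits: int             # Question 1
    key_from_full_entropy_source: bool  # Question 2 (False if expanded from a seed)
    key_used_once_then_destroyed: bool  # Question 3
    published_theorem_cited: bool       # Question 4 (False if "internal analysis")
    independent_cryptanalysis: bool     # Question 5

def classify(a: VendorAnswers) -> tuple[str, list[str], list[str]]:
    """Return (category, failed Shannon conditions, evidence flags)."""
    failed = []
    if a.key_bits_per_operation < a.max_plaintext_bits:
        failed.append("length")
    if not a.key_from_full_entropy_source:
        failed.append("randomness")
    if not a.key_used_once_then_destroyed:
        failed.append("reuse")

    flags = []
    if not a.published_theorem_cited:
        flags.append("unverified")
    if not a.independent_cryptanalysis:
        flags.append("untested")

    category = "computational" if failed else "meets Shannon's three conditions"
    return category, failed, flags

# Hypothetical pitch: 256-bit reusable key, 1 MB messages, vendor-only analysis.
pitch = VendorAnswers(256, 8_000_000, False, False, False, False)
result = classify(pitch)
print(result)
```

A result of "computational" with no evidence flags describes a conventional, well-vetted cipher. The combination to watch for is "computational" alongside "unverified" and "untested" under a Shannon label.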