
Data Discovery & Classification: Foundations for Quantum Readiness

Introduction

Every CISO knows the old adage: “You can’t protect what you don’t know you have.” In the quantum readiness era, this doesn’t just apply to hardware and software assets – it applies equally to data. As organizations brace for cryptography-breaking quantum computers, they must first discover and classify their data to understand what’s at stake. Data is now scattered across on-prem servers, cloud buckets, employee devices, SaaS apps, and more. This sprawl obscures the overall data picture, making it difficult for security leaders to identify and prioritize threats. In fact, a recent survey found that 73% of CISOs struggle to manage data scattered across platforms and cloud environments, leading to blind spots in visibility, control, and compliance. Simply put, “you cannot secure what you cannot see”.

The Quantum Threat Puts Data in the Crosshairs

The urgency around post-quantum cryptography (PQC) is clear: data encrypted today could be compromised tomorrow once sufficiently powerful quantum computers emerge. Adversaries are already executing so-called “harvest now, decrypt later” attacks – intercepting and storing encrypted data today in hopes of decrypting it in the future when quantum capabilities mature. This means the risk to long-lived sensitive data is immediate. Even if a quantum breakthrough is years away, any confidential information with a multi-year lifespan (medical records, trade secrets, state secrets, etc.) is at risk now if protected by vulnerable algorithms. In other words, short-term encryption can lead to long-term exposure.

Not all data carries equal risk in this scenario. Some information is essentially ephemeral – its sensitivity decays quickly over time. Other data needs to remain confidential for decades. Quantum threats force security teams to ask: Which data, if decrypted in 5-10 years, would be catastrophic? Prioritizing those datasets for quantum-safe protection is key. Effective data classification provides the roadmap for this prioritization.

The Data Discovery Challenge in Modern Environments

Before an organization can classify and prioritize its crown-jewel data, it faces a fundamental hurdle: finding where all that data resides. Modern enterprises generate and hoard data at an astounding scale. Customer PII might lurk in a legacy database; intellectual property documents might sprawl across employee SharePoint sites; sensitive emails traverse multiple cloud services. Data discovery – the process of mapping out what data you have and where it lives – has become a monumental challenge.

Several factors make data discovery difficult today:

  • Data Sprawl Across Environments: Organizations now operate in hybrid IT environments (on-premises, multiple clouds, SaaS, mobile devices). Sensitive data is dispersed across multiple locations, from core data centers to employees’ personal cloud drives. This dispersion “obscures the overall data picture” for CISOs.
  • Unstructured and Shadow Data: Beyond well-structured databases, companies have vast stores of unstructured data (documents, images, emails) that may contain sensitive information but aren’t neatly indexed. “Dark data” and shadow IT systems (set up outside official IT purview) often contain sensitive records unknown to central security teams.
  • Dynamic Data Generation: New data is created and copied constantly – through customer transactions, IoT sensors, application logs, backups, etc. Keeping an up-to-date inventory is a moving target as data flows and grows continuously.
  • Access Silos: Different business units or cloud platforms might each have their own data repositories and tooling. Security teams struggle to gain clear visibility into all silos at once, hampering their ability to enforce uniform controls. Without this visibility, “enforcing access controls and security policies becomes challenging, increasing the risk of unauthorized access and data breaches”.
  • Regulatory Pressure: Regulations like GDPR, HIPAA, and others require knowing where personal or sensitive data is stored and who accesses it. Yet data proliferation across environments makes it difficult to locate, classify, and manage sensitive information effectively, putting compliance at risk.

In short, discovering sensitive data is as foundational to security as asset inventory. Just as you cannot patch an untracked server, you cannot encrypt or protect data you haven’t identified. Modern data discovery tools are rising to the challenge by using automation, AI, and machine learning to scan environments and pinpoint sensitive information. For example, new Data Security Posture Management (DSPM) solutions can automatically crawl through structured and unstructured stores to identify and tag sensitive data across diverse sources (on-prem and cloud). These tools leverage pattern matching and AI (for context) to find things like PII, financial records, or secret keys lurking in large data sets. The goal is to give CISOs a holistic view of their data landscape – answering “where is my sensitive data and who has it?” – so they can focus protection efforts on what matters most.

Data Classification: Knowing What Really Matters

Finding the data is step one. Step two is understanding its importance. This is where data classification comes in. Data classification means organizing data into categories (like Public, Internal, Confidential, Highly Sensitive, etc.) based on its sensitivity, value, or regulatory requirements. The aim is to align security efforts with the true risk level of the data. In the context of quantum readiness, classification lets us identify which data would have the biggest impact if decrypted by an adversary in the future.

Leading security frameworks emphasize data classification. For instance, CIS Critical Security Control 3: Data Protection calls for developing processes to identify, classify, securely handle, retain, and dispose of data. Organizations often align data classification schemes with standards like NIST or ISO 27001 and compliance mandates (GDPR, HIPAA, etc.). By using well-defined tiers of sensitivity, you ensure consistency and can demonstrate due diligence to auditors.

A simple but effective classification model might define three tiers of data sensitivity (a minimal sketch of such a scheme follows the list below):

  • Tier 1 – Highly Sensitive: This includes your most critical data – personally identifiable information (PII), financial records, health data, trade secrets and intellectual property (IP), classified government info, legal documents, etc. Compromise of Tier 1 data would be highly damaging. These assets demand the strongest protections and should be top priority for quantum-safe encryption.
  • Tier 2 – Moderately Sensitive: Important but not mission-critical data – for example, internal emails, project documents, operational records, customer communications. Unauthorized exposure might be harmful but likely has lower immediate impact or shorter lifespan. These should be protected according to best practices, but they’re not the first in line for post-quantum upgrades.
  • Tier 3 – Low Sensitivity: Public or widely available information, or non-sensitive internal data. This might include public web content, marketing materials, etc. Such data may not need post-quantum protection in the near term (if it’s public, encryption is often moot).
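
To make the tiering actionable in tooling, here is a minimal sketch of how such a scheme might be encoded so that downstream systems (DLP, encryption policy, PQC migration tracking) can consume it. The tier names, control mappings, and priorities are illustrative assumptions, not a standard taxonomy.

```python
from enum import Enum

class SensitivityTier(Enum):
    """Illustrative three-tier scheme; adapt the labels to your own policy."""
    HIGHLY_SENSITIVE = 1      # PII, financials, health data, trade secrets/IP
    MODERATELY_SENSITIVE = 2  # internal emails, project documents
    LOW_SENSITIVITY = 3       # public or non-sensitive content

# Hypothetical mapping of tiers to handling requirements and PQC priority.
TIER_POLICY = {
    SensitivityTier.HIGHLY_SENSITIVE: {
        "encryption_required": True,
        "pqc_migration_priority": "high",
        "access": "need-to-know, monitored",
    },
    SensitivityTier.MODERATELY_SENSITIVE: {
        "encryption_required": True,
        "pqc_migration_priority": "medium",
        "access": "role-based",
    },
    SensitivityTier.LOW_SENSITIVITY: {
        "encryption_required": False,
        "pqc_migration_priority": "low",
        "access": "broad",
    },
}
```

Keeping the policy in one structure like this makes it easier to audit and to update as your classification scheme evolves.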

This kind of categorization helps focus your security investments. The goal is to align your security investments with the value, lifespan, and vulnerability of the data. In practice, that means Tier 1 data gets priority for the most robust controls (encryption, monitoring, access restrictions), Tier 2 gets standard controls, and Tier 3 might get minimal controls.

Consider Data Lifespan and Long-Term Confidentiality

An often overlooked dimension of classification is data lifespan. For quantum risk management, it’s not just how sensitive data is now – it’s how long it needs to stay confidential. We must ask: Will this information still be sensitive and potentially exploitable 5, 10, 20 years from now? This is crucial due to the quantum threat timeline. If data will remain sensitive over a long period, then even a far-off quantum computer could jeopardize it (hence the harvest-now-decrypt-later risk).

When classifying data, organizations should tag long-term vs. short-term sensitivity:

  • Long-Lived Data (high retention): Records that must remain secure for many years – e.g. national security files, legal records, compliance archives, medical records stored for decades. These should be prioritized for quantum-safe protection because they are prime targets for an adversary to steal now and decrypt later. As NIST’s guidance suggests, any data with a confidentiality requirement beyond the expected arrival of quantum decryption capabilities should be considered at immediate risk.
  • Short-Lived Data: Information that is only sensitive briefly – e.g. one-time session keys, temporary transaction logs, ephemeral messages. If such data loses its sensitivity within days or weeks, the quantum risk is lower (though not zero – an attacker might still use it if they can break it quickly). Generally, short-lived data is lower priority for PQC migration, unless it’s connected to more sensitive systems or could serve as a stepping stone for attackers.

This concept is encapsulated by Mosca’s rule (or Mosca’s inequality) in PQC planning: if X (the number of years data must remain secure) plus Y (the years it will take to fully transition to quantum-safe cryptography) exceeds Z (the years before quantum attackers emerge), then that data is already at risk and needs protection immediately. For example, if you have data that must stay secret for 10+ years, you expect it will take roughly 5 years to roll out PQC, and some forecasts put a cryptographically relevant quantum computer about 10 years out, you are effectively out of time – you should start protecting that data right away. This is why experts warn that every extra year of procrastination is another year adversaries can siphon off encrypted sensitive data for future decryption.
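
To make the arithmetic concrete, here is a minimal sketch of Mosca’s inequality as a check you could run against entries in your data inventory. The example figures simply restate the 10/5/10-year scenario above; they are illustrative assumptions, not predictions.

```python
def mosca_at_risk(shelf_life_years: float,
                  migration_years: float,
                  years_to_quantum: float) -> bool:
    """Mosca's inequality: data is already at risk if X + Y > Z, where
    X = how long the data must stay confidential,
    Y = how long the PQC migration will take, and
    Z = estimated years until a cryptographically relevant quantum computer."""
    return shelf_life_years + migration_years > years_to_quantum

# The scenario from the text: ~10-year secrecy requirement, ~5-year
# migration, ~10 years to quantum capability (all illustrative estimates).
print(mosca_at_risk(shelf_life_years=10, migration_years=5, years_to_quantum=10))
# True -> 10 + 5 > 10, so this data needs quantum-safe protection now.
```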

Prioritizing Cryptography Mitigation Using Data Classification

By discovering and classifying your data, you build a risk-based map that directly informs which cryptographic protections to upgrade first. Instead of blindly trying to replace every algorithm everywhere (an impossible “big bang” approach), you can take a phased PQC migration focusing on the highest-impact areas. Not all cryptographic dependencies need urgent updates – but knowing what matters most is critical. So how do we connect data to cryptography mitigation? Consider the following approach:

Map Sensitive Data to Its Protecting Cryptography

For each data category (especially Tier 1 data), identify how that data is protected today. Where is it stored, and is it encrypted at rest (e.g. in a database or file system)? How does it travel, and is it protected in transit (e.g. via TLS or a VPN)? What encryption algorithms, keys, and certificates are involved in those processes?

This step often requires a cryptographic inventory in parallel to data discovery – finding all instances of encryption, digital signatures, and keys in the environment. A NIST project on PQC migration emphasizes using cryptographic discovery tools to learn where and how cryptography is being used to protect the confidentiality and integrity of your organization’s important data.

In practice, organizations must enumerate all sensitive data assets and the crypto that safeguards them, identifying “which protocols are used, where keys are deployed, and what kind of data they protect.” This linkage is crucial: it’s no use deploying a quantum-safe algorithm if it’s not actually guarding critical data, and conversely, critical data left under legacy encryption represents a glaring risk.
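
As a concrete illustration of that linkage, below is a minimal sketch of what a combined data-and-cryptography inventory record might look like. The fields and example values are assumptions about what such an inventory could capture; they are not the output format of any particular discovery tool.

```python
from dataclasses import dataclass

@dataclass
class ProtectedDataAsset:
    """One row of a combined data + cryptography inventory (illustrative)."""
    name: str                  # e.g. "customer_db"
    sensitivity_tier: int      # 1 = highly sensitive ... 3 = low sensitivity
    retention_years: int       # how long confidentiality must hold
    at_rest_algorithm: str     # e.g. "AES-256-GCM" or "none"
    in_transit_protocol: str   # e.g. "TLS 1.2 (RSA key exchange)"
    key_management: str        # e.g. "HSM-backed KMS" or "hard-coded key"

inventory = [
    ProtectedDataAsset("customer_db", 1, 10, "AES-256-GCM",
                       "TLS 1.2 (RSA key exchange)", "HSM-backed KMS"),
    ProtectedDataAsset("marketing_site_assets", 3, 1, "none",
                       "TLS 1.3 (ECDHE)", "cloud-managed certificates"),
]
```

Even a spreadsheet with these columns is enough to start connecting critical data to the algorithms and keys that guard it.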

Assess Quantum Vulnerability of Current Crypto

Once you know, for example, that a certain database holds highly sensitive Tier 1 data and is protected by RSA-based encryption, you can assess that as a high-risk cryptographic exposure (because RSA will be broken by quantum attacks). Another database might hold only Tier 3 public data – its use of RSA is far less urgent to fix.

This assessment should account for both the algorithm’s vulnerability and the data’s sensitivity/retention. (If that database with public data uses RSA, it might technically be vulnerable to quantum cryptanalysis, but the impact of compromise is negligible, so it’s a low priority.)

On the other hand, any system using quantum-vulnerable crypto to protect long-term confidential data is a critical concern. For example, if your corporate VPN or TLS gateways are using classical RSA/ECDH to protect all network traffic (including sensitive data flows), those are prime candidates for early adoption of PQC or hybrid encryption.

As guidance from standards bodies suggests, tackle high-risk use cases first – systems where a future compromise would be catastrophic, or data that needs to remain confidential for a decade or more. Those should be at the top of your PQC migration list.
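
A small lookup table along these lines can make the assessment repeatable across teams. The entries below reflect the generally accepted impact of Shor’s and Grover’s algorithms on common algorithm families; the exact wording of the labels is my own illustrative choice.

```python
# Quantum impact of common algorithm families (coarse, illustrative labels).
QUANTUM_VULNERABILITY = {
    "RSA":            "broken by Shor's algorithm",
    "ECDH / ECDSA":   "broken by Shor's algorithm",
    "Diffie-Hellman": "broken by Shor's algorithm",
    "AES-128":        "weakened by Grover's algorithm; prefer AES-256",
    "AES-256":        "considered quantum-resistant at this key size",
    "SHA-256":        "considered quantum-resistant for most uses",
    "ML-KEM":         "quantum-safe key encapsulation (NIST FIPS 203)",
    "ML-DSA":         "quantum-safe signatures (NIST FIPS 204)",
}

def is_quantum_vulnerable(algorithm_family: str) -> bool:
    """True if the family is broken by a known quantum algorithm."""
    return "Shor" in QUANTUM_VULNERABILITY.get(algorithm_family, "unknown")
```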

Prioritize & Roadmap Mitigations

With a clear picture of data and crypto risk, you can now set priorities. Many organizations find it useful to create a matrix of cryptographic dependencies vs. data criticality (a simple scoring sketch follows the list below). For example, you might highlight:

  • Tier 1 data protected by RSA/ECC – highest priority (e.g. customer PII in an encrypted database, VPN tunnels carrying sensitive traffic, code-signing systems ensuring integrity of critical software, etc.).
  • Tier 1 data protected by symmetric crypto (AES-256) – medium priority (symmetric algorithms like AES are not broken by known quantum algorithms, but key exchange or key management might rely on vulnerable public-key crypto).
  • Tier 2 data under RSA/ECC – medium priority (less critical data but vulnerable algorithms; schedule these after Tier 1).
  • Tier 3 data under RSA/ECC – lower priority (vulnerable crypto but low impact data).
  • Any systems using deprecated or weak crypto (even classical weaknesses like old SSL/TLS versions) – flag for near-term fix as well, since those are immediate risks even before quantum. In fact, pilot cryptographic discovery often reveals hidden legacy crypto that should be fixed regardless of quantum impact.
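
A minimal sketch of how such a matrix could be turned into a repeatable score is shown below, combining data tier, algorithm vulnerability, and retention. The weighting logic is an illustrative assumption and should be tuned to your own risk appetite.

```python
def migration_priority(tier: int, quantum_vulnerable: bool,
                       retention_years: int) -> str:
    """Toy prioritization: Tier 1 data behind quantum-vulnerable
    public-key crypto comes first; low-impact data comes last."""
    if quantum_vulnerable and tier == 1:
        return "highest"
    if quantum_vulnerable and (tier == 2 or retention_years >= 10):
        return "medium"
    if tier == 1:
        return "medium"  # e.g. AES-protected, but check key exchange and key management
    return "lower"

# Example: customer PII (Tier 1) behind RSA-based TLS, retained for 10 years.
print(migration_priority(tier=1, quantum_vulnerable=True, retention_years=10))
# -> "highest"
```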

The output of this exercise is a PQC migration roadmap that sequences upgrades in a logical order. This risk-based sequencing is echoed by many experts. Concretely, this could mean deploying hybrid post-quantum solutions within the next year or two for things like your VPN tunnels and PKI certificates, ensuring that if any data is intercepted today, it’s protected by at least one quantum-safe layer. Lower-risk systems can follow later – eventually everything must be upgraded, but this phased approach buys down the most serious risks early.

Implement Controls and Monitor

As you execute the roadmap, ensure you have mechanisms to track progress and coverage. Data classification should be an ongoing process – new data stores or new types of sensitive information might emerge, requiring updates to your inventory and priorities. It’s wise to integrate data sensitivity labels into your data management and security monitoring tools. For instance, if “Tier 1” data is tagged in your databases or DLP tools, you can set alerts if that data is found unencrypted or leaving the network. Also, maintain your cryptographic inventory: what algorithms and key lengths are in use where. This helps ensure no system is left behind. The NCCoE’s migration project notes that a robust cryptography inventory supports risk management by showing where to implement PQC first and tracking progress.
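
As a concrete illustration, here is a minimal sketch of such a monitoring check. The asset fields mirror the inventory sketch earlier in this article, and the rule itself is an assumption about what a useful alert might look like, not the policy language of any specific DLP or DSPM product.

```python
LEGACY_PUBLIC_KEY_ALGORITHMS = {"RSA", "ECDH", "ECDSA", "DH"}

def check_asset(asset: dict) -> list[str]:
    """Flag Tier 1 data that is unencrypted at rest or still relies on
    quantum-vulnerable public-key cryptography (illustrative rule)."""
    findings = []
    if asset.get("sensitivity_tier") == 1:
        if asset.get("at_rest_algorithm", "none") == "none":
            findings.append(f"{asset['name']}: Tier 1 data stored unencrypted")
        if asset.get("key_exchange") in LEGACY_PUBLIC_KEY_ALGORITHMS:
            findings.append(f"{asset['name']}: Tier 1 data protected by "
                            "quantum-vulnerable key exchange")
    return findings

print(check_asset({"name": "customer_db", "sensitivity_tier": 1,
                   "at_rest_algorithm": "none", "key_exchange": "RSA"}))
```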

By following these steps, an organization can transition from a nebulous fear of the quantum threat to a concrete action plan. Data discovery and classification shine a light on where quantum-vulnerable crypto really matters, so you can allocate budget, talent, and technology upgrades to those areas first.

Tools and Solutions: Bridging Data Discovery with Quantum Readiness

Given the complexity, many enterprises are seeking tooling to automate and support these processes:

  • Data Discovery & Classification Tools: A number of commercial solutions (often under the umbrella of DSPM or data governance) help automatically find and label sensitive data across your IT estate. For example, tools like BigID, Informatica, Varonis, and others use pattern matching and AI to scan data repositories for things like PII, financial info, or secrets. These can greatly accelerate building your data inventory. They effectively tackle the “you can’t secure what you can’t see” problem by providing comprehensive data visibility across structured and unstructured sources. Crucially, they also enable tagging data by sensitivity or criticality, which feeds into your classification scheme. Some platforms even integrate with security orchestration – e.g., once data is tagged as “Highly Sensitive”, encryption or access control policies can be automatically applied. This automation is valuable for maintaining strong data governance at scale.
  • Cryptographic Inventory & Management Tools: On the cryptography side, new tools are emerging to inventory and manage cryptographic assets enterprise-wide. Such tools can identify where outdated algorithms like RSA-1024 or SHA-1 are used, flag non-compliant certificates, and even facilitate swapping in new PQC algorithms. We are seeing a convergence where data security and cryptography management tools collaborate – e.g., integrations where a data discovery tool finds a sensitive dataset and a crypto management tool ensures it’s encrypted with approved (quantum-safe) algorithms.
  • Risk Assessment Frameworks: Professional services and consultancies have begun offering “Quantum Risk Assessments” or PQC readiness assessments that essentially perform the above steps (discovery, classification, prioritization) as a service. These typically include workshops to identify critical data and processes, automated scans for cryptographic usage, and expert guidance on mitigation roadmaps. For example, one PQC readiness assessment outlines: Step 1, Discovery of Cryptographic Assets; Step 2, Classification of Cryptographic Risks (by importance of the systems/data involved); Step 3, Gap Analysis; Step 4, Prioritization & Roadmap; Step 5, Crypto-agility planning. This mirrors the approach we’ve discussed, highlighting that classification of risks (which essentially means understanding which cryptographic assets protect critical vs. non-critical data) is indispensable for a sensible migration plan.

The good news is that data discovery and classification aren’t “nice-to-have” extras for quantum readiness – they’re core parts of it. By investing in these capabilities, you’re simultaneously improving your overall security posture (better data governance, compliance, breach prevention) and preparing for the specific challenges of PQC.

A side benefit of going through this process is that you often uncover current weaknesses: unencrypted sensitive files that should be encrypted even with classical algorithms, or legacy crypto implementations that pose an immediate risk. One post-quantum security analysis noted that piloting cryptographic discovery often “reveals hidden legacy crypto that should be fixed even in the present (e.g. outdated SSL versions lurking on a device)”. So, quantum readiness efforts can drive remediation of today’s problems too – a win-win for security.

Beyond Classification: Maintaining a Quantum-Ready Data Posture

Once you’ve identified your high-value data and begun protecting it with quantum-resistant cryptography, the work isn’t over. Quantum readiness is an ongoing posture, not a one-time project. Here are a few best practices to maintain a quantum-ready stance with respect to your data:

  • Regularly Re-Evaluate Data Sensitivity: Business is dynamic – new types of sensitive data may emerge (for example, if your company launches a biomedical research project, suddenly you have IP and health data where you didn’t before). Regularly update your data inventory and classification. Keep an eye on data creep – information can become more sensitive over time (e.g. accumulating large volumes of seemingly innocuous data can itself become a sensitive asset).
  • Practice Data Minimization & Retention Policies: One simple way to reduce risk is to store less sensitive data, for less time. Many organizations are guilty of keeping data forever “just in case”. This prolongs the exposure window for quantum threats. Implement strong data retention limits: if you no longer need certain records, securely delete them rather than archiving indefinitely. As one set of best practices advises, “regularly audit stored data and securely delete information that is no longer needed. Reducing the amount of long-retained sensitive data directly reduces your attack surface.” In essence, the less data you have to protect for the long haul, the easier quantum mitigation becomes.
  • Periodic Re-Encryption of Archives: For data that does need to be kept for years, establish a lifecycle of cryptographic upkeep. This means periodically re-encrypting long-term data stores with updated algorithms and keys. Don’t encrypt something once and forget about it for 20 years. For example, if you encrypted a trove of archives with RSA-2048 in 2015, you should re-encrypt them with stronger algorithms (ideally PQC/hybrid) well before a quantum computer arrives. Ongoing key rotation and re-encryption ensure that even if old ciphertext was stolen, it becomes useless once you’ve re-encrypted it with quantum-safe methods.
  • Monitor Data Movement and Exposure: The risk to data is not just about how it’s stored, but also how it’s accessed and transmitted. Data that is highly exposed (e.g. accessible via the internet or frequently in transit across networks) is at higher risk of interception. “APIs, public-facing apps, cloud services – any external systems – should be top priority for quantum-safe encryption because they’re first points of entry for attackers”. Even internal traffic and stored data should eventually be covered, but start with the most exposed channels. Continuously monitor where your sensitive data is flowing. If, for instance, Tier 1 data suddenly shows up in a new cloud app, ensure that app uses approved quantum-safe encryption for data in transit and at rest.
  • Build Crypto-Agility: Finally, recognize that post-quantum cryptography itself will evolve. We may see new PQC algorithms emerge, or unforeseen weaknesses in ones that are standardized. The journey doesn’t end in 2030; it’s an ongoing evolution. Therefore, design your systems with crypto-agility – the ability to swap out cryptographic components with minimal disruption (see the sketch after this list). If you’ve done data classification properly, you know where the most critical encryption lies; ensure those systems (and indeed all systems) can flexibly accommodate algorithm changes. This means avoiding hard-coded algorithms, using centrally managed crypto libraries, and embracing hybrid cryptographic architectures during the transition. Hybrid solutions (combining classical and PQC algorithms) can provide backward compatibility and resilience as we make this shift. Many organizations choose hybrid encryption for an interim period – for example, running classical and post-quantum key exchange in tandem within TLS – to safeguard communications without betting entirely on new tech from day one.
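
To make the crypto-agility point concrete (as referenced in the bullet above), below is a minimal sketch of an indirection layer that keeps algorithm choices in configuration rather than scattered through application code, so hybrid or post-quantum schemes can be swapped in later. The registry keys and algorithm names are illustrative assumptions, not references to a specific library’s API.

```python
# Minimal crypto-agility pattern: application code asks for a purpose,
# and configuration decides which algorithm fulfils it.
ALGORITHM_REGISTRY = {
    "key_exchange":    "X25519",        # classical default (illustrative)
    "signature":       "ECDSA-P256",
    "bulk_encryption": "AES-256-GCM",
}

def configure_pqc_transition(registry: dict) -> dict:
    """Swap in hybrid / post-quantum choices without touching callers."""
    updated = dict(registry)
    updated["key_exchange"] = "X25519 + ML-KEM-768 (hybrid)"
    updated["signature"] = "ML-DSA-65"
    return updated

print(configure_pqc_transition(ALGORITHM_REGISTRY)["key_exchange"])
# -> "X25519 + ML-KEM-768 (hybrid)"
```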

Conclusion

Achieving quantum readiness is often described as the largest and most complex overhaul in cybersecurity history. It’s a bit like upgrading the engine of an airplane while in flight – you have to replace cryptographic foundations across countless systems without breaking everything. In this daunting effort, data discovery and classification provide a crucial compass. They ensure you’re steering your quantum migration toward what truly needs protection, rather than flying blind. By focusing on the sensitivity, value, and lifespan of your data, you make informed decisions about what to protect first.

In summary, to prioritize which cryptography to mitigate in the face of quantum threats:

  • Discover your data. You can’t encrypt or migrate what you haven’t found. Invest in discovering all sensitive data across IT and OT environments (just as you invest in discovering all assets).
  • Classify its importance. Not all data is equal. Determine which information would cause the greatest damage if decrypted by an adversary in the future – whether due to sensitivity or required longevity of secrecy.
  • Map to cryptography. Link your critical data to the cryptographic schemes protecting it today. These are your high-priority cryptography upgrade targets.
  • Mitigate in phases. Start transitioning those high-priority areas to quantum-safe (or hybrid) encryption as soon as possible. Lower-risk areas follow, but remember that “eventually, everything must transition” – crypto-agility will be your safety net to continuously adapt.
  • Maintain vigilance. Keep the cycle going – data inventories, classification, and crypto upgrades are not one-and-done. Make it part of regular cybersecurity governance to review data sensitivity and encryption posture, especially as new PQC standards and tools emerge.

Preparing for the quantum era is a race against time, but it’s a race that can be won with the right strategy. By illuminating where your most sensitive data lies and how it’s protected, data discovery and classification let you allocate your efforts where they matter most. In the end, quantum readiness isn’t just about deploying new algorithms – it’s about protecting your organization’s critical information against both current and future threats. And you can only do that if you truly know your data.

Quantum Upside & Quantum Risk - Handled

My company - Applied Quantum - helps governments, enterprises, and investors prepare for both the upside and the risk of quantum technologies. We deliver concise board and investor briefings; demystify quantum computing, sensing, and communications; craft national and corporate strategies to capture advantage; and turn plans into delivery. We help you mitigate the quantum risk by executing crypto‑inventory, crypto‑agility implementation, PQC migration, and broader defenses against the quantum threat. We run vendor due diligence, proof‑of‑value pilots, standards and policy alignment, workforce training, and procurement support, then oversee implementation across your organization. Contact me if you want help.


Marin

I am the Founder of Applied Quantum (AppliedQuantum.com), a research-driven consulting firm empowering organizations to seize quantum opportunities and proactively defend against quantum threats. A former quantum entrepreneur, I’ve previously served as a Fortune Global 500 CISO, CTO, Big 4 partner, and leader at Accenture and IBM. Throughout my career, I’ve specialized in managing emerging tech risks, building and leading innovation labs focused on quantum security, AI security, and cyber-kinetic risks for global corporations, governments, and defense agencies. I regularly share insights on quantum technologies and emerging-tech cybersecurity at PostQuantum.com.