
Strengthening governance and trust in AI-based data dissemination with Proof-Carrying Numbers


As artificial intelligence (AI) becomes a new gateway to development data, a quiet but significant risk has emerged. Large language models (LLMs) can now summarize reports, answer data queries, and interpret indicators in seconds. But while these tools promise convenience, they also raise a fundamental question: How can we ensure that the numbers they produce remain true to the official data they claim to represent?

 

AI access does not equal data integrity

Many AI systems today use retrieval-augmented generation (RAG), a technique that feeds models with content from trusted sources or databases. While it is widely viewed as a safeguard against hallucinations, it does not eliminate them. Even when an AI model retrieves the correct data, it may still generate outputs that deviate from it. It might round numbers to sound natural, merge disaggregated values, or restate statistics in ways that subtly alter their meaning. These deviations often go unnoticed because the AI still appears confident and precise to the end user.

Developers often measure such errors through evaluation experiments (or “evals”), reporting aggregate accuracy rates. But those averages mean little to a policymaker, journalist, or citizen interacting with an AI tool. What matters is not whether the model is usually correct, but whether the specific number it just produced is faithful to the official data. 

 

Where Proof-Carrying Numbers come in

Proof-Carrying Numbers (PCN), a novel trust protocol developed by the AI for Data - Data for AI team, addresses this gap. It introduces a mechanism for verifying numerical faithfulness — that is, how closely the AI’s numbers match the trusted data they are based on — in real time.

Here’s how it works:

  • The data passed to the LLM must include a claim identifier and a policy that defines acceptable behavior (e.g., exact match required or rounding allowed). 

  • The model is instructed to follow the PCN protocol when generating numbers based on that data.

  • Each numeric output is checked against the reference data on which it was conditioned.

  • If the result satisfies the policy, PCN marks it as verified [✓].

  • If it deviates, PCN flags it for review [⚠️]. 

  • Any numbers produced without explicit marks are assumed unverified and should be treated cautiously.

This is a fail-closed mechanism, a built-in safeguard that errs on the side of caution. When faithfulness cannot be proven, PCN does not assume correctness; instead, it makes that failure visible. This feature changes how users interact with AI: instead of trusting responses blindly, they can immediately see whether a number aligns with official data.
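
To make the flow concrete, here is a minimal sketch in Python of how such a check could be wired up. The claim identifiers, values, and policy names are invented for illustration; they do not reflect the actual PCN implementation or any real dataset.

```python
# Hypothetical reference data passed to the LLM: each claim carries an
# identifier, the official value, and a policy defining acceptable behavior.
REFERENCE = {
    "claim:population-2022": {"value": 10_500_000, "policy": "exact"},
    "claim:gdp-growth-2022": {"value": 4.8, "policy": "tolerance", "tol": 0.05},
}

def pcn_mark(claim_id: str, stated: float) -> str:
    """Return a PCN-style mark for a number the model attributes to a claim."""
    claim = REFERENCE.get(claim_id)
    if claim is None:
        return "⚠️"  # unknown claim: fail closed
    if claim["policy"] == "exact":
        faithful = stated == claim["value"]
    elif claim["policy"] == "tolerance":
        faithful = abs(stated - claim["value"]) <= claim["tol"]
    else:
        faithful = False  # unrecognized policy: fail closed
    return "✓" if faithful else "⚠️"

# The model is instructed to tag each number with its claim id; a post-processor
# extracts the (number, claim_id) pairs and appends the mark. Numbers without a
# tag are left unmarked and treated as unverified.
print(pcn_mark("claim:population-2022", 10_500_000))  # ✓ exact match
print(pcn_mark("claim:gdp-growth-2022", 4.85))        # ✓ within tolerance
print(pcn_mark("claim:gdp-growth-2022", 5.2))         # ⚠️ flagged for review
```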

 

Figure 1. Implementation of the Proof-Carrying Numbers (PCN) protocol in an MCP-based AI chatbot. Numbers generated by the LLM from the provided data are displayed with a check mark when verified.


 

A simplified demonstration of the PCN protocol is available in this Google Colab notebook, showing how numerical verification is applied step-by-step in an AI-assisted data query. This example illustrates how the model outputs are checked against their source data, and how verification marks are generated when numbers are proven faithful.

 

How PCN differs from citation-based solutions

Many AI systems try to build trust through citations or links to sources. While citations increase transparency, they cannot guarantee that numeric values are accurate. Models can still misstate or reinterpret numbers, even while citing legitimate sources.

A citation simply tells users where the information supposedly came from; it does not prove that the cited source actually contains the same number or that the model has used it correctly. In practice, models can (and often do) cite legitimate sources while still presenting incorrect or reinterpreted numbers, creating a false sense of credibility. Users may assume a response is accurate because it “looks” referenced, even when the number itself is fabricated or distorted.

The PCN protocol goes beyond citation by performing a numerical verification. Every value generated by the model is checked against the underlying data, and a verification mark or flag is produced accordingly. In other words, citations show provenance, while PCN enforces faithfulness. 
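
A toy example makes the contrast concrete. The response fields, URL, and values below are made up for illustration; the point is only that a citation can be present while the stated number still fails a value-level check.

```python
# Hypothetical model response: the citation looks credible, but the number is off.
response = {
    "text": "GDP growth reached 5.2% in 2022.",
    "stated_value": 5.2,
    "citation": "https://example.org/official-indicator",  # provenance only
}

official_value = 4.8  # hypothetical official figure behind the citation

# A citation check stops at provenance; a PCN-style check compares the values.
has_citation = response["citation"] is not None  # True, but proves nothing about 5.2
value_check = "✓" if response["stated_value"] == official_value else "⚠️"
print(has_citation, value_check)  # True ⚠️  (cited, yet unfaithful)
```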

 

Why this matters for official statistics and development data

Official statistics underpin development policy. These numbers are curated with rigorous standards, metadata, and institutional accountability.

However, when AI becomes the interface to those numbers, a governance gap emerges. AI systems operate probabilistically, not according to the statistical principles that govern official data. They may sound authoritative, but they can quietly break the chain of trust that defines official data.

PCN helps close that gap by embedding governance directly into AI-mediated dissemination: 

  • Protects integrity: Ensures no number appears official unless it matches the authoritative dataset.

  • Improves transparency: Clearly signals to users when deviations occur.

  • Strengthens accountability: Makes verification explicit rather than implicit.

Together, these features introduce a policy enforcement layer for numeric fidelity, keeping AI-generated values anchored to official data sources. 

 

 

From trust by design to governance by proof

In traditional data systems, metadata helps users interpret statistics. It describes what a number means, how it was produced, and under what conditions it should be used. However, metadata is descriptive — it guides interpretation but cannot enforce correctness.

The PCN protocol introduces something new: verification instead of description. It doesn’t replace metadata or statistical standards; it complements them by automatically checking whether an AI system’s numerical outputs remain faithful to the reference data.

This marks a shift from trust by design to governance by proof. Rather than assuming systems built with good intentions will behave correctly, PCN verifies that they do — every time a number is generated.

It also exposes the limits of prompt engineering. Even when developers instruct a model to “only use the data provided,” compliance is never guaranteed. Prompts are guidance, not enforcement. Models can still reinterpret or omit values in ways that are imperceptible to users — including when told to follow the PCN format itself.

PCN closes this gap by adding a verification layer that operates independently of the model’s behavior. When the model outputs a number, PCN checks it against both the underlying data and the policy defining what “faithful” means — whether an exact match or a small tolerance.
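
As a rough sketch of that policy layer, the snippet below treats “faithful” as either an exact match or agreement within a small tolerance, and never assumes correctness when the mark is missing. The Policy class, field names, and example values are assumptions made for this post, not the PCN reference implementation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Policy:
    kind: str              # "exact" or "tolerance"
    tolerance: float = 0.0

def is_faithful(stated: float, official: float, policy: Policy) -> bool:
    """Decide whether a stated number satisfies the policy for its claim."""
    if policy.kind == "exact":
        return stated == official
    if policy.kind == "tolerance":
        return abs(stated - official) <= policy.tolerance
    return False  # unknown policy: fail closed

def pcn_verdict(stated: Optional[float], official: float, policy: Policy) -> str:
    """Fail-closed verdict: a missing or untagged number is never assumed correct."""
    if stated is None:  # the model dropped the PCN tag entirely
        return "unverified (protocol not followed)"
    return "verified ✓" if is_faithful(stated, official, policy) else "flagged ⚠️"

print(pcn_verdict(3.14, 3.14159, Policy("tolerance", 0.01)))  # verified ✓
print(pcn_verdict(3.2, 3.14159, Policy("exact")))             # flagged ⚠️
print(pcn_verdict(None, 3.14159, Policy("exact")))            # unverified (protocol not followed)
```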

If the output passes, it is verified; if it diverges, it is flagged. When the model fails to follow the protocol, the system treats it as an error by construction. Because PCN uses explicit verification logic, such cases are not uncertain — they are detectable and governed failures, ensuring that the system never depends on the model behaving correctly.

By extending governance into the AI layer, PCN helps preserve statistical integrity even as dissemination becomes automated. Where metadata supports human interpretation, PCN ensures machine verification.

 

A foundation for responsible AI in data systems

As development institutions and national statistical offices explore AI for data dissemination, PCN offers a concrete way to strengthen governance. It makes data systems not only AI-enabled but also AI-accountable.

In practical terms, future data portals, chat interfaces, and digital assistants can display numbers that are provably aligned with official data or clearly marked when they are not. Users remain informed. Institutions remain authoritative.

The result is a model of responsible AI use that preserves the credibility of official statistics while expanding their reach through intelligent systems.

 

Looking ahead

AI-assisted access to development data is both inevitable and desirable. It can democratize information, reduce technical barriers, and make complex data more usable. But innovation without safeguards risks undermining the very trust that makes official statistics valuable.

PCN provides that safeguard. By coupling AI generation with real-time verification, it ensures that every number an AI produces carries not just context but proof — proof that it faithfully represents the data it cites, and proof that governance still applies, even in the age of AI.

In short, PCN is not just a technical innovation but a governance innovation that helps preserve the integrity of official statistics as AI becomes a primary channel for data access.


Aivin Solatorio

Program Manager, Development Data Group, World Bank
