commonsformat-verifier
Deployment-tier verification for Commons Format implementations. Takes a candidate implementation, the merged spec it was generated against, the eval results from running the spec's eval suite, and a declared deployment tier — and confirms whether the implementation meets the tier's requirements per format-spec §12.
Premise
The verifier is the gate between "we have a candidate implementation" and "we are ready to deploy this implementation." It composes the outputs of the other tools (eval runner results, lockfile state, generator records) and decides whether the implementation meets the requirements of a declared deployment tier.
The verifier does not parse, run evals, fetch, or generate. Those are the responsibilities of upstream tools. The verifier reads their outputs and decides yes/no per tier.
The deployment tier definitions are in format-spec §12. The verification axes are in format-spec §13. The verifier implements the policy that combines them: which combinations of axis verification satisfy each tier's requirements.
Interface
A verifier is a function or component that takes:
- The merged spec the implementation was generated against (typically referenced by hash from the lockfile)
- A lockfile recording the resolved dependency tree, generator diversity achieved, runtime diversity achieved, and applied eval packs
- Eval run results from the eval runner, indicating which cases passed, failed, or were skipped, per generator and per runtime
- The declared deployment tier (D0, D1, D2, or D3)
And returns:
- A verification result indicating whether the deployment tier's requirements are met, and if not, which specific requirements are unmet
The verifier is the policy authority. Other tools produce facts (parsed content, eval results, fetched checksums); the verifier decides whether those facts meet the tier's requirements.
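The inputs and output listed above can be pictured as plain data types. The following is a minimal sketch only: every type and field name (`Lockfile`, `CaseResult`, `VerificationResult`, `Verifier.verify`) is a hypothetical choice made for illustration, not something the format spec prescribes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Protocol, Sequence


class Tier(Enum):
    D0 = "D0"
    D1 = "D1"
    D2 = "D2"
    D3 = "D3"


@dataclass(frozen=True)
class Lockfile:
    spec_hash: str                 # hash of the merged spec
    resolved_dependencies: tuple   # resolved dependency tree (flattened here)
    generator_families: frozenset  # generator diversity achieved
    runtime_families: frozenset    # runtime diversity achieved
    eval_packs: tuple              # applied eval packs


@dataclass(frozen=True)
class CaseResult:
    case_id: str
    status: str      # "passed", "failed", or "skipped"
    generator: str   # which generator produced the implementation under test
    runtime: str     # which runtime the case ran under


@dataclass(frozen=True)
class VerificationResult:
    tier: Tier
    met: bool
    unmet: tuple = ()  # specific unmet requirements; empty when the tier is met


class Verifier(Protocol):
    # Hypothetical signature: facts in, policy decision out.
    def verify(
        self, lockfile: Lockfile, results: Sequence[CaseResult], tier: Tier
    ) -> VerificationResult:
        ...
```

All records are frozen so that evidence cannot be mutated between being read and being judged.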
Schema
This module ships a schema.sql declaring the shape of its data. Per §8, this is a shape commitment, not a storage commitment: the generated implementation chooses a runtime representation appropriate to its intent.
-- This DDL describes data shape, not storage. Runtime representation
-- is implementation-defined; choose what is appropriate to the target
-- language and the module's intent. The verifier is a pure policy
-- evaluator that reads evidence (eval results, lockfile state) and
-- produces a verification decision. The tables below are the shape
-- of that decision and its supporting evidence, not a database.
CREATE TABLE verification_decisions (
    spec_hash TEXT NOT NULL,
    declared_tier TEXT NOT NULL CHECK (declared_tier IN ('D0', 'D1', 'D2', 'D3')),
    outcome TEXT NOT NULL CHECK (outcome IN ('tier-met', 'tier-not-met', 'evidence-incomplete')),
    decided_at_ms INTEGER NOT NULL,
    format_spec_version TEXT NOT NULL,
    PRIMARY KEY (spec_hash, declared_tier)
);

CREATE TABLE unmet_requirements (
    spec_hash TEXT NOT NULL,
    declared_tier TEXT NOT NULL,
    ordinal INTEGER NOT NULL,
    requirement_kind TEXT NOT NULL CHECK (requirement_kind IN (
        'eval-depth', 'generator-diversity', 'runtime-diversity',
        'constraint-coverage', 'prose-review', 'formal-artifact'
    )),
    requirement_detail TEXT NOT NULL,
    PRIMARY KEY (spec_hash, declared_tier, ordinal),
    FOREIGN KEY (spec_hash, declared_tier) REFERENCES verification_decisions (spec_hash, declared_tier)
);

CREATE TABLE axis_evidence (
    spec_hash TEXT NOT NULL,
    axis TEXT NOT NULL CHECK (axis IN ('eval-depth', 'generator-diversity', 'runtime-diversity')),
    achieved_value TEXT NOT NULL,
    PRIMARY KEY (spec_hash, axis)
);
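Because the DDL commits to shape rather than storage, an implementation may hold a decision as an in-memory record. The sketch below mirrors verification_decisions and unmet_requirements as Python dataclasses and re-checks the CHECK constraints at construction time. The class names and the choice to nest unmet requirements inside the decision are assumptions made for illustration.

```python
from dataclasses import dataclass

# Allowed values copied from the CHECK constraints in the DDL.
TIERS = ("D0", "D1", "D2", "D3")
OUTCOMES = ("tier-met", "tier-not-met", "evidence-incomplete")
REQUIREMENT_KINDS = (
    "eval-depth", "generator-diversity", "runtime-diversity",
    "constraint-coverage", "prose-review", "formal-artifact",
)


@dataclass(frozen=True)
class UnmetRequirement:
    ordinal: int
    requirement_kind: str
    requirement_detail: str

    def __post_init__(self):
        if self.requirement_kind not in REQUIREMENT_KINDS:
            raise ValueError(f"unknown requirement_kind: {self.requirement_kind!r}")


@dataclass(frozen=True)
class VerificationDecision:
    spec_hash: str
    declared_tier: str
    outcome: str
    decided_at_ms: int
    format_spec_version: str
    unmet: tuple = ()  # hypothetical nesting; the DDL keeps these in a separate table

    def __post_init__(self):
        if self.declared_tier not in TIERS:
            raise ValueError(f"unknown declared_tier: {self.declared_tier!r}")
        if self.outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome: {self.outcome!r}")
        # Mirrors PRIMARY KEY (spec_hash, declared_tier, ordinal):
        # ordinals must be unique within one decision.
        ordinals = [u.ordinal for u in self.unmet]
        if len(ordinals) != len(set(ordinals)):
            raise ValueError("duplicate ordinal in unmet requirements")
```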
Tier requirements
The verifier enforces the requirements specified in format-spec §12 for each deployment tier. The summary below is for reference; the format spec is canonical.
D0 (Personal/Experimental): Functional eval cases pass.
D1 (Internal Production): Functional and adversarial eval cases pass. All declared constraints are verified by at least one case.
D2 (Network-Exposed Production): D1 requirements plus: generator-adversary cases pass; multi-generator conformance with at least two independent-family generators; all declared constraints have verifying eval cases.
D3 (Critical Infrastructure): D2 requirements plus: multi-runtime conformance with implementations passing under at least two independent-family runtimes; runtime-subset constraints enforced where declared; formal-method artifacts present where applicable.
The verifier reports "tier met" only when every applicable requirement is satisfied. Partial satisfaction reports as "tier not met" with the specific gaps enumerated.
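The cumulative structure of the tiers (each tier adds to the one below it) can be sketched as a lookup table. The requirement labels here paraphrase the summary above and are hypothetical; format-spec §12 remains canonical, and a real verifier would derive its table from there.

```python
# Requirements each tier adds on top of the previous one.
# Labels are illustrative paraphrases, not identifiers from the format spec.
TIER_ORDER = ["D0", "D1", "D2", "D3"]
TIER_ADDS = {
    "D0": ["functional eval cases pass"],
    "D1": [
        "adversarial eval cases pass",
        "every declared constraint is verified by at least one case",
    ],
    "D2": [
        "generator-adversary cases pass",
        "at least two independent-family generators conform",
    ],
    "D3": [
        "at least two independent-family runtimes conform",
        "runtime-subset constraints enforced where declared",
        "formal-method artifacts present where applicable",
    ],
}


def requirements_for(tier):
    """Return every requirement that applies at `tier`, lowest tier first."""
    index = TIER_ORDER.index(tier)
    required = []
    for lower in TIER_ORDER[: index + 1]:
        required.extend(TIER_ADDS[lower])
    return required


def decide(tier, satisfied):
    """Report tier-met only when every applicable requirement is satisfied."""
    unmet = [r for r in requirements_for(tier) if r not in satisfied]
    outcome = "tier-met" if not unmet else "tier-not-met"
    return outcome, unmet
```

For example, `decide("D1", {"functional eval cases pass"})` reports "tier-not-met" and lists the two D1 additions as the gaps, which is the partial-satisfaction behavior described above.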
What the verifier must do
- read-tier-requirements-from-format: requirements come from format-spec §12; the verifier does not invent additional requirements or relax declared ones
- check-each-axis: each verification axis (eval depth, generator diversity, runtime diversity) is evaluated against the tier's threshold for that axis
- check-constraint-coverage: at D2 and higher, every constraint declared in the merged spec must have at least one eval case with verifies listing it
- report-specific-gaps: when verification fails, the result names exactly which requirements are unmet, not just "fails"
- composable-with-existing-evidence: the verifier reads evidence produced by upstream tools; it does not re-run those tools
- deterministic-policy: the same evidence produces the same verification result across runs and across implementations
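The constraint-coverage, specific-gaps, and determinism obligations fit together naturally. Here is a minimal sketch, assuming eval cases arrive as dictionaries whose `verifies` key lists constraint identifiers; the dictionary layout, the function names, and the gap message wording are assumptions made for illustration.

```python
def uncovered_constraints(declared, cases):
    """Return declared constraints that no eval case lists under `verifies`.

    The result is sorted so that the same evidence always yields the same
    report, regardless of the order in which cases or constraints arrive.
    """
    covered = set()
    for case in cases:
        covered.update(case.get("verifies", ()))
    return sorted(set(declared) - covered)


def coverage_gaps(declared, cases):
    """Name each gap explicitly instead of reporting a bare failure."""
    return [
        f"constraint-coverage: no eval case verifies {constraint!r}"
        for constraint in uncovered_constraints(declared, cases)
    ]
```

This checks only that a verifying case exists. Whether that case passed is a separate requirement evaluated on the eval-depth axis.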
What the verifier must NOT do
- Run eval suites. The eval runner does that; the verifier reads its results.
- Fetch dependencies. The fetcher does that.
- Generate implementations. The harness does that.
- Modify lockfiles. The lockfile tool generates them; the verifier reads them.
- Add tier requirements not declared in format-spec §12. The verifier is policy enforcement, not policy authorship.
- Pass verification when requirements are unmet, even partially. Production deployments depend on the verifier being strict.
Threat model
The verifier is the final policy gate. A buggy or compromised verifier that reports "tier met" when requirements are not met is the worst kind of failure — it produces unjustified confidence that masks real risk.
Specific concerns:
- A verifier that accepts incomplete or fabricated eval results. Mitigation: results are signed or content-addressed back to the runner that produced them; the verifier checks provenance.
- A verifier that interprets tier requirements loosely. Mitigation: the verifier's own eval suite tests strictness — given evidence that should fail a tier, the verifier must report failure.
- A verifier that misses a tier requirement because the format spec was updated and the verifier wasn't. Mitigation: the verifier records which format version it was generated against; format updates require regenerating verifiers.
- An attacker submitting a different lockfile than the consumer intended. Mitigation: the lockfile is content-addressed in the consumer's project; substitution requires bypassing the consumer's version control.
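The content-addressing mitigation can be illustrated with a digest over a canonical serialization of the results. This sketch assumes the runner records a SHA-256 hex digest of canonical JSON alongside its output. That scheme, and the function names, are assumptions; the source does not specify a hash algorithm or a serialization.

```python
import hashlib
import json


def content_address(results):
    """Digest of the results under a canonical JSON encoding (assumed scheme)."""
    # sort_keys and fixed separators make the encoding independent of
    # dictionary key order and whitespace.
    canonical = json.dumps(results, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def provenance_ok(results, recorded_digest):
    """True only if the results match the digest the runner recorded."""
    return content_address(results) == recorded_digest
```

A digest alone detects substitution or tampering only if the recorded digest is itself obtained through a trusted channel. A signature scheme would be needed to bind results to a specific runner.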
The verifier is the most security-critical tool in the chain. It is the right place for multi-generator conformance verification: multiple independently-generated verifiers all reaching the same verification result is meaningfully more trustworthy than a single verifier's word.
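One way to combine several independently generated verifiers is to accept a result only when they all agree. In this sketch, treating disagreement as "tier-not-met" and requiring at least two verifiers are assumptions chosen to fail closed; the source does not specify how disagreement should be resolved.

```python
def combine(outcomes):
    """Combine outcomes from independently generated verifiers.

    Assumed policy: unanimous agreement is required; anything else fails closed.
    """
    if len(outcomes) < 2:
        raise ValueError("need outcomes from at least two verifiers")
    distinct = set(outcomes)
    if len(distinct) == 1:
        return outcomes[0]
    return "tier-not-met"
```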
Verification
This module's eval suite tests verifier policy enforcement: given known evidence states (passing eval results, failing eval results, specific axis configurations), does the verifier correctly report tier met or not met?
The cases cover all tiers and all common gap conditions (missing adversarial pass, insufficient generator diversity, unverified constraints, etc.). A consumer generating a verifier implementation iterates against this eval suite until the implementation conforms.
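A strictness suite of this kind can be pictured as a table of evidence states paired with expected outcomes. The sketch below is hypothetical throughout: the case names, the evidence dictionary layout, and the `verify(tier, evidence)` call shape are invented for illustration and do not reflect the module's actual eval case format.

```python
# (case name, declared tier, evidence, expected outcome)
CASES = [
    ("d0-functional-pass", "D0", {"functional": True}, "tier-met"),
    ("d1-missing-adversarial", "D1",
     {"functional": True, "adversarial": False}, "tier-not-met"),
    ("d2-single-generator-family", "D2",
     {"functional": True, "adversarial": True, "generator_families": 1},
     "tier-not-met"),
]


def run_cases(verify, cases):
    """Return (name, expected, actual) for every case the verifier gets wrong."""
    failures = []
    for name, tier, evidence, expected in cases:
        actual = verify(tier, evidence)
        if actual != expected:
            failures.append((name, expected, actual))
    return failures


def lax_verifier(tier, evidence):
    """A deliberately broken verifier that approves everything."""
    return "tier-met"
```

Running `run_cases(lax_verifier, CASES)` flags both cases whose evidence should fail the tier, which is the strictness property the suite exists to enforce.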