commonsformat-bootstrap
The end-to-end Commons Format bootstrap process. Describes how a consumer goes from no toolchain to a working D1 Commons Format installation using only an LLM, a Git clone of the canonical modules, and this module's prose as a guide. Embodied as a prompt that a typical generator CLI tool can execute.
Premise
This module describes how a consumer goes from no toolchain to a working D1 Commons Format installation. It is the one document a person encountering Commons Format for the first time needs to read.
The bootstrap depends on three things being available to the consumer:
- A clone of the canonical Commons Format modules (this one plus the eight others it depends on)
- A capable code generator (Claude, GPT, or equivalent) accessible via a CLI or API
- The consumer's local environment with their preferred target language toolchain installed
Given these, the bootstrap process produces a working Commons Format installation that can resolve, fetch, generate, and verify any Commons Format module the consumer wants to use.
The bootstrap process is itself a Commons Format module, embodying its own thesis: even the act of installing Commons Format is described as intent + evals, not as shipped code. The "implementation" of this module is the consumer running through the procedure with an LLM.
Interface
The bootstrap process takes:
- A code generator the consumer can invoke (assumed capable of reading prose and producing source code in the target language)
- A target language identifier (the language the toolchain will be generated in)
- A clone of the canonical Commons Format modules
- The consumer's environment (filesystem, package manager, whatever the target language requires)
And produces:
- A working Commons Format toolchain in the target language, consisting of generated implementations of: parser, eval-runner, resolver, fetcher, lockfile tool, orchestrator, and verifier
- A lockfile recording which generators produced which implementations and which commits of the spec modules were used
The procedure is sequential and deterministic in structure (the order of steps is fixed), though stochastic in detail (each generation step depends on the generator's output, which varies).
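The inputs and outputs listed above can be pictured as two small records. This is only an illustrative sketch: the class names, field names, and component labels are assumptions made for this example and are not defined by the format.

```python
from dataclasses import dataclass, field
from pathlib import Path

# Labels for the seven generated tools listed above (labels are hypothetical).
TOOLCHAIN_COMPONENTS = (
    "parser", "eval-runner", "resolver", "fetcher",
    "lockfile", "orchestrator", "verifier",
)


@dataclass(frozen=True)
class BootstrapInputs:
    generator: str            # identifier of the generator the consumer invokes
    target_language: str      # language the toolchain will be generated in
    modules_clone: Path       # local clone of the canonical modules
    environment: dict = field(default_factory=dict)  # anything else the language needs


@dataclass
class BootstrapResult:
    implementations: dict     # component label -> path of its generated implementation
    lockfile_path: Path       # lockfile recording generators and spec commits

    def missing_components(self) -> list:
        """Return the component labels that have no implementation yet."""
        return [c for c in TOOLCHAIN_COMPONENTS if c not in self.implementations]
```

A result whose `missing_components()` list is empty covers all seven tools; whether those tools are conformant is a separate question answered by the eval suites.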
The bootstrap procedure
The bootstrap proceeds in stages, each producing artifacts the next stage depends on. The stages are:
Stage 0 — Read the format. The consumer reads the format-spec module's commonsformat.md directly. No tooling exists yet; eyes are the parser. The consumer also reads this module to understand the journey ahead.
Stage 1 — Generate the bootstrap parser. The consumer prompts the generator with the format-spec prose plus the bootstrap parser spec module's prose, asking for a parser implementation in the target language. The result is iteratively verified against the bootstrap parser's eval suite — the consumer runs the cases by hand or with a small custom test harness in the target language. Iteration continues until conformance.
The bootstrap parser is D0. It does not need to be elegant or fast. It needs to read the bootstrap-relevant spec modules correctly.
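The "small custom test harness" mentioned in Stage 1 might look like the sketch below, shown in Python for illustration. The case layout (`name`, `input`, `expected`) and the stand-in `toy_parse` function are assumptions for this example; the real eval cases and parser come from the bootstrap parser module and the generator.

```python
from typing import Any, Callable, Iterable


def run_cases(parse: Callable[[str], Any], cases: Iterable[dict]) -> list:
    """Run every case and return a description of each failure."""
    failures = []
    for case in cases:
        try:
            actual = parse(case["input"])
        except Exception as exc:
            # A crash in the generated parser is a failed case, not a harness error.
            failures.append(f"{case['name']}: raised {type(exc).__name__}")
            continue
        if actual != case["expected"]:
            failures.append(
                f"{case['name']}: expected {case['expected']!r}, got {actual!r}"
            )
    return failures


# Stand-in parser and cases, purely illustrative (not Commons Format syntax).
def toy_parse(text: str) -> dict:
    key, value = text.split("=", 1)
    return {key.strip(): value.strip()}


CASES = [
    {"name": "simple pair", "input": "a = 1", "expected": {"a": "1"}},
    {"name": "missing separator", "input": "a 1", "expected": {}},
]

failures = run_cases(toy_parse, CASES)
```

An empty `failures` list would mean conformance on these cases; a non-empty one is the feedback the consumer hands back to the generator for the next iteration.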
Stage 2 — Generate the production parser. Using the bootstrap parser to read the production parser spec module, the consumer prompts the generator to produce a production parser. The production parser is iteratively verified against the merged eval suite of (format-spec + parser).
The bootstrap parser is now redundant for new development but remains as the trusted seed.
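One way to picture the "merged eval suite" is a walk over declared dependencies that collects each module's cases once, dependencies first. The in-memory module layout and the eval identifiers below are hypothetical; the actual merge rules are defined by the format specification, not by this sketch.

```python
def merged_evals(name: str, modules: dict, _seen=None) -> list:
    """Collect eval ids for a module and everything it depends on, dependencies first."""
    seen = set() if _seen is None else _seen
    if name in seen:
        return []  # already merged via another dependency path
    seen.add(name)
    merged = []
    for dependency in modules[name].get("depends_on", []):
        merged.extend(merged_evals(dependency, modules, seen))
    merged.extend(modules[name].get("evals", []))
    return merged


# Hypothetical module metadata with placeholder eval ids.
MODULES = {
    "commonsformat-format": {"evals": ["format-001", "format-002"]},
    "commonsformat-parser": {
        "depends_on": ["commonsformat-format"],
        "evals": ["parser-001"],
    },
}

suite = merged_evals("commonsformat-parser", MODULES)
```

Under these assumptions the production parser is checked against the format-spec cases as well as its own.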
Stage 3 — Generate the eval runner. The production parser reads the eval runner spec module. The consumer prompts the generator to produce an eval runner. The eval runner is verified manually at first (running known cases against known implementations), then automatically once the runner can run its own eval suite.
After this stage, all verification is mechanical. The consumer no longer needs to inspect outputs by hand.
Stage 4 — Generate the resolver, fetcher, and lockfile tool. Each is generated and verified using the now-mechanical pipeline. Order matters: fetcher and resolver before lockfile, since lockfile reads their outputs.
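The ordering constraint in Stage 4 is a small dependency graph, and a topological sort makes it explicit. The sketch below uses Python's standard `graphlib`; the graph contents are an assumption that encodes only the one constraint stated above.

```python
from graphlib import TopologicalSorter

# Each key maps to the set of tools that must be generated before it.
DEPENDS_ON = {
    "lockfile": {"resolver", "fetcher"},  # the lockfile tool reads their outputs
    "resolver": set(),
    "fetcher": set(),
}

# static_order() yields every tool after all of its prerequisites.
order = list(TopologicalSorter(DEPENDS_ON).static_order())
```

The relative order of `resolver` and `fetcher` is not fixed by this graph; only their position before `lockfile` is guaranteed.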
Stage 5 — Generate the orchestrator. This step is recursive in a delightful way: the orchestrator is itself generated by a generator. The generated orchestrator can then be used for subsequent generations, including regenerating itself with refined prompts under its own iteration logic. From this point forward, generation-and-verification cycles are mechanically owned by the orchestrator rather than driven by hand.
Stage 6 — Generate the lockfile tool's lockfile. Using the resolver, fetcher, and lockfile tool generated in Stage 4, the consumer records the bootstrap state: which modules were resolved at which commits, which generator produced each implementation, which evals passed at which thresholds.
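The bootstrap state recorded in Stage 6 could be serialized along the following lines. The field names, the JSON encoding, and the placeholder values are all assumptions for illustration; the actual lockfile schema is defined by the lockfile module.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class LockEntry:
    module: str        # module name as it appears in the clone
    commit: str        # commit of the spec module that was used
    generator: str     # generator that produced the implementation
    evals_passed: int  # number of eval cases that passed
    evals_total: int   # number of eval cases that were run


def render_lockfile(entries: list) -> str:
    """Serialize entries in a stable order so repeated runs produce comparable files."""
    records = sorted((asdict(e) for e in entries), key=lambda r: r["module"])
    return json.dumps({"entries": records}, indent=2, sort_keys=True)


# Placeholder values only; real commits and generator ids come from the bootstrap run.
example = render_lockfile([
    LockEntry("commonsformat-parser", "<commit>", "<generator identifier>", 12, 12),
    LockEntry("commonsformat-fetcher", "<commit>", "<generator identifier>", 8, 8),
])
```

Sorting the entries and the keys keeps the file stable across runs, which makes it easier to compare two bootstraps.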
Stage 7 — Generate the verifier. With the lockfile available, the consumer generates the verifier. The verifier reads the lockfile and confirms whether the deployment tier requirements are met for each generated tool. This is the last generation stage of the bootstrap chain; the remaining stage only runs what has been built.
Stage 8 — Verify D1. The verifier examines the lockfile and the eval results. If all production tools pass their respective eval suites at the D1 thresholds, the toolchain is D1-conformant.
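The Stage 8 check reduces to comparing recorded eval results against a threshold for each production tool. In this sketch the tool labels, the pass-rate representation, and the `threshold` argument are assumptions; the actual D1 thresholds are set by the format, not here.

```python
# Hypothetical labels for the production tools the verifier examines.
PRODUCTION_TOOLS = (
    "parser", "eval-runner", "resolver", "fetcher",
    "lockfile", "orchestrator", "verifier",
)


def d1_gaps(pass_rates: dict, threshold: float) -> list:
    """Return one message per tool that lacks results or falls below the threshold."""
    gaps = []
    for tool in PRODUCTION_TOOLS:
        rate = pass_rates.get(tool)
        if rate is None:
            gaps.append(f"{tool}: no eval results recorded")
        elif rate < threshold:
            gaps.append(f"{tool}: pass rate {rate:.2f} below {threshold:.2f}")
    return gaps
```

An empty list corresponds to a D1-conformant toolchain under the assumed threshold; anything else names the specific gaps to fix.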
The consumer now has a working Commons Format installation.
Recommended bootstrap target language
The format itself is language-agnostic. Consumers may bootstrap a toolchain in any language they prefer. However, choices vary substantially in how easy the bootstrap is, and the recommendation below exists so that a fresh consumer doesn't pick the hardest path by accident.
Go (1.21+) — recommended. Go's standard library covers most of the operations bootstrap tools need: HTTP fetching, SHA-256 hashing, JSON serialization, file system operations, and Git interaction through os/exec. TOML parsing is the exception and comes from a well-known third-party library. Go produces single-binary tools that ship cleanly without an interpreter or runtime dependency. From a security architecture perspective, a Go tool is closer to "a production tool you deploy" than an interpreter-hosted tool would be; the bootstrap toolchain is a long-lived artifact in a consumer's environment and benefits from this property.
Python (3.11+) — supported. Python has stdlib coverage for HTTP and SHA, with tomllib providing TOML parsing in 3.11+. Bootstrap in Python is fast and well-understood. Disadvantages: it requires a Python interpreter at deployment time, runs slower than Go, and adds the interpreter itself to the trusted base, a substantial addition that the security architecture argues against for production tooling. Fine for experimental or learning bootstrap; less ideal as the long-term toolchain.
Rust — supported but harder. Rust's standard library lacks HTTP and TOML, which means bootstrap implementations either depend on third-party crates (acceptable but adds dependencies to the trusted base) or implement these protocols manually (substantial work). The format does not preclude Rust, but a fresh consumer choosing Rust as their first bootstrap should expect more work than Go or Python.
Other languages. Any language with HTTP, TOML, SHA-256, and file I/O can host a Commons Format toolchain. The format makes no language-specific assumptions. Consumers comfortable with their chosen language should pick it.
The recommendation is just guidance for the first-time bootstrap. A consumer who has already bootstrapped in one language can use that toolchain to bootstrap a second toolchain in any other language, with the second bootstrap being mechanical because the first toolchain handles parsing, resolution, and verification.
The CLI prompt
The bootstrap procedure can be embodied as a prompt that a typical generator CLI tool executes. The prompt below is suitable for Claude Code, the OpenAI CLI, or any equivalent tool that can read files, write files, and invoke a generator iteratively.
You are bootstrapping Commons Format. Follow this procedure exactly:
1. Read /commonsformat-format/commonsformat.md. This is the format
   specification. You will refer back to it constantly.
2. Read /commonsformat-bootstrap/commonsformat.md. This is the procedure
   you are executing now.
3. For each module below, in order:
     commonsformat-parser-bootstrap
     commonsformat-parser
     commonsformat-eval-runner
     commonsformat-resolver
     commonsformat-fetcher
     commonsformat-lockfile
     commonsformat-orchestrator
     commonsformat-verifier
   Do the following:
   a. Read the module's commonsformat.md.
   b. If the module declares depends_on, read each dependency's
      commonsformat.md and merge the contracts (intent, constraints,
      interface, evals).
   c. Generate an implementation in <target language> that
      satisfies the merged contract.
   d. Generate a test harness in <target language> that runs the
      merged eval suite against the implementation.
   e. Run the test harness. If any cases fail, examine the
      failures, refine the implementation, and rerun. Iterate
      until all cases pass.
   f. Save the implementation and test results to disk.
4. Once all modules have conformant implementations:
   a. Use the resolver, fetcher, and lockfile tool to generate
      a lockfile recording the bootstrap state.
   b. Use the verifier to check D1 compliance.
   c. Report the verification result.
5. If all modules pass D1 verification, the bootstrap is complete.
   Output the lockfile path and the toolchain location.
The target language is: <target language>
The generator to use is: <generator identifier>
A consumer running this prompt with a capable CLI tool gets a working Commons Format installation in 15 to 60 minutes depending on generator speed and iteration count. The prompt is reproducible in outcome rather than in output: the same generator on the same modules should again produce a conformant toolchain, even though the specific generated code will differ.
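Steps 3c through 3e of the prompt form a generate, test, refine loop. The sketch below shows that control flow only. The `generate` and `run_evals` callables are hypothetical stand-ins for the generator and the test harness, and the `max_attempts` cap is an added assumption (the prompt itself says to iterate until all cases pass).

```python
def iterate_until_conformant(generate, run_evals, max_attempts=5):
    """Regenerate with failure feedback until the eval suite reports no failures."""
    feedback = []  # failures from the previous attempt, empty on the first call
    for attempt in range(1, max_attempts + 1):
        implementation = generate(feedback)
        feedback = run_evals(implementation)
        if not feedback:
            return implementation, attempt
    return None, max_attempts  # attempt budget spent without reaching conformance


# Stand-ins for illustration: the third generated version is the first to pass.
def fake_generate(failures):
    fake_generate.calls += 1
    return f"implementation-v{fake_generate.calls}"


fake_generate.calls = 0


def fake_run_evals(implementation):
    return [] if implementation.endswith("v3") else ["case-1 failed"]


implementation, attempts = iterate_until_conformant(fake_generate, fake_run_evals)
```

Passing the previous failures back into `generate` mirrors the "examine the failures, refine the implementation" instruction; a real run would turn them into a refinement prompt.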
What this module does NOT do
- Provide reference implementations. The bootstrap is described, not shipped as code. Each consumer's bootstrap produces a toolchain unique to their target language and chosen generator.
- Mandate a specific CLI tool. The prompt is generic; any generator CLI capable of file I/O and iterative execution can run it.
- Specify the consumer's iteration strategy in detail. Some generators benefit from explicit instruction to fix specific failures; others work better with broad prompts. The bootstrap description is pragmatic; the consumer adapts.
- Verify generators themselves. The format assumes the consumer chose a generator they trust. Verification of generator trustworthiness is out of scope for this module; multi-generator conformance at higher tiers is the format's mitigation.
Threat model
The bootstrap process is the riskiest moment in a Commons Format consumer's relationship with the format. Threats:
- A compromised generator producing malicious tools that pass the eval suites because the eval suites didn't anticipate the malice. Mitigation at bootstrap: run the bootstrap a second time with a different generator and compare results. Mitigation in ongoing operation: multi-generator conformance at D2+ for any production deployment.
- A compromised module clone (the consumer fetched tampered modules from a malicious source). Mitigation: verify module signatures or content hashes against a trusted reference before starting the bootstrap. The format's lockfile mechanism handles ongoing verification but cannot bootstrap itself.
- A consumer's local environment compromised before the bootstrap begins. Out of scope; if the consumer's machine is compromised, no software-level mitigation in this module is sufficient.
- The bootstrap procedure itself being subtly wrong, leading the consumer to produce non-conformant tools they trust. Mitigation: the eval suites at each stage catch non-conformance. A conformant tool that secretly does something extra is the generator-compromise threat above; mitigation is multi-generator conformance.
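For the compromised-clone threat, checking content hashes before starting can be as simple as hashing the module tree and comparing the result with a digest obtained through a trusted channel. The sketch below is one possible scheme, not the format's own: the digest construction, the function names, and the choice to skip the `.git` directory are assumptions.

```python
import hashlib
from pathlib import Path


def tree_digest(root: Path) -> str:
    """Hash every file's relative path and content in a stable order."""
    files = sorted(
        (p.relative_to(root).as_posix(), p)
        for p in root.rglob("*")
        if p.is_file()
    )
    digest = hashlib.sha256()
    for relative, path in files:
        if relative == ".git" or relative.startswith(".git/"):
            continue  # assumption: Git metadata is not part of the module content
        digest.update(relative.encode("utf-8") + b"\0")
        digest.update(hashlib.sha256(path.read_bytes()).digest())
    return digest.hexdigest()


def clone_matches(root: Path, trusted_digest: str) -> bool:
    """Compare the local clone against a digest obtained from a trusted reference."""
    return tree_digest(root) == trusted_digest
```

The check is only as good as the channel the trusted digest arrives through; fetching the digest from the same source as the modules would defeat it.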
The bootstrap is D0 by nature — the consumer is operating on trusted seed material in their own environment. Any production deployment downstream of the bootstrap requires its own appropriate-tier verification, separate from the bootstrap itself.
Verification
This module's eval suite tests the end state of a successful bootstrap. It does not test the bootstrap procedure itself; it tests whether the procedure's outputs constitute a working Commons Format installation.
A consumer who has completed the bootstrap can run this module's eval suite as a final acceptance check. If the cases pass, the bootstrap was successful. If they fail, specific gaps are identified.
Tooling that wraps the bootstrap procedure (a CLI command, ultimately) ends by running this eval suite and reports the result to the consumer.