The engine
The engine is generic: it knows no physics directly — every assertion comes from a card. Two complementary checks run against the corpus.
Hypothesis cross-check
The shipping verification surface (MCP tool hypothesis_crosscheck) takes a
proposed new principle — a HypothesisCard, by id or inline — and verifies it
against the corpus and universal priors:
- dimensional analysis of the proposed relation: comparing the declared
dimension vectors, or, when the card supplies a formula (expr) plus
per-symbol dimensions (symbols), deriving the dimensions from the formula
itself and checking them against lhsDims,
- reference-corpus resolution: does it resolve against known cards,
- declared limit / conservation claims,
- derivedFrom link resolution.
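The formula-based dimensional check in the first bullet can be illustrated with a minimal sketch. The field names expr, symbols, and lhsDims come from the card description above; everything else is an assumption made for illustration: the function names, the representation of a dimension vector as a dict of base dimension to exponent, and the restriction to products, quotients, literal powers, and sums. It is not the engine's actual implementation.

```python
import ast

def _combine(a, b, sign):
    """Add (sign=+1) or subtract (sign=-1) exponent vectors, dropping zeros."""
    out = dict(a)
    for base, exp in b.items():
        out[base] = out.get(base, 0) + sign * exp
    return {base: exp for base, exp in out.items() if exp != 0}

def derive_dims(expr, symbols):
    """Derive a dimension vector from a formula string (hypothetical helper).

    symbols maps each symbol name to its dimension vector,
    e.g. {"v": {"L": 1, "T": -1}}. Numeric literals are dimensionless.
    """
    def walk(node):
        if isinstance(node, ast.Name):
            if node.id not in symbols:
                raise ValueError(f"no dimensions declared for symbol {node.id!r}")
            return dict(symbols[node.id])
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return {}
        if isinstance(node, ast.BinOp):
            if isinstance(node.op, ast.Pow):
                # The exponent must be a numeric literal (literal_eval raises otherwise).
                power = ast.literal_eval(node.right)
                if not isinstance(power, (int, float)):
                    raise ValueError("exponent must be a number")
                base_dims = walk(node.left)
                return {b: e * power for b, e in base_dims.items() if e * power != 0}
            left, right = walk(node.left), walk(node.right)
            if isinstance(node.op, ast.Mult):
                return _combine(left, right, +1)
            if isinstance(node.op, ast.Div):
                return _combine(left, right, -1)
            if isinstance(node.op, (ast.Add, ast.Sub)):
                # Terms of a sum must share the same dimensions.
                if left != right:
                    raise ValueError("sum of terms with different dimensions")
                return left
        raise ValueError("unsupported syntax in formula")
    return walk(ast.parse(expr, mode="eval").body)

# Illustrative card fragment: kinetic energy, E = 0.5 * m * v**2.
symbols = {"m": {"M": 1}, "v": {"L": 1, "T": -1}}
lhs_dims = {"M": 1, "L": 2, "T": -2}  # stands in for the card's lhsDims

consistent = derive_dims("0.5 * m * v**2", symbols) == lhs_dims
inconsistent = derive_dims("m * v", symbols) == lhs_dims
```

Here consistent is True and inconsistent is False: momentum (m * v) does not carry the dimensions of energy, so a card declaring it as one would fail the check.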
The tool refuses to fabricate: an unknown id returns a structured error listing the valid ids, not a guess.
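As a rough sketch of that behavior, a resolver might look like the following. The function name and the response fields (ok, error, requested, validIds) are hypothetical and chosen only for illustration; the actual shape of the tool's error payload is not specified here.

```python
def resolve_card(card_id, corpus):
    """Look up a card by id; never substitute a near match (hypothetical helper)."""
    if card_id in corpus:
        return {"ok": True, "card": corpus[card_id]}
    # Unknown id: report it and list what is valid instead of guessing.
    return {
        "ok": False,
        "error": "unknown_card_id",
        "requested": card_id,
        "validIds": sorted(corpus),
    }

# Illustrative corpus with made-up ids.
corpus = {"newton-second-law": {"kind": "principle"}, "hooke-law": {"kind": "principle"}}
result = resolve_card("hooke", corpus)
```

The near miss "hooke" is not silently mapped to "hooke-law"; the caller receives the list of valid ids and decides what to do.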
Validation envelopes (USCE)
Principle cards may declare validationEnvelopes, numerical bounds of the form
{ key: [min, max] }, plus expectedLimits and conventions. These are the
assertions the evaluation engine checks against a candidate output at runtime.
The richer a card’s envelopes and limits, the more deeply the engine can check
code that claims to implement it.
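An envelope check of this kind can be sketched in a few lines. The { key: [min, max] } shape is taken from the description above; the function name, the finding labels, and the example keys are assumptions for illustration, not the engine's real interface.

```python
def check_envelopes(output, envelopes):
    """Return (key, problem) pairs for values missing or outside their bounds."""
    findings = []
    for key, (lo, hi) in envelopes.items():
        if key not in output:
            findings.append((key, "missing"))
        elif not (lo <= output[key] <= hi):
            # NaN fails every comparison, so it is also reported here.
            findings.append((key, "out_of_range"))
    return findings

# Illustrative envelopes and candidate output with made-up keys.
envelopes = {"efficiency": [0.0, 1.0], "temperature_K": [0.0, 1.0e4]}
candidate = {"efficiency": 1.2, "temperature_K": 300.0}
findings = check_envelopes(candidate, envelopes)
```

With these inputs, findings contains a single entry for efficiency, since 1.2 lies outside [0.0, 1.0]. Bounds are treated as inclusive in this sketch; whether the engine does the same is not stated above.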
Severity
Checks do not collapse to a single number: each check reports a severity, and the overall result takes the worst one.
| CheckSeverity | Meaning |
|---|---|
| NONE | The assertion holds. |
| LOW | A soft concern. |
| MEDIUM | A real problem. |
| HIGH | A hard violation: zeroes the overall score even on a functional pass. |
That last row is the core thesis: in science, physical correctness gates
functional correctness. Passing the tests is not enough if a HIGH-severity
physical check fails.
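The worst-of rule and the HIGH gate can be sketched as follows. The four level names mirror the CheckSeverity table; the numeric ordering, the function names, and the choice to leave the functional score unchanged for NONE, LOW, and MEDIUM are assumptions made for illustration, since only the HIGH behavior is stated above.

```python
from enum import IntEnum

class CheckSeverity(IntEnum):
    # Ordered so that max() picks the worst severity.
    NONE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3

def overall_severity(severities):
    """The overall result takes the worst severity; no checks means NONE."""
    return max(severities, default=CheckSeverity.NONE)

def overall_score(functional_score, severities):
    """A HIGH physical violation zeroes the score, even on a functional pass."""
    if overall_severity(severities) == CheckSeverity.HIGH:
        return 0.0
    # Assumption: lower severities do not alter the score in this sketch.
    return functional_score

# All functional tests pass (1.0), but one physical check is a hard violation.
gated = overall_score(1.0, [CheckSeverity.NONE, CheckSeverity.LOW, CheckSeverity.HIGH])
passed = overall_score(1.0, [CheckSeverity.NONE, CheckSeverity.LOW])
```

Here gated is 0.0 and passed is 1.0, which is the gating behavior described above: a functional pass does not survive a HIGH-severity physical failure.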
Status
The cards corpus and the MCP server are usable today, as are both engines:
hypothesis_crosscheck for proposed principles, and usce_check /
run_usce_checks for the validation-envelope check on finished outputs.
USCE’s deeper time-series checks (causality, asymptotic decay) are on the
roadmap.