Invariant Research

Bad plans die cheaply.

AI agents burn expensive inference on invalid plans, repeated replanning, tool retries, and self-critique loops. Invariant moves failure detection out of the GPU loop and into deterministic CPU verification.

6 GPU calls avoided
45 tool calls avoided
12 invalid steps blocked
3 signed receipts

This is the agent application of the Invariant verification stack.
Generated is not verified. Same engine. Different evidence. Same receipt.

The Failure

The agent generated a plausible plan. The plan was invalid.

The scenario is a deployment workflow with 10 actions and explicit preconditions. The agent skipped a required artifact-signing step. The plan reads as correct in natural language, but it violates a structural precondition, and the verifier catches that before anything executes.

Agent's Proposed Plan (9 actions)
1. + create_release_branch
2. + build_artifact
3. X deploy_to_staging
4.   run_integration_tests
5.   create_backup
6.   approve_migration_plan
7.   update_schema
8.   deploy_to_production
9.   notify_customer
REJECTED at action 3: artifact_signed must be true (is false)

What Was Missing
1. + create_release_branch
2. + build_artifact
   ! sign_artifact (skipped)
3. + deploy_to_staging
4. + run_integration_tests
5. + create_backup
6. + approve_migration_plan
7. + update_schema
8. + deploy_to_production
9. + notify_customer
ACCEPTED: all 10 actions valid; goal state reached

The verifier stepped through the actions, checked preconditions against the evolving state, and rejected the plan at the first impossible transition. Validation time: 19.6 microseconds. 7 downstream actions blocked before any tool call or GPU replan.
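
The stepping procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not Invariant's implementation: the Action class, the validate function, the "accepted" verdict label, and every precondition and effect other than artifact_signed are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    name: str
    preconditions: dict = field(default_factory=dict)  # facts required before the step
    effects: dict = field(default_factory=dict)        # facts set by the step


def validate(plan, initial_state, goal):
    """Step through the plan and stop at the first unmet precondition."""
    state = dict(initial_state)
    for index, action in enumerate(plan):
        missing = {k: v for k, v in action.preconditions.items() if state.get(k) != v}
        if missing:
            return {
                "verdict": "contradicted",
                "failed_action": action.name,
                "failed_index": index,  # 0-based position in the plan
                "missing": missing,
                "items_checked": index + 1,
                "items_passed": index,
            }
        state.update(action.effects)  # apply effects before checking the next step
    unmet = {k: v for k, v in goal.items() if state.get(k) != v}
    verdict = "contradicted" if unmet else "accepted"  # "accepted" is an assumed label
    return {"verdict": verdict, "missing": unmet,
            "items_checked": len(plan), "items_passed": len(plan)}


# Hypothetical transition model; only the artifact_signed rule comes from the demo.
create = Action("create_release_branch", effects={"branch_created": True})
build = Action("build_artifact", {"branch_created": True}, {"artifact_built": True})
sign = Action("sign_artifact", {"artifact_built": True}, {"artifact_signed": True})
deploy = Action("deploy_to_staging", {"artifact_signed": True}, {"staging_deployed": True})

start = {"artifact_signed": False}
goal = {"staging_deployed": True}

rejected = validate([create, build, deploy], start, goal)
accepted = validate([create, build, sign, deploy], start, goal)
print(rejected["failed_action"], rejected["missing"])
print(accepted["verdict"])
```

Each step costs one dictionary lookup per precondition and one update per effect, so the check needs no model call and no tool call.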

The Receipt

A deterministic failure certificate, not an opinion.

The receipt is application/vnd.svr.receipt+json, cryptographically signed, Ed25519-verifiable, and cacheable. The platform does not need to ask another model whether this plan is good. It already has a structural proof that the plan is impossible.
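
Because validation is deterministic, the same plan checked against the same transition model always yields the same verdict, which is what makes a receipt safe to cache. How Invariant keys its cache is not described here; the sketch below shows one plausible approach using a hash of a canonical JSON serialization. The names plan_cache_key, lookup_or_store, and model_version are hypothetical, and signing is left out.

```python
import hashlib
import json


def plan_cache_key(actions, model_version):
    """Derive a stable key from the plan and an assumed transition-model version."""
    payload = json.dumps(
        {"model_version": model_version, "actions": list(actions)},
        sort_keys=True,            # fixed key order gives a canonical byte string
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


receipt_cache = {}


def lookup_or_store(actions, model_version, make_receipt):
    """Return a cached receipt, calling make_receipt only on a cache miss."""
    key = plan_cache_key(actions, model_version)
    if key not in receipt_cache:
        receipt_cache[key] = make_receipt()
    return receipt_cache[key]


calls = []


def fake_receipt():
    # Stand-in for running the verifier and signing the result.
    calls.append(1)
    return {"verdict": "contradicted", "failed_action": "deploy_to_staging"}


plan = ["create_release_branch", "build_artifact", "deploy_to_staging"]
first = lookup_or_store(plan, "v1", fake_receipt)
second = lookup_or_store(plan, "v1", fake_receipt)  # served from the cache
```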

SVR Receipt application/vnd.svr.receipt+json
verdict: "contradicted"
filing_safety: "BLOCKED"
failed_action: "deploy_to_staging"
failed_index: 2
missing: { "artifact_signed": true }
reason: "Plan rejected at action 2 (deploy_to_staging): violated precondition [artifact_signed=True]. 2 prior actions were valid; failure is structural, not stochastic."
items_checked: 3
items_passed: 2
items_failed: 1
wall_us: 19.6
method: "deterministic_algebraic"
gpu_required: false
parameters: 0
signature: "b7a49f8e...925ad90f" (Ed25519, 64 bytes)
signature_status: "VALID"

The Comparison

Three ways to handle an invalid plan.

The same invalid deployment plans, handled three ways. Figures are aggregated across three failure modes: missing artifact signature, premature production deploy, and missing backup before schema update.

Lane A: Baseline Agent
GPU calls: 9
Tool calls: 45
Diagnosis: 3
Replans: 3
Receipts: 0
Steps blocked: 0

Lane B: LLM Self-Check
GPU calls: 9
Tool calls: 27
Diagnosis: 3
Replans: 3
Receipts: 0
Steps blocked: 0

Lane C: Invariant
GPU calls: 3
Tool calls: 0
Diagnosis: 0
Replans: 0
Receipts: 3
Steps blocked: 12

Lane B (LLM self-check) shows the optimistic case where the critic catches the issue. When it misses, the numbers match the baseline. Invariant's 3 GPU calls are the planner itself, which still synthesizes. The verifier runs on CPU.
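
The headline savings follow from subtracting Lane C from Lane A. The short sketch below reproduces that arithmetic from the lane figures; the dictionary layout and the avoided helper are illustrative assumptions, not part of the demo.

```python
# Lane figures copied from the comparison above.
lanes = {
    "A": {"gpu_calls": 9, "tool_calls": 45, "receipts": 0, "steps_blocked": 0},
    "B": {"gpu_calls": 9, "tool_calls": 27, "receipts": 0, "steps_blocked": 0},
    "C": {"gpu_calls": 3, "tool_calls": 0, "receipts": 3, "steps_blocked": 12},
}


def avoided(reference, candidate, metric):
    """Calls the candidate lane saves relative to the reference lane."""
    return lanes[reference][metric] - lanes[candidate][metric]


gpu_avoided = avoided("A", "C", "gpu_calls")      # 9 - 3
tool_avoided = avoided("A", "C", "tool_calls")    # 45 - 0
print(gpu_avoided, "GPU calls avoided;", tool_avoided, "tool calls avoided")
```

Measured against Lane B's optimistic case instead, the same helper gives 6 GPU calls and 27 tool calls avoided.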

The Savings

Invariant rejected the plan before the first irreversible tool call.

6 GPU calls avoided
45 tool calls avoided
12 invalid steps blocked
3 signed receipts
0 GPU required

The Asymmetry

Synthesis is hard. Validation is cheap.

The agent may spend unbounded compute synthesizing a plan. But validating a proposed plan against a formal transition model is polynomial-time: check the preconditions, apply the effects, verify the goal. That gap creates the margin-recovery layer.

Plan Synthesis

PSPACE-complete

In the general case, finding a valid plan is computationally explosive. The agent explores, backtracks, retries, self-critiques, and burns GPU the whole way.

Plan Validation

Polynomial-time

Step through the actions. Check preconditions. Apply effects. Verify the goal. Deterministic. Reproducible. Cacheable. Runs on CPU in microseconds.

Invariant does not replace the agent. It gates the agent. The agent proposes. Invariant checks. Bad plans die cheaply. Good plans move forward with receipts.
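
The gating pattern can be sketched as a wrapper that refuses to run any tool until the whole plan has passed validation. This is an illustrative sketch under stated assumptions: gate, toy_validate, and toy_execute are hypothetical names, and the ordering rule stands in for a real transition model.

```python
def gate(plan, validate, execute):
    """Run tool calls only if the validator accepts the whole plan first."""
    verdict = validate(plan)
    if verdict["verdict"] != "accepted":
        return verdict, []  # rejected: no tool call is made
    return verdict, [execute(step) for step in plan]


def toy_validate(plan):
    # Stand-in rule: deploy_to_staging needs sign_artifact earlier in the plan.
    for index, step in enumerate(plan):
        if step == "deploy_to_staging" and "sign_artifact" not in plan[:index]:
            return {"verdict": "contradicted", "failed_index": index}
    return {"verdict": "accepted"}


tool_log = []


def toy_execute(step):
    # Stand-in for a real tool call; it only records that the step ran.
    tool_log.append(step)
    return f"{step}: done"


bad = ["create_release_branch", "build_artifact", "deploy_to_staging"]
good = ["create_release_branch", "build_artifact", "sign_artifact", "deploy_to_staging"]

bad_verdict, bad_results = gate(bad, toy_validate, toy_execute)
calls_after_bad = len(tool_log)  # the rejected plan triggered no tool calls
good_verdict, good_results = gate(good, toy_validate, toy_execute)
```

Keeping the validator and the executor as separate arguments means the planner can be swapped without touching the check.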

Run It Yourself

The demo is reproducible.

The deployment-precondition demo runs locally. One command validates all plans and produces signed receipts. Another prints the three-lane margin comparison.