
Change Control

How Critique v4 governs the merge boundary with Change Passports, agent risk scoring, evidence contracts, merge policy as code, verified repair, and a learning loop.

Critique v4 is an AI Change Control Platform. Instead of only leaving review comments, it sits at the merge boundary and records a durable system of record for every pull request: where the change came from, how risky it is, what evidence backs a blocking decision, whether policy allows the merge, whether a repair was verified, and what past incidents taught the rules.

Reviews still run. When policy allows, Critique executes AI-powered sandbox reviews (and other review paths) and stores the result as an evidence run linked to the passport. v4 does not remove review — it subordinates review to control: who gets a full pass, why it runs, what must not merge, and what happened last time.

Deep dive: Critique v4 launch essay. Marketing gate overview: critique.sh/checkpoint. Operator surfaces: Dashboard → Passports and Dashboard → Control Board.

Not a comment bot

v4 is built to be a management layer for AI-driven code change, not another single-model PR commenter. The passport is the product; the thread is not.

Check names are stable

The pre-merge gate is presented as the Agent Firewall in the UI, but it still publishes as the Critique / Checkpoint GitHub check. Merge policy publishes as Critique / Merge Policy. Existing branch-protection rules keep working without changes.

The control loop

flowchart LR
  Agent[Agent or human opens PR] --> Passport[Change Passport opens]
  Passport --> Risk[Provenance + risk score]
  Risk --> Gate[Agent Firewall gate]
  Gate --> Review[Evidence-backed review]
  Review --> Policy[Merge policy: dry-run to enforce]
  Policy --> Repair[Verified repair if needed]
  Repair --> Merge[Merge]
  Merge --> Memory[Incident learnings feed back into rules]
  Memory --> Risk

v3 vs v4 (quick orientation)

| | v3 | v4 |
| --- | --- | --- |
| Center of gravity | Review run → findings → PR comment | Change Passport per PR |
| Dashboard home | Review runs inbox | Passports queue |
| Operator question | What did the model say? | May this change merge? |
| Reviews | The product | Still run, as evidence runs on the passport |

Change Passports

A Change Passport is the PR-level record. The queue at /dashboard/passports lists every governed pull request with repo, author, a source badge, gate outcome, risk, verdict, evidence status, merge policy, proof, and memory. Filter by repo, risk, state, and verdict.

The detail view at /dashboard/[owner]/[repo]/pulls/[number] renders the full passport:

| Section | What it shows |
| --- | --- |
| Summary | State, head SHA, latest snapshot |
| Provenance | Source kind, vendor, confidence, requested by |
| Risk | Band, score, and reasons |
| Gate | Agent Firewall events and check runs |
| Evidence runs | Commit-level reviews on the PR |
| Merge permission | Latest policy outcome, conclusion, overrides |
| Remedy proof | Verified-repair bundles |
| Memory | Finding occurrences and active suppressions |
| Incidents | Linked provider events and learnings |
| Timeline | A single chronological feed across all of the above |

When no immutable snapshot exists yet, the passport infers provenance live from PR signals and labels it as heuristic so the record is never blank while the first review runs.
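The fallback described above can be sketched as follows. This is a minimal illustration, not Critique's implementation: the `Provenance` shape, the `resolve_provenance` name, the confidence values, and the use of the author login as the only PR signal are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Provenance:
    source_kind: str   # illustrative values: "agent" or "human"
    confidence: float  # 0.0 to 1.0
    heuristic: bool    # True when inferred live rather than read from a snapshot


def resolve_provenance(snapshot: Optional[Provenance], author_login: str) -> Provenance:
    """Prefer the immutable snapshot; otherwise infer from PR signals and flag it."""
    if snapshot is not None:
        return snapshot
    # Assumed heuristic: GitHub App accounts have logins ending in "[bot]".
    is_bot = author_login.endswith("[bot]")
    return Provenance(
        source_kind="agent" if is_bot else "human",
        confidence=0.6 if is_bot else 0.4,  # placeholder confidences
        heuristic=True,
    )
```

The point of the `heuristic` flag is that a reader of the passport can always tell an inferred value from a recorded one.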

Agent risk scoring

Every review run persists a risk score, a band (low, medium, high, critical), and human-readable reasons. Risk flows into the passport queue, the passport detail, and the control-room overview, so it is a first-class column you can filter and sort, not a sentence buried in review prose.
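As a rough sketch of what "a first-class column you can filter and sort" implies, the record below carries score, band, and reasons together. The 0 to 100 scale, the 25/50/75 cut-offs, and every name here are hypothetical; the source does not document how Critique computes bands.

```python
from dataclasses import dataclass
from enum import IntEnum


class RiskBand(IntEnum):
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


@dataclass(frozen=True)
class RiskAssessment:
    score: int                     # assumed 0-100 scale
    band: RiskBand
    reasons: tuple[str, ...] = ()  # human-readable reasons


def band_for_score(score: int) -> RiskBand:
    """Map a score to a band using placeholder cut-offs (25 / 50 / 75)."""
    if score >= 75:
        return RiskBand.CRITICAL
    if score >= 50:
        return RiskBand.HIGH
    if score >= 25:
        return RiskBand.MEDIUM
    return RiskBand.LOW


def assess(score: int, reasons: list[str]) -> RiskAssessment:
    return RiskAssessment(score=score, band=band_for_score(score), reasons=tuple(reasons))


def at_least(items: list[RiskAssessment], floor: RiskBand) -> list[RiskAssessment]:
    """Keep items at or above a minimum band, riskiest first."""
    return sorted((r for r in items if r.band >= floor), key=lambda r: r.score, reverse=True)
```

For example, `at_least(queue, RiskBand.HIGH)` would return only high and critical entries, ordered by score.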

Evidence Contract

The Evidence Contract normalizes review artifacts into a consistent shape and exposes:

  • the blocking decision for the run, and
  • the evidence linked to each finding.

The rule is simple: if a change is blocked, the block must point at something. Legacy review artifacts are normalized by the same accessors, so older runs still render under the contract. Drill into the evidence from any review-run page.
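The "a block must point at something" rule can be expressed as a simple check. The sketch below assumes a hypothetical `Finding` and `EvidenceRun` shape; the real contract's field names are not documented here.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    title: str
    blocking: bool
    evidence: list[str] = field(default_factory=list)  # e.g. file paths or log excerpts


@dataclass
class EvidenceRun:
    findings: list[Finding]

    @property
    def blocked(self) -> bool:
        """The run blocks if any finding is blocking."""
        return any(f.blocking for f in self.findings)


def contract_violations(run: EvidenceRun) -> list[str]:
    """Return the titles of blocking findings that cite no evidence."""
    return [f.title for f in run.findings if f.blocking and not f.evidence]
```

An empty result from `contract_violations` means every blocking finding in the run is backed by at least one piece of evidence.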

Evidence runs (AI-powered sandbox review)

An evidence run is a commit-level review linked from the passport. Critique still performs deep review when gate and review policy allow — typically via managed sandbox execution, model routing, scout plus specialist lanes, and a lead verdict. The UI label changed from “the product” to “evidence on the passport,” but the engineering work of reviewing code did not stop.

| Operator intent | Where it lives in v4 |
| --- | --- |
| Should we spend review on this PR? | Gate + risk on the passport |
| What did review find? | Evidence run → Evidence Contract |
| May it merge? | Merge policy on the passport |
| Who opened it? | Provenance on the passport |

Use Dashboard → Review runs for commit-level drill-down; use Passports for PR-level control.

Merge policy as code

Merge policy is a schema, an evaluator, and the published Critique / Merge Policy check. Configure it in the dashboard (Control Board → Change policy) or in a repo file.

Modes

| Mode | Behavior |
| --- | --- |
| DRY_RUN | Evaluate and record the decision without affecting the check conclusion |
| WARN | Surface a neutral or warning conclusion without failing the gate |
| ENFORCE | Fail the check when policy is not satisfied |
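One way to read the three modes is as a mapping from a policy evaluation to a GitHub check-run conclusion. The sketch below is an interpretation, not Critique's evaluator: `check_conclusion` is a hypothetical name, and publishing `success` for an unsatisfied policy in DRY_RUN is an assumption about what "without affecting the check conclusion" means in practice.

```python
from enum import Enum


class Mode(Enum):
    DRY_RUN = "DRY_RUN"
    WARN = "WARN"
    ENFORCE = "ENFORCE"


def check_conclusion(mode: Mode, satisfied: bool) -> str:
    """Map a policy evaluation to a GitHub check-run conclusion string."""
    if satisfied:
        return "success"
    if mode is Mode.ENFORCE:
        return "failure"  # policy not satisfied: fail the check
    if mode is Mode.WARN:
        return "neutral"  # surface the problem without failing the gate
    # DRY_RUN (assumption): the decision is recorded elsewhere and the
    # published conclusion stays green.
    return "success"
```

Moving a repository from DRY_RUN to ENFORCE then changes only the published conclusion, not the evaluation itself.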

Repo-file policy

Critique reads policy from .critique/policy.yml, .critique/policy.yaml, or .critique/policy.json when present. Choose repo_file as the source on the Control Board to let the repository own its policy.
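A minimal sketch of the file lookup, given a listing of paths in the repository. The three paths come from the text above; the precedence order when more than one file exists and the `select_policy_path` helper are assumptions for illustration.

```python
from typing import Iterable, Optional

# The three paths named above; checking them in this order is an assumption.
POLICY_PATHS = (
    ".critique/policy.yml",
    ".critique/policy.yaml",
    ".critique/policy.json",
)


def select_policy_path(repo_files: Iterable[str]) -> Optional[str]:
    """Return the first candidate policy path present in the repo, or None."""
    present = set(repo_files)
    for candidate in POLICY_PATHS:
        if candidate in present:
            return candidate
    return None
```

A `None` result corresponds to the "when present" condition not being met, in which case the dashboard-configured policy would be the only source.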

Operator overrides

An operator can override a policy decision from the passport. The override records provenance (who, when, and why) and can patch the GitHub check-run status so branch protection reflects the call you made. Override errors (for example, a failed check-run update) are surfaced on the passport.
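The override flow can be sketched as "record first, then try to patch the check run, and keep any error on the record". The `Override` shape, the `apply_override` name, and the injected `update_check_run` callable are hypothetical stand-ins; the callable represents whatever actually updates the GitHub check run.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class Override:
    actor: str                   # who
    reason: str                  # why
    at: datetime                 # when
    conclusion: str              # conclusion the operator wants published
    error: Optional[str] = None  # set when the check-run update fails


def apply_override(actor: str, reason: str, conclusion: str,
                   update_check_run: Callable[[str], None]) -> Override:
    """Record the override, then attempt to patch the check run."""
    record = Override(actor, reason, datetime.now(timezone.utc), conclusion)
    try:
        update_check_run(conclusion)
    except Exception as exc:  # keep the record and surface the failure
        record.error = str(exc)
    return record
```

Keeping the error on the record, rather than raising, is what lets the passport show both the operator's decision and the fact that branch protection may not yet reflect it.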

Verified repair (Remedy proof)

When Remedy fixes a finding, v4 stores a proof bundle on the attempt:

  • a patch hash,
  • validation results,
  • the push or export mode, and
  • a verification linkage back to the review run that confirmed the fix.

A repair is recorded as done because the proof says so, not because an agent claims it is.
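A proof bundle with the four parts listed above might look like the sketch below. The field names, the choice of SHA-256 for the patch hash, and the `is_verified` rule are assumptions; the source only states which pieces the bundle contains.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProofBundle:
    patch_hash: str                 # hex digest of the patch contents
    validations: dict[str, bool]    # validation name -> passed
    delivery_mode: str              # "push" or "export"
    verified_by_run: Optional[str]  # id of the review run that confirmed the fix


def build_bundle(patch: bytes, validations: dict[str, bool],
                 delivery_mode: str, verified_by_run: Optional[str]) -> ProofBundle:
    digest = hashlib.sha256(patch).hexdigest()  # SHA-256 is an assumed choice
    return ProofBundle(digest, dict(validations), delivery_mode, verified_by_run)


def is_verified(bundle: ProofBundle) -> bool:
    """Done only if checks ran, all passed, and a review run confirmed the fix."""
    return (bool(bundle.validations)
            and all(bundle.validations.values())
            and bundle.verified_by_run is not None)
```

Under this sketch, an attempt with passing validations but no linked review run is still not counted as a verified repair.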

Findings memory and incident learnings

  • Findings memory surfaces suppressions and a feedback ledger on the Control Board, with revoke and expire actions.
  • Incident feedback ingests events from Sentry, Linear, Jira, Vercel, and generic or manual sources, links them to the passport, and drafts learnings you can promote into control rules or dismiss.

This closes the loop: what breaks in production teaches the gate what to catch next time.
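The revoke and expire actions on suppressions suggest a small state check like the one below. The `Suppression` fields and the rule that a revocation takes effect immediately are assumptions made for this sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Suppression:
    finding_key: str
    expires_at: Optional[datetime] = None  # None means no expiry
    revoked_at: Optional[datetime] = None

    def revoke(self, now: datetime) -> None:
        """Mark the suppression as revoked from this moment on."""
        self.revoked_at = now

    def is_active(self, now: datetime) -> bool:
        """Active unless revoked at or before `now`, or past its expiry."""
        if self.revoked_at is not None and self.revoked_at <= now:
            return False
        return self.expires_at is None or now < self.expires_at
```

A gate would consult `is_active` before muting a finding, so an expired or revoked suppression lets the finding surface again.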

The Control Board

The Control Board at /dashboard/control is one operator surface with five tabs:

| Tab | Purpose |
| --- | --- |
| Agent Firewall | Source, path, dependency, workflow, auth, validation, and secret-handling rules |
| Change policy | Merge policy status and inline merge-permission controls |
| Delivery | Webhook health, replay, sync, provider status, passport backfill coverage |
| Memory | Findings memory, suppressions, and feedback |
| Learnings | The incident learning queue and actions |

The legacy routes /dashboard/change-gate and /dashboard/checkpoint redirect into the Control Board gate. The legacy automation editor remains directly accessible for anyone who still needs it.

Relationship to the rest of the platform

| Layer | Question it answers |
| --- | --- |
| Agent Firewall | Should we spend review effort on this change at all? |
| Review policy | How hard should we judge findings? See Policy fields. |
| Merge policy | Is this change allowed to merge? |
| Remedy proof | Was the repair actually verified? |
| Memory and learnings | What should the rules catch next time? |