Change Control
How Critique v4 governs the merge boundary with Change Passports, agent risk scoring, evidence contracts, merge policy as code, verified repair, and a learning loop.
Critique v4 is an AI Change Control Platform. Instead of only leaving review comments, it sits at the merge boundary and records a durable system of record for every pull request: where the change came from, how risky it is, what evidence backs a blocking decision, whether policy allows the merge, whether a repair was verified, and what past incidents taught the rules.
Reviews still run. When policy allows, Critique executes AI-powered sandbox reviews (and other review paths) and stores the result as an evidence run linked to the passport. v4 does not remove review — it subordinates review to control: who gets a full pass, why it runs, what must not merge, and what happened last time.
Deep dive: Critique v4 launch essay. Marketing gate overview: critique.sh/checkpoint. Operator surfaces: Dashboard → Passports and Dashboard → Control Board.
Not a comment bot
v4 is built to be the best AI management layer for code change, not one more single-model PR commenter. The passport is the product; the thread is not.
Check names are stable
The pre-merge gate is presented as the Agent Firewall in the UI, but it still publishes as the Critique / Checkpoint GitHub check. Merge policy publishes as Critique / Merge Policy. Existing branch-protection rules keep working without changes.
The control loop
flowchart LR
Agent[Agent or human opens PR] --> Passport[Change Passport opens]
Passport --> Risk[Provenance + risk score]
Risk --> Gate[Agent Firewall gate]
Gate --> Review[Evidence-backed review]
Review --> Policy[Merge policy: dry-run to enforce]
Policy --> Repair[Verified repair if needed]
Repair --> Merge[Merge]
Merge --> Memory[Incident learnings feed back into rules]
Memory --> Risk
v3 vs v4 (quick orientation)
| | v3 | v4 |
|---|---|---|
| Center of gravity | Review run → findings → PR comment | Change Passport per PR |
| Dashboard home | Review runs inbox | Passports queue |
| Operator question | What did the model say? | May this change merge? |
| Reviews | The product | Still run — evidence runs on the passport |
Change Passports
A Change Passport is the PR-level record. The queue at /dashboard/passports lists every governed pull request with repo, author, a source badge, gate outcome, risk, verdict, evidence status, merge policy, proof, and memory. Filter by repo, risk, state, and verdict.
The detail view at /dashboard/[owner]/[repo]/pulls/[number] renders the full passport:
| Section | What it shows |
|---|---|
| Summary | State, head SHA, latest snapshot |
| Provenance | Source kind, vendor, confidence, requested by |
| Risk | Band, score, and reasons |
| Gate | Agent Firewall events and check runs |
| Evidence runs | Commit-level reviews on the PR |
| Merge permission | Latest policy outcome, conclusion, overrides |
| Remedy proof | Verified-repair bundles |
| Memory | Finding occurrences and active suppressions |
| Incidents | Linked provider events and learnings |
| Timeline | A single chronological feed across all of the above |
When no immutable snapshot exists yet, the passport infers provenance live from PR signals and labels it as heuristic so the record is never blank while the first review runs.
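The fallback described above can be pictured as a small resolver: use the stored snapshot when one exists, otherwise infer provenance from PR signals and flag the result as heuristic. This is only a sketch. The `Provenance` fields, the `resolve_provenance` function, the placeholder confidence value, and the choice of the author login as the signal are all assumptions for illustration, not Critique's actual data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Provenance:
    source_kind: str   # illustrative values: "agent" or "human"
    confidence: float  # 0.0 to 1.0
    heuristic: bool    # True when inferred live rather than read from a snapshot


def resolve_provenance(snapshot: Optional[Provenance], pr_author: str) -> Provenance:
    """Prefer the immutable snapshot; otherwise infer from PR signals."""
    if snapshot is not None:
        return snapshot
    # Assumed signal: GitHub App accounts have logins ending in "[bot]".
    looks_automated = pr_author.endswith("[bot]")
    return Provenance(
        source_kind="agent" if looks_automated else "human",
        confidence=0.5,  # placeholder value, not a documented default
        heuristic=True,  # labels the record as inferred, so it is never blank
    )
```

Once a snapshot is written, the same call returns it unchanged, so the heuristic label disappears without any special handling by the caller.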
Agent risk scoring
Every review run persists a risk score, a band (low, medium, high, critical), and human-readable reasons. Risk flows into the passport queue, the passport detail, and the control-room overview, so it is a first-class column you can filter and sort, not a sentence buried in review prose.
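To make "first-class column" concrete, here is a minimal sketch of a risk record whose band is derived from the score, which makes filtering and sorting trivial. The four band names come from the text above; the 0 to 100 scale, the cut-offs, and the names `RiskBand`, `band_for`, and `RiskRecord` are assumptions.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class RiskBand(IntEnum):
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


# Assumed cut-offs on a 0-100 scale; the real thresholds are not documented here.
_BAND_FLOORS = [(75, RiskBand.CRITICAL), (50, RiskBand.HIGH), (25, RiskBand.MEDIUM)]


def band_for(score: int) -> RiskBand:
    """Return the highest band whose floor the score reaches."""
    for floor, band in _BAND_FLOORS:
        if score >= floor:
            return band
    return RiskBand.LOW


@dataclass
class RiskRecord:
    score: int
    reasons: list[str]
    band: RiskBand = field(init=False)  # derived from score, never set by hand

    def __post_init__(self) -> None:
        self.band = band_for(self.score)


records = [RiskRecord(30, ["touches CI config"]), RiskRecord(90, ["edits auth code"])]
high_or_worse = [r for r in records if r.band >= RiskBand.HIGH]
```

Using an ordered enum rather than a free-form string is what lets a queue sort by band and filter with a simple comparison.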
Evidence Contract
The Evidence Contract normalizes review artifacts into a consistent shape and exposes:
- the blocking decision for the run, and
- the evidence linked to each finding.
The rule is simple: if a change is blocked, the block must point at something. Legacy review artifacts are normalized by the same accessors, so older runs still render under the contract. Drill into the evidence from any review-run page.
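One way to read the contract is as a pair of accessors over a review artifact: one that normalizes findings from either a current or a legacy shape, and one that derives the blocking decision and refuses a block with no evidence behind it. The sketch below assumes hypothetical artifact keys (`findings`, `issues`, `severity`, `snippet`); the real artifact schema is not described in this document.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Finding:
    title: str
    blocking: bool
    evidence: tuple[str, ...]  # references such as file locations or log excerpts


def normalize_findings(artifact: dict[str, Any]) -> list[Finding]:
    """Read a hypothetical current shape ("findings") or legacy shape ("issues")."""
    raw = artifact.get("findings") or artifact.get("issues") or []
    findings = []
    for item in raw:
        # Legacy items are assumed to carry a single "snippet" instead of a list.
        evidence = item.get("evidence") or ([item["snippet"]] if item.get("snippet") else [])
        blocking = bool(item.get("blocking", item.get("severity") == "blocker"))
        findings.append(Finding(item.get("title", ""), blocking, tuple(evidence)))
    return findings


def blocking_decision(findings: list[Finding]) -> bool:
    """True if the run blocks; every blocking finding must point at evidence."""
    blockers = [f for f in findings if f.blocking]
    unsupported = [f.title for f in blockers if not f.evidence]
    if unsupported:
        raise ValueError(f"blocking findings without evidence: {unsupported}")
    return bool(blockers)
```

Raising on an unsupported block, rather than silently downgrading it, is a design choice made for this sketch; it keeps the "a block must point at something" rule impossible to bypass by accident.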
Evidence runs (AI-powered sandbox review)
An evidence run is a commit-level review linked from the passport. Critique still performs deep review when gate and review policy allow — typically via managed sandbox execution, model routing, scout plus specialist lanes, and a lead verdict. The UI label changed from “the product” to “evidence on the passport,” but the engineering work of reviewing code did not stop.
| Operator intent | Where it lives in v4 |
|---|---|
| Should we spend review on this PR? | Gate + risk on the passport |
| What did review find? | Evidence run → Evidence Contract |
| May it merge? | Merge policy on the passport |
| Who opened it? | Provenance on the passport |
Use Dashboard → Review runs for commit-level drill-down; use Passports for PR-level control.
Merge policy as code
Merge policy is a schema, an evaluator, and the published Critique / Merge Policy check. Configure it in the dashboard (Control Board → Change policy) or in a repo file.
Modes
| Mode | Behavior |
|---|---|
| DRY_RUN | Evaluate and record the decision without affecting the check conclusion |
| WARN | Surface a neutral or warning conclusion without failing the gate |
| ENFORCE | Fail the check when policy is not satisfied |
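The three modes can be sketched as a mapping from an evaluation result to a check conclusion. The strings `success`, `neutral`, and `failure` are standard GitHub check-run conclusion values. The function name and the choice to report `success` for an unsatisfied policy in DRY_RUN are assumptions; the table only says that DRY_RUN does not affect the conclusion.

```python
from enum import Enum


class PolicyMode(Enum):
    DRY_RUN = "DRY_RUN"
    WARN = "WARN"
    ENFORCE = "ENFORCE"


def check_conclusion(mode: PolicyMode, satisfied: bool) -> str:
    """Map a policy evaluation to a GitHub check-run conclusion."""
    if satisfied:
        return "success"
    if mode is PolicyMode.ENFORCE:
        return "failure"  # the only mode that can fail the check
    if mode is PolicyMode.WARN:
        return "neutral"  # visible on the PR, but does not fail the gate
    # DRY_RUN: the decision is recorded elsewhere; the conclusion is unaffected
    # (assumed here to mean it stays "success").
    return "success"
```

Under these assumptions, moving a repository from DRY_RUN to WARN to ENFORCE changes only what an unsatisfied policy does to the check, never how the policy itself is evaluated.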
Repo-file policy
Critique reads policy from .critique/policy.yml, .critique/policy.yaml, or .critique/policy.json when present. Choose repo_file as the source on the Control Board to let the repository own its policy.
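As an illustration of the lookup, the sketch below checks the three documented paths and returns the first one that exists. The paths are taken from the text above; the function name and the precedence (listing order, first match wins) are assumptions, since the document does not say what happens when more than one file is present.

```python
from pathlib import Path
from typing import Optional

# Paths from the documentation; the order is assumed to express precedence.
POLICY_CANDIDATES = (
    ".critique/policy.yml",
    ".critique/policy.yaml",
    ".critique/policy.json",
)


def find_policy_file(repo_root: Path) -> Optional[Path]:
    """Return the first policy file found under repo_root, or None."""
    for relative in POLICY_CANDIDATES:
        candidate = repo_root / relative
        if candidate.is_file():
            return candidate
    return None
```

A `None` result corresponds to "no repo file present", in which case a caller would presumably fall back to the policy configured in the dashboard.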
Operator overrides
An operator can override a policy decision from the passport. The override records provenance (who, when, and why) and can patch the GitHub check-run status so branch protection reflects the call you made. Override errors (for example, a failed check-run update) are surfaced on the passport.
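The override flow can be sketched as a record that always captures who, when, and why, plus a best-effort attempt to patch the check run whose failure is kept on the record instead of being swallowed. The names `PolicyOverride` and `apply_override`, and the `patch_check_run` callable standing in for the GitHub update, are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class PolicyOverride:
    actor: str   # who
    reason: str  # why
    at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))  # when
    error: Optional[str] = None  # populated if the check-run update fails


def apply_override(
    override: PolicyOverride,
    patch_check_run: Callable[[str], None],
) -> PolicyOverride:
    """Try to patch the check run; record any failure on the override itself."""
    try:
        patch_check_run("success")  # hypothetical stand-in for the GitHub API call
    except Exception as exc:
        override.error = str(exc)  # surfaced to the operator, not discarded
    return override
```

Keeping the error on the override record means the decision itself is preserved even when the update to GitHub does not go through.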
Verified repair (Remedy proof)
When Remedy fixes a finding, v4 stores a proof bundle on the attempt:
- a patch hash,
- validation results,
- the push or export mode, and
- a verification linkage back to the review run that confirmed the fix.
A repair is recorded as done because the proof says so, not because an agent claims it is.
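A proof bundle with the four parts listed above might look like the following sketch. The field names, the use of SHA-256 for the patch hash, and the `is_verified` rule are assumptions chosen for illustration; the document does not specify the hash algorithm or the exact verification criteria.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProofBundle:
    patch_hash: str                  # hash of the applied patch
    validation_passed: bool          # summary of the validation results
    delivery_mode: str               # "push" or "export"
    verifying_run_id: Optional[str]  # review run that confirmed the fix, if any


def hash_patch(patch_text: str) -> str:
    """SHA-256 of the patch text (the algorithm is an assumption)."""
    return hashlib.sha256(patch_text.encode("utf-8")).hexdigest()


def is_verified(bundle: ProofBundle, patch_text: str) -> bool:
    """A repair counts as done only when every part of the proof holds."""
    return (
        bundle.patch_hash == hash_patch(patch_text)
        and bundle.validation_passed
        and bundle.verifying_run_id is not None
    )
```

The point of the sketch is that "done" is computed from the bundle: a bundle with no verifying run, or a hash that does not match the patch, is not verified regardless of what the agent reports.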
Findings memory and incident learnings
- Findings memory surfaces suppressions and a feedback ledger on the Control Board, with revoke and expire actions.
- Incident feedback ingests events from Sentry, Linear, Jira, Vercel, and generic or manual sources, links them to the passport, and drafts learnings you can promote into control rules or dismiss.
This closes the loop: what breaks in production teaches the gate what to catch next time.
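The promote-or-dismiss step can be modeled as a small state machine over a drafted learning. The states mirror the actions named above; the class names, the shape of the returned rule stub, and the rule that a learning can only leave the draft state once are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class LearningState(Enum):
    DRAFT = "draft"
    PROMOTED = "promoted"
    DISMISSED = "dismissed"


@dataclass
class Learning:
    provider: str  # e.g. "sentry", "linear", "jira", "vercel", "manual"
    summary: str
    state: LearningState = LearningState.DRAFT

    def _require_draft(self) -> None:
        if self.state is not LearningState.DRAFT:
            raise ValueError(f"learning already {self.state.value}")

    def promote(self) -> dict:
        """Turn the draft into a control-rule stub (hypothetical rule shape)."""
        self._require_draft()
        self.state = LearningState.PROMOTED
        return {"rule_source": self.provider, "description": self.summary}

    def dismiss(self) -> None:
        self._require_draft()
        self.state = LearningState.DISMISSED
```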
The Control Board
The Control Board at /dashboard/control is one operator surface with five tabs:
| Tab | Purpose |
|---|---|
| Agent Firewall | Source, path, dependency, workflow, auth, validation, and secret-handling rules |
| Change policy | Merge policy status and inline merge-permission controls |
| Delivery | Webhook health, replay, sync, provider status, passport backfill coverage |
| Memory | Findings memory, suppressions, and feedback |
| Learnings | The incident learning queue and actions |
The legacy /dashboard/change-gate and /dashboard/checkpoint URLs redirect to the Control Board gate. The legacy automation editor remains directly accessible for anyone who still needs it.
Relationship to the rest of the platform
| Layer | Question it answers |
|---|---|
| Agent Firewall | Should we spend review effort on this change at all? |
| Review policy | How hard should we judge findings? See Policy fields. |
| Merge policy | Is this change allowed to merge? |
| Remedy proof | Was the repair actually verified? |
| Memory and learnings | What should the rules catch next time? |