Critique v5.1: We Hit Go, Walked Away for Five Hours, and Shipped While Reviewing Ourselves
v5.1.0 tightens the loop between coding agents and merge-grade review: transparent OpenCode decisions, credit previews, Builder-to-passport handoff, marketplace feedback you can see, and a repo-first Platform inbox — mostly built by autonomous agents with almost no human steering.

Critique v5.1
Autonomous ship · self-review · operator-first
critique.sh
v5.1.0 · 5 June 2026
We hit go. Five hours later, Critique had reviewed itself into v5.1.
Transparent OpenCode controllers. One Coding Agent across Builder and API with credit preview before you spend. Marketplace attribution on every skill. Repo-first inbox and Platform connections beta. Mostly built by autonomous agents — the same harness that reviews your PRs.
What we mean by “Critique built Critique”
Picture the boring version first, because the boring version is true. A cloud agent branch opened. It read the v5.1 spec from internal ship notes and the existing v5.0 surfaces. It implemented review-controller persistence, Builder credit previews, API idempotency, marketplace attribution UI, repo-home defaults, and connections copy. It ran unit tests. It opened a pull request.
Then Critique reviewed that pull request the same way it reviews yours: sandbox OpenCode, structured findings, passport linkage, merge policy context. A human did not line-edit TypeScript at 2 a.m. They pressed go, said run Remedy on the failures that mattered, and let the critique-review skill argue with the agent until the diff was boring enough to merge. The loop closed. Review infrastructure shipped on infrastructure that had just been reviewed by itself.
Review runs that explain themselves
PR Review v5 made the diff win the first screen. v5.1 answers the next question operators actually ask: why did the agent do that? OpenCode is no longer a black box that either returns JSON or times out. The cheap controller now persists liveness decisions (wait, nudge, abort) and artifact decisions (accept, follow up) with an evidence-quality score and explicit missing-requirement lists.
When merge is blocked, the review-run detail page does not dump twelve equal-weight buttons. It shows what blocked you across verdict, passport, merge policy, and evidence quality, then offers one primary path: Start Remedy, Queue BYOA, or Dismiss with feedback. That sounds like UX polish. It is actually policy: merge decisions should not feel like a settings panel.
- Evidence strong / usable / limited — one badge, proof panel when you need depth
- Controller timeline — why Critique waited or followed up before accept
- Repo quick presets — auth/security and payments/data paths with stronger models
- Finding feedback — each action states it updates Findings Memory on the repo
- Yarn audit JSON/NDJSON parsed for high and critical advisories
- Same sandbox collector lane as npm/pnpm — fewer “skipped repo” dead ends
If you lived through the era where every OpenCode failure looked like a hard 500, this release will feel almost insultingly calm. Recoverable stalls read as stalls. Thin evidence reads as thin evidence. The product stops training you to debug the harness when you meant to judge the pull request.
One Coding Agent product, two entry points
The most common v5.0 confusion was philosophical: “Is Builder different from the API?” Yes and no. Same `builderJob`, different front door. v5.1 makes that impossible to misunderstand.
Composer with model picker, credit preview before submit, draft PR publish, and handoff to passport/review when a PR number exists. You see balance, floor, and projected balance after the minimum charge; submit is blocked before the request if you cannot afford the model.
The same job over HTTP: create with optional `"preview": true`, idempotent retries, cursor lists, durable SSE resume, models discovery, status polls, cancel, safety budgets (`maxTurns`, `maxRuntimeMs`, `maxCredits`), and signed webhooks for idle/completed/failed/cancelled.
Responses now return `builderJobId` plus `links.builder`, `links.status`, `links.stream`, and `links.passport` when a draft PR exists. The public `/coding-agent-api` page says this out loud. Signed-in operators can jump from marketing copy to their latest Builder job without guessing IDs.
From green Builder run to merge-grade review
Coding agents that only ship code are demos. Coding agents that ship code and land in your change-control story are products. When a published Builder job has a PR number, v5.1 surfaces draft PR and Change Passport actions together: open the PR, jump to review for that branch, or walk back to the repository passport queue. The Branch / PR tab links both directions.
That handoff is the v5 thesis in one gesture: generation is cheap; merge evidence is expensive. Critique already had passports and policy from v4. v5.1 makes sure Builder output does not dead-end in a URL you paste into Slack.
Marketplace feedback you can see
v5.0 launched `/skills` with browse, publish, and a performance leaderboard. v5.1 makes the flywheel visible before you publish your third version. Marketplace cards and skill detail pages show attributed review runs, false-positive rate, label counts, and leaderboard readiness. Publishers can trace “this skill was active on run X” to ranked quality instead of install vanity.
Automation catches up: `crt_` keys gain `read:skills`, pinned `SKILL.md` fetch by version, and portable bundles for CI portals. Install panels can mark a marketplace skill as the default PR-review lens per repository; manual dashboard launches attach the slug to review-run metadata for attribution.
Ship log
Outcome bullets for v5.1.0 — what moved for operators, no file-path archaeology.
Browse skills
Official and community critique-review lenses with performance signal on the card.
Coding Agent API
Preview, idempotency, webhooks, and the same builderJob as Workspace Builder.
v5.0 platform essay
Marketplace, merge policy in English, passport exports, Cursor SDK — the v5.0 story.
Persistent sessions
Warm OpenCode between API turns — companion essay from the v5.0 API wave.
Repo home first, Platform when you need the control room
New installs no longer land on a first-run chooser. Repo home is the default. `/dashboard/pull-requests` opens on a cross-repo attention inbox: blocked, failed, warned, running, needs review — without forcing you to pick a repository before you see fire. Rows keep repository context attached; the picker shows webhook and cache health (Healthy, Stale cache, Missed events, Refresh failed, Install warning) where you actually select repos.
The old control room is not gone. It is `Platform` in the nav: passports, policy, delivery, connections, and fleet status when you need the wide lens. Repo-first for daily review work; Platform for the operating picture. Settings still let you switch experiences if your team runs control-room-first.
Connections as a beta platform layer
We stopped describing Connections as “just settings.” v5.1 labels what is live today: scoped `crt_` keys, MCP v0.1.0 with three tools (`list_passports`, `get_review_run`, `queue_review`), REST v1 for passports and review runs, the Coding Agent API, native Linear and Slack, and Zapier-first positioning for Jira, Sentry, Vercel, and everything else in the long tail.
Three MCP tools is thin for the v5 story. We know. Expanding MCP is the obvious P0 — `queue_remedy`, `get_passport`, `list_review_runs`, `compile_merge_policy_draft` — because MCP is how Cursor and Claude distributions actually install habits. v5.1 ships honest beta framing instead of pretending the catalog is finished.
We really do not stop shipping
Diff-first runs, multi-turn OpenCode, quieter default streams.
Marketplace, NL merge policy, passport exports, Coding Agent API warm sessions, repo-first dashboard, Insights.
Transparent controllers, one Coding Agent product, credit previews, marketplace attribution, Platform inbox — mostly agent-built.
v5.0 was the “this is a platform” release. v5.1 is the “this is how we operate the platform while agents ship the next slice” release. The semver is honest: not a rewrite, a tightening of mental models and operator paths that v5.0 introduced faster than docs could follow.
- HumanGo
Pick branch, enable cloud agent, set scope to v5.1 ship notes.
- Agents~5 hours
Implement, test, PR, Critique review, Remedy on failures — repeat until mergeable.
- CritiqueSelf-review
Same critique-review skill and passport evidence as customer runs; findings feed marketplace metrics.
- Shipv5.1.0
Changelog on /version, this essay, package.json 5.1.0 — deploy.
What to do Monday morning
If you are an operator: open a recent review run that blocked merge and read the controller timeline. Flip a repo to an auth/security preset and rerun. If you automate coding: call create with `"preview": true` once before you wire CI spend. If you publish skills: check whether your lens shows attributed runs yet; install defaults per repo if you want leaderboard credit on manual launches.
If you are evaluating whether Critique is “real”: watch us ship v5.1 mostly agent-driven, then read the same findings we show you on your PRs. The ship log at `/version` is the operator contract. This essay is the why.
Read the ship log
Every bullet on /version is written for operators and finance — active voice, no private repo links, no implementation dump.
Open /version