AI ships more code.
Critique makes the merge feel earned.
A GitHub-native review control plane that reads the diff, related files, and test surface before anyone has to guess where the risk lives, then hands accepted fixes to Remedy with the exact evidence attached. Try Critique Chat in the browser with no install needed.
One review surface, grounded by repo context
Pull requests, check runs, ownership context, and follow-up chat all compound into the same evidence layer.
Merge posture
Review outputs stay grounded in repo evidence, then route to the right specialist only when the diff demands it.
Repo-aware
Reads diff, related files, tests, and ownership map.
Parallel specialists
Security, billing, and performance lanes spawn only when needed.
Shared evidence board
Every agent compounds to the same investigation layer.
The browser chat for codebase questions that usually turn into detective work.
Ask the repo what changed before you read the diff.
Multi-model, repo-aware chat in the browser for the fast questions, the odd regressions, and the "what actually changed?" moments that block momentum.
Model switching
Route harder questions to a slower model without leaving the thread.
Repo context
Connect GitHub once and keep answers grounded in the real codebase.
Immediate access
Open chat in the browser first, then graduate to deeper review flows as needed.
Why now
Code generation accelerated. Review quality did not.
AI can write features, boilerplate, tests, and migrations faster than ever. But merging faster code safely requires a deeper review layer — one that understands architecture, tests, security boundaries, and downstream impact.
Not a single model pass. A coordinated review system.
The app is the control plane — policies, persistence, GitHub output. The sandbox is the execution plane — repo clone, evidence collection, optional sandbox-native final artifact.
Maps files, dependencies, tests, call sites, and impact zones. Sandbox-backed analysis is attempted first; the GitHub API scout is the fallback.
Creates a live evidence and task layer for all agents. The app is the control plane; the sandbox is the execution and evidence plane.
Security, tests, architecture, performance, docs — each with grounded sandbox evidence.
Synthesises findings, removes noise, and delivers the final verdict. The sandbox-native artifact is used when available; backend synthesis is the fallback.
Live beta: managed execution in E2B sandboxes with guarded pushback. BYOA (external-agent handoff) is on the roadmap.
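The staged flow above (scout, shared board, specialists, lead synthesis, with sandbox-first fallbacks) can be sketched as a small orchestration loop. Everything here is illustrative: the function names, the board shape, and the stubbed stages are assumptions for shape only, not Critique's actual API.

```python
# Illustrative sketch of the staged review pipeline; every name is
# hypothetical and mirrors the described flow, not a real interface.

def run_with_fallback(primary, fallback):
    """Try the sandbox-backed path first; fall back when unavailable."""
    try:
        return primary()
    except RuntimeError:
        return fallback()

# --- stubbed stages (assumptions, for shape only) ---------------------
def scout_sandbox(pr):
    raise RuntimeError("sandbox unavailable in this sketch")

def scout_github_api(pr):
    # Fallback scout: map the diff's impact zones from the API view.
    return {"impact_zones": {"security", "tests"}, "files": pr["files"]}

def run_specialist(lane, board):
    return [f"{lane}: inspected {len(board['context']['files'])} files"]

def synthesize_in_sandbox(board):
    raise RuntimeError("no sandbox-native artifact")

def synthesize_in_backend(board):
    return {"verdict": "needs-work", "evidence": board["evidence"]}

def review(pr):
    # 1. Scout: sandbox-backed analysis first, GitHub API scout fallback.
    context = run_with_fallback(lambda: scout_sandbox(pr),
                                lambda: scout_github_api(pr))
    # 2. Shared evidence board: all agents compound into the same layer.
    board = {"tasks": [], "evidence": [], "context": context}
    # 3. Specialists spawn only for lanes the diff actually touches.
    for lane in ("security", "tests", "architecture", "performance", "docs"):
        if lane in context["impact_zones"]:
            board["evidence"] += run_specialist(lane, board)
    # 4. Lead synthesis: sandbox-native artifact if available, else backend.
    return run_with_fallback(lambda: synthesize_in_sandbox(board),
                             lambda: synthesize_in_backend(board))
```

In this sketch only the security and tests lanes run, because only those appear in the scouted impact zones; that is the "spawn only when needed" behavior in miniature.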
Agents shouldn't review in isolation.
Scout turns the PR into a shared investigation space. Tasks, evidence, context, and follow-up questions are posted to a common board so specialists can coordinate, not duplicate work. The app is the control plane — it owns policy, persistence, and GitHub publication; the sandbox execution plane handles repo inspection and can author the final review artifact.
One review system.
Multiple engineering surfaces.
From security and tests to architecture, performance, and autonomous fix — each surface is a clear module you can rely on.
Critique Chat
Try multi-model, repo-aware chat in the product — no install required to start. Connect GitHub when you want live code context and usage limits that match your plan.
Open Chat
Security Review
Catches auth bypasses, permission gaps, secret exposure, unsafe data access, and boundary regressions.
Test Coverage Review
Finds missing tests, weakened assertions, untested pricing or billing paths, and regression risk.
Architecture Review
Flags layering violations, hidden dependency drift, incorrect abstractions, and risky structural changes.
Performance Review
Detects N+1 queries, repeated fetch patterns, wasteful loops, and scalability concerns.
Remedy
Turns findings into code changes, runs verification, and pushes fixes automatically.
Bring Your Own Agent
Send Critique's structured fix blueprint to Codex, Claude Code, Copilot, or any external agent workflow.
A simple animated line
from repo to review.
Scout takes the repo
It reads the diff, nearby files, and tests first, then decides what actually deserves deeper review.
A review UI that actually shows
how the system thinks, searches, and decides.
This example walks through the full flow on the real-time surface: Scout reads the repo, specialists are created with named models only where the diff demands it, tool usage is recorded, and the lead model waits for evidence before writing a review technical enough to be acted on rather than dismissed as generic AI commentary.
Reads the diff first, then builds the context envelope needed for specialist review.
Final synthesis blocks merge because the tenant boundary regression is externally reachable, the Stripe mutation can apply twice on retry, and the context graph builder now serializes 54 reads in the hot path.
Critique finds the issue.
Remedy proves the fix.
Instead of stopping at review comments, the platform hands accepted findings to Remedy with suspect files, the invariants that must hold, and the verification bundle required before any branch gets touched.
Critique flags that workspace authority can shift from the route param to request-body input.
The seat mutation can apply twice if the server action retries before local state catches up.
Remedy patches the code, runs lint and tests, and only then pushes the verified fix package.
Critique isolates the exact lines tied to the finding.
Remedy writes the patch in an isolated execution environment.
Verification runs before any pushback hits the branch.
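The three-step handoff above reduces to one gate: a patch only reaches the branch after its verification bundle passes. A minimal sketch, assuming hypothetical names throughout (`verify_then_push` and its parameters are illustrative, not Critique's actual API):

```python
# Hypothetical sketch of Remedy's verification gate: nothing is pushed
# until lint and tests pass in the isolated execution environment.

def verify_then_push(patch, run_lint, run_tests, push):
    """Run the verification bundle; push only when every check is green."""
    checks = {"lint": run_lint(patch), "tests": run_tests(patch)}
    if all(checks.values()):
        push(patch)
        return {"pushed": True, "checks": checks}
    # Any failed check leaves the branch untouched.
    return {"pushed": False, "checks": checks}

# Example: a failing regression test blocks the push entirely.
pushed = []
result = verify_then_push(
    patch="fix-seat-mutation.diff",
    run_lint=lambda p: True,
    run_tests=lambda p: False,
    push=pushed.append,
)
```

Because the push callback never fires on a red check, the branch stays exactly as the reviewer left it.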
Not all AI review is the same.
The difference is not whether a model can comment. It is whether the system understands the repo, coordinates evidence, and gives engineers something they can trust enough to merge or act on.
Typical AI PR reviewer
- — Reads the diff only
- — Single model output
- — Limited architectural context
- — Can comment, but not execute
- — Weak cost control
- — Little policy flexibility
Critique
- ✓ Repository-aware scouting
- ✓ Parallel specialist agents
- ✓ Shared evidence coordination
- ✓ Final lead reasoning layer
- ✓ Autonomous fix or BYOA execution
- ✓ Flexible model routing and credit control
Strict where it matters. Flexible where it doesn't.
- · Require deeper review on auth or billing code
- · Escalate security agents on protected directories
- · Tune strictness by repo or branch
- · Choose lead model and specialist stack (Standard & Pro: same catalog)
- · Route routine PRs to lighter models; save frontier models for when it matters
- · Ultra: GPT-5.2 Pro, GPT-5.4 Pro, Claude Opus 4.6, and any lead as a sub-agent
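A policy like the one described, strict on auth and billing paths, light everywhere else, can be pictured as path-matched rules. The shape below is a hypothetical sketch, not a documented Critique config format; the keys and pattern syntax are assumptions.

```python
# Hypothetical policy shape mirroring the knobs above; not a real
# Critique config format.
from fnmatch import fnmatch

POLICY = {
    "paths": {
        "src/auth/*":    {"depth": "deep", "escalate": ["security"]},
        "src/billing/*": {"depth": "deep", "escalate": ["security", "tests"]},
        "docs/*":        {"depth": "light"},
    },
    "branches": {"main": {"strictness": "high"}},
    "routing": {"routine": "light-model", "frontier": "on-escalation"},
}

def review_depth(path, policy=POLICY):
    """Return the first matching path rule; default to a routine pass."""
    for pattern, rule in policy["paths"].items():
        if fnmatch(path, pattern):
            return rule
    return {"depth": "light"}
```

Under this sketch, a change to `src/billing/stripe.py` gets deep review with security and test escalation, while `README.md` falls through to the routine light pass.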
Credits follow the work — not a single flat rate.
Specialist sub-agents handle narrow inspection tasks; the lead model synthesises the verdict. Standard and Pro ship the same selectable catalog — Ultra adds GPT-5.2 Pro, GPT-5.4 Pro, and Claude Opus 4.6 — so you scale with credits instead of surprise routing.
Built for teams shipping in the age of generated code.
AI-heavy product teams
Review the flood of generated code with more than a single diff pass.
Engineering leads
Catch structural regressions, missing tests, and security drift before merge.
Startups moving fast
Add deep review without building internal agent infrastructure.
Teams with existing coding agents
Keep Codex, Claude Code, or Copilot for execution and let Critique own review quality.
Standard and Pro share one catalog; Ultra adds GPT-5.2 Pro, GPT-5.4 Pro, and Claude Opus 4.6 — or let Critique route for you.
System credibility
Built to be inspected, not merely trusted.
Repository-native, sandboxed by default, policy-aware, and explicit about the evidence behind every output.
Runs attach to pull requests, check runs, and repository context instead of forcing another review surface.
Each execution path stays contained, so investigation and remedy work happen away from your local machine.
Routing, tool access, and operational constraints stay explicit instead of being buried in a black-box prompt.
Findings, artifacts, and review traces stay inspectable, linkable, and easy to verify after the run.
Simple, transparent
pricing.
Standard
- — 500 credits / month
- — Full lead & specialist catalog (same as Pro)
- — GitHub check runs
- — Dashboard access
Pro
- — 2,000 credits / month
- — Same model catalog as Standard
- — Fix proposal agent
- — 7-day free trial
Ultra
- — 10,000 credits / month
- — GPT-5.2 Pro, GPT-5.4 Pro, Claude Opus 4.6 (Ultra-only)
- — Same lead ↔ sub flexibility as Standard / Pro, plus frontier models
- — Org-wide tooling
- — Priority support
Move from AI output to merge-ready confidence.
Start reviewing before code ships.
Critique Chat is live today. GitHub App review runs, sandbox-backed analysis, and GitHub publication are already in the product. Managed Remedy is live in beta for guarded fix execution and PR pushback.
Start in the browser with repo-aware chat.
Connect GitHub when you want review runs.
Escalate to remedy flows when a fix should be proposed.