Specialists

The review pipeline uses one evidence stage, four specialist lanes that run in parallel, and one synthesis stage that produces the GitHub-facing verdict.

Evidence → Security | Tests | Architecture | Performance → Synthesis

Scout-style evidence gathering and lead synthesis are structural stages and cannot be toggled off in policy. Only the individual specialists can be disabled, via Policy fields.
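The stage order above can be sketched as a small asyncio program. This is only an illustration of the fan-out and fan-in shape, not Critique's implementation: `gather_evidence`, `run_specialist`, `synthesize`, and the `disabled` parameter are hypothetical names, and the stage bodies are placeholders.

```python
import asyncio

# Lane names taken from the pipeline diagram; everything else is hypothetical.
SPECIALISTS = ("security", "tests", "architecture", "performance")


async def gather_evidence(pr):
    # Placeholder evidence stage: a real one would collect patches, guidance, etc.
    return {"pr": pr, "changed_files": []}


async def run_specialist(name, evidence):
    # Placeholder lane: returns a single informational finding.
    return [{"specialist": name, "severity": "INFO", "title": f"{name} reviewed"}]


def synthesize(findings):
    # Placeholder synthesis stage: a real one would deduplicate and derive a verdict.
    return {"verdict": "PASS", "findings": findings}


async def review(pr, disabled=frozenset()):
    evidence = await gather_evidence(pr)  # structural stage, always runs
    active = [s for s in SPECIALISTS if s not in disabled]
    # Specialist lanes run concurrently and share the same evidence object.
    lanes = await asyncio.gather(*(run_specialist(s, evidence) for s in active))
    findings = [finding for lane in lanes for finding in lane]
    return synthesize(findings)  # structural stage, always runs


result = asyncio.run(review({"number": 1}, disabled={"performance"}))
```

In this sketch, disabling a specialist only removes its lane from the fan-out. The evidence and synthesis steps still run, mirroring the rule that structural stages cannot be switched off.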

Evidence gathering

Before specialists run, Critique assembles an evidence pack from the pull request:

Changed files and patches
Nearby source and test files when needed
Repository guidance documents the team maintains
Risk hints (for example auth, billing, API, automation paths)
Linked issues referenced in the PR title or body

This pack is shared context so each specialist does not re-fetch the world independently.
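One way to picture the evidence pack is as a single record that is built once and handed to every lane. The sketch below is an assumption about shape only: the `EvidencePack` class and its field names are hypothetical and simply mirror the items listed above.

```python
from dataclasses import dataclass, field


@dataclass
class EvidencePack:
    """Hypothetical container mirroring the evidence items listed above."""

    changed_files: dict[str, str]  # path -> patch text
    nearby_files: list[str] = field(default_factory=list)   # related source and test files
    guidance_docs: list[str] = field(default_factory=list)  # team-maintained guidance
    risk_hints: set[str] = field(default_factory=set)       # e.g. {"auth", "billing"}
    linked_issues: list[str] = field(default_factory=list)  # issues referenced in the PR


# Build the pack once per pull request.
pack = EvidencePack(
    changed_files={"src/auth/session.py": "@@ -1,3 +1,4 @@"},
    risk_hints={"auth"},
)

# Every lane receives the same object, so nothing is fetched twice.
lane_inputs = {name: pack for name in ("security", "tests", "architecture", "performance")}
```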

Security

Looks for regressions in trust boundaries: authentication and authorization changes, injection surfaces, unsafe dynamic execution, secret handling, and high-risk areas changed without corresponding test movement.

Typical outputs are findings with severity WARNING or FAIL when the change increases exploit or data-exposure risk.

Tests

Checks whether behavior changes are backed by tests: missing coverage on touched logic, weak or misleading tests, and sensitive paths edited without test updates.

Often surfaces FAIL when critical paths change with no test signal and policy strictness treats that as blocking.

Architecture

Focuses on structure and boundaries: client/server mistakes, deep coupling, brittle imports, config contract drift, and module layout smells that will hurt the next change.

Findings are usually WARNING unless a change clearly breaks runtime boundaries (for example server-only modules pulled into client bundles).

Performance

Looks for obvious performance regressions: un-awaited async work, expensive client fetch patterns, serial awaits in hot paths, and similar issues visible from the diff.

Many performance findings are INFO or WARNING unless policy elevates them.

Synthesis

The lead stage:

Deduplicates overlapping findings from multiple specialists
Normalizes severity and locations for GitHub annotations
Derives the verdict (PASS, WARN, FAIL) from finding severities and your strictness setting

The verdict logic is deterministic given the findings and policy: the model polishes summary text but does not change blocking rules.
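A deterministic verdict can be sketched as a pure function of the findings and a strictness value. The mapping below is an assumption made for illustration, not Critique's documented rule: the strictness names `"balanced"` and `"strict"`, the dictionary keys, and the choice to let `"strict"` turn a WARNING into a blocking FAIL are all hypothetical.

```python
SEVERITY_RANK = {"INFO": 0, "WARNING": 1, "FAIL": 2}


def deduplicate(findings):
    """Keep one finding per (file, line, title), preferring the highest severity."""
    best = {}
    for finding in findings:
        key = (finding["file"], finding["line"], finding["title"])
        current = best.get(key)
        if current is None or (
            SEVERITY_RANK[finding["severity"]] > SEVERITY_RANK[current["severity"]]
        ):
            best[key] = finding
    return list(best.values())


def derive_verdict(findings, strictness="balanced"):
    """Map the worst severity to PASS, WARN, or FAIL (assumed rule, see lead-in)."""
    worst = max((SEVERITY_RANK[f["severity"]] for f in findings), default=0)
    if worst == SEVERITY_RANK["FAIL"]:
        return "FAIL"
    if worst == SEVERITY_RANK["WARNING"]:
        # Assumption: the hypothetical "strict" setting makes warnings blocking.
        return "FAIL" if strictness == "strict" else "WARN"
    return "PASS"


raw = [
    {"file": "api/pay.py", "line": 10, "title": "Unvalidated input", "severity": "WARNING"},
    {"file": "api/pay.py", "line": 10, "title": "Unvalidated input", "severity": "INFO"},
    {"file": "api/pay.py", "line": 42, "title": "Serial awaits", "severity": "INFO"},
]
merged = deduplicate(raw)
verdict = derive_verdict(merged)
```

Because neither function consults a model or any hidden state, the same findings and strictness always yield the same verdict.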

Finding shape

Each finding includes at minimum:

Title and summary — human-readable explanation
Severity — INFO, WARNING, or FAIL
Confidence — how sure Critique is
Location — file and line when mappable to the diff
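The minimum fields above could be modelled roughly as follows. The class names, the 0.0 to 1.0 confidence range, and the optional line number are assumptions for this sketch. Only the field list and the three severity values come from this page.

```python
from dataclasses import dataclass
from typing import Optional

SEVERITIES = ("INFO", "WARNING", "FAIL")


@dataclass(frozen=True)
class Location:
    file: str
    line: Optional[int] = None  # None when the line cannot be mapped to the diff


@dataclass(frozen=True)
class Finding:
    title: str
    summary: str
    severity: str                        # one of SEVERITIES
    confidence: float                    # assumed scale: 0.0 (unsure) to 1.0 (certain)
    location: Optional[Location] = None  # absent when not mappable to the diff

    def __post_init__(self):
        # Reject values outside the documented severities and the assumed range.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")


finding = Finding(
    title="Server-only module imported in client bundle",
    summary="A client component imports a module that reads server secrets.",
    severity="FAIL",
    confidence=0.9,
    location=Location(file="app/page.tsx", line=3),
)
```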

Path escalations

Even if you disable a specialist globally, path escalations can force specialists back on when changed paths match sensitive patterns (auth, billing, etc.). See Policy fields.
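The effect of a path escalation can be sketched with glob matching from the Python standard library. The `ESCALATIONS` table, its patterns, and the `active_specialists` function are hypothetical. The actual pattern syntax and field names are described in Policy fields.

```python
from fnmatch import fnmatchcase

# Hypothetical escalation table: specialist -> glob patterns that force it on.
ESCALATIONS = {
    "security": ["*/auth/*", "*/billing/*"],
    "tests": ["*/billing/*"],
}


def active_specialists(enabled, changed_paths, escalations=ESCALATIONS):
    """Return the enabled specialists plus any forced on by a matching changed path."""
    active = set(enabled)
    for specialist, patterns in escalations.items():
        if any(fnmatchcase(path, pattern) for path in changed_paths for pattern in patterns):
            active.add(specialist)
    return active


# Security is disabled globally, but an auth path in the diff brings it back.
lanes = active_specialists({"architecture"}, ["src/auth/session.ts"])
```

Here `lanes` contains both `architecture` and `security`. A diff that touches only unrelated paths leaves the enabled set unchanged.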

Model selection

Default models per role are chosen for quality and cost balance on the hosted product. Installations and repositories may override lead and specialist models within plan limits.

Critique does not publish internal prompt text or heuristic pattern lists on this site, because those are implementation details that change frequently.
