Product · June 24, 2026 · 10 min read · Repath Khan

Critique Coding Agent API: How Teams Are Actually Using Cloud Agents Over HTTP

The Coding Agent API is no longer just “start a run and poll.” It is turning into a control plane for CI repair, support-to-fix loops, internal fix bots, and agent-supervisor workflows.

A cloud coding agent API stops being interesting the moment it can only demo “open sandbox, write patch, maybe open PR.” That is table stakes now.

What matters in practice is the control plane around the run. Can a CI job retry safely? Can a platform team tag work by owner and incident? Can a supervisor tell the difference between “the agent is still running,” “the sandbox is warm, send the next instruction,” and “this session died, start a chained fallback”? Can the same organization attach judgment later at the merge boundary instead of treating code generation as the whole system?

That is what this update is about. We did not build another chat surface and call it infrastructure. We tightened the contract around runs so machines can operate the product without scraping Builder semantics out of loose event text.

Documented Coding Agent API endpoints in Platform OpenAPI

Run id reused across warm follow-up turns

Max normalized tags per run

Common workflow shapes this update now serves cleanly

Why the contract had to get sharper

The cloud-agent market has converged on a few obvious ideas. Cursor publishes a run-based Cloud Agents API. Devin exposes organization-scoped sessions, service-user auth, tags, resumable sessions, secrets, knowledge, playbooks, and session insights. OpenAI Codex can start cloud tasks from pull-request context and push fixes back when it has permission. Factory talks openly about non-interactive execution, cloud templates, and agent readiness. The category is no longer proving that remote coding works. It is proving that the surrounding contract is operable.

What a serious cloud coding API needs in 2026

The baseline moved. “POST a prompt” is not the product anymore; the product is what the surrounding system can rely on.

| Capability | Why it matters | This Critique update |
| --- | --- | --- |
| Retry-safe create | Webhook handlers and CI rerun jobs cannot open duplicate sandboxes by accident. | Idempotency keys already existed; they remain first-class in the create path. |
| Lifecycle fields | Clients need stable state, not implied state reconstructed from event text. | `lifecycle`, `terminal`, `awaitingFollowUp`, `canFollowUp`, and `nextActions` now ship on run and status responses. |
| Attribution | Platform teams need to group work by owner, queue, incident, or source system. | `title`, `tags`, and bounded `metadata` now travel with the run. |
| Warm follow-ups | Multi-step automation should not pay a cold start for every sentence. | Live `idle` sessions still take follow-ups on the same run id, with cleaner expired-session fallback. |
| Machine contract | Generated clients and internal SDKs need OpenAPI, not only prose docs. | Platform OpenAPI now covers models, runs, status, messages, stream, cancel, webhook, safety, and SSE schemas. |
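As a rough illustration of how a supervisor might consume these fields, the sketch below maps a status payload to one of three actions. The field names `terminal` and `canFollowUp` come from the table above; treating them as booleans, and the three action labels, are assumptions made for this example rather than documented behavior.

```python
from typing import Any, Mapping


def next_step(status: Mapping[str, Any]) -> str:
    """Map a run status payload to one supervisor action.

    Field names come from the article; their boolean types and the
    action labels returned here are assumptions for illustration.
    """
    if status.get("canFollowUp"):
        # The sandbox is warm: send the next instruction on the same run id.
        return "send_follow_up"
    if status.get("terminal"):
        # The session is over: remaining work needs a new, chained run.
        return "start_chained_run"
    # Neither ready for input nor finished: the agent is still working.
    return "keep_waiting"


# Hypothetical payloads, shaped after the fields listed in the table.
print(next_step({"lifecycle": "idle", "terminal": False, "canFollowUp": True}))
print(next_step({"terminal": True, "canFollowUp": False}))
print(next_step({"terminal": False, "canFollowUp": False}))
```

The point of the design is that the client branches on explicit fields and never inspects event text to guess the state.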

How teams are using cloud agents over API

There are many possible demos. In practice, the useful patterns collapse into a small set of repeatable loops. The Coding Agent API is now shaped around those loops rather than around one-off prompt submission.

The four workflow shapes
1. CI repair loop
A failing workflow or flaky check triggers a run with an idempotency key, run tags like `ci` or `owner:platform`, a strict safety budget, and optional draft PR publish. The client watches `lifecycle` and `nextActions` instead of reverse-engineering the event ledger.
2. Support-to-fix loop
An investigated support or intake packet becomes a coding-agent run with ticket metadata and repository context. The follow-up turn adds the missing regression test or docs note without recloning the repo when the session is still warm.
3. Internal fix bot
A platform team’s own orchestrator owns routing, queuing, and permissions, while Critique owns sandbox execution, patch generation, and optional PR publish. Tags and metadata make these runs queryable by queue, tenant, or incident.
4. Agent-supervisor workflow
One system writes code; another later judges mergeability. Critique’s writer API stays explicit about repo work, while Review runs, Change Passports, and the Merge Gate API remain the judge layer on the PR that the agent created.
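One way to make the CI repair loop retry-safe is to derive the idempotency key from facts that stay the same across reruns. The sketch below assumes a GitHub Actions style run id and a check name as inputs; the function name, the `ci-repair-` prefix, and the key format are hypothetical, not something the API prescribes.

```python
import hashlib


def idempotency_key(repository: str, workflow_run_id: int, failing_check: str) -> str:
    """Derive a stable key so a rerun of the same CI failure maps to one run.

    The attempt number is deliberately left out: a rerun of the same
    workflow run should produce the same key, not a new sandbox.
    """
    raw = f"{repository}:{workflow_run_id}:{failing_check}"
    digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]
    return f"ci-repair-{digest}"


# Same failure, same key; a different failing check gets its own key.
first = idempotency_key("acme/web", 1234567, "unit-tests")
rerun = idempotency_key("acme/web", 1234567, "unit-tests")
other = idempotency_key("acme/web", 1234567, "lint")
print(first == rerun, first == other)
```

The resulting string would be sent as the `Idempotency-Key` header on the create request.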

The API shape after this update

Create a run with attribution and budgets

The new fields are intentionally boring. That is the point. They give orchestration systems stable handles instead of forcing every team to invent sidecar storage.

```shell
# Create a run. The Idempotency-Key header makes retries of this request safe.
curl https://critique.sh/api/v1/coding-agent/runs \
  -H "Authorization: Bearer crt_..." \
  -H "Idempotency-Key: intake-bug-742" \
  -H "Content-Type: application/json" \
  -d '{
    "repository": "acme/web",
    "title": "Fix Stripe webhook verification",
    "tags": ["ci", "payments", "owner:platform"],
    "metadata": {
      "ticket": "PAY-742",
      "source": "github-actions"
    },
    "prompt": "Add Stripe webhook signature verification and regression tests.",
    "modelId": "anthropic/claude-sonnet-4.6",
    "billing": { "mode": "managed" },
    "publish": { "mode": "draft_pr" },
    "validationMode": "tests",
    "safety": {
      "network": { "mode": "restricted", "allowlist": ["api.stripe.com"] },
      "resources": { "maxTurns": 3, "maxCredits": 12 }
    }
  }'
```
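For teams calling the API from code rather than a shell, the same request might be assembled as below. This is a minimal sketch using only the Python standard library; it builds the request object without sending it, trims the body to a few of the fields shown above, and the helper name `build_create_request` is hypothetical.

```python
import json
import urllib.request

RUNS_URL = "https://critique.sh/api/v1/coding-agent/runs"


def build_create_request(api_key: str, idempotency_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a create-run request mirroring the curl call."""
    return urllib.request.Request(
        RUNS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Idempotency-Key": idempotency_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


body = {
    "repository": "acme/web",
    "title": "Fix Stripe webhook verification",
    "tags": ["ci", "payments", "owner:platform"],
    "metadata": {"ticket": "PAY-742", "source": "github-actions"},
    "prompt": "Add Stripe webhook signature verification and regression tests.",
}

# "crt_..." is the elided key placeholder from the curl example.
request = build_create_request("crt_...", "intake-bug-742", body)
print(request.get_method(), request.full_url)
# Passing `request` to urllib.request.urlopen would send it; omitted here.
```

Keeping construction separate from sending makes the payload easy to unit test and lets a retry reuse the same idempotency key.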

Three pieces matter here. First, attribution lives on the run itself: title, tags, metadata. Second, the client gets stable lifecycle hints from the API rather than guessing whether a warm session is still usable. Third, OpenAPI now describes the actual surface, which means internal SDK generation stops lagging behind the shipped routes.
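To illustrate why run-level attribution is useful, here is a small sketch that groups runs by a tag prefix such as `owner:`. It assumes each run is available as a dictionary with `id` and `tags` keys; the `id` key and the sample run ids are assumptions for the example, not documented response fields.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def group_by_tag_prefix(runs: Iterable[Mapping], prefix: str) -> Dict[str, List[str]]:
    """Group run ids by the value that follows `prefix` in their tags."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for run in runs:
        for tag in run.get("tags", []):
            if tag.startswith(prefix):
                # "owner:platform" with prefix "owner:" lands in group "platform".
                groups[tag[len(prefix):]].append(run["id"])
    return dict(groups)


# Hypothetical runs; only the tag values mirror the article's example.
runs = [
    {"id": "run_1", "tags": ["ci", "payments", "owner:platform"]},
    {"id": "run_2", "tags": ["support", "owner:payments"]},
    {"id": "run_3", "tags": ["ci", "owner:platform"]},
    {"id": "run_4", "tags": ["ci"]},
]

by_owner = group_by_tag_prefix(runs, "owner:")
print(by_owner)
```

Because the tags travel with the run, this grouping needs no sidecar storage that maps run ids back to owners.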

What we copied from the market, and what we did not

The right move in a fast market is not blind originality. Cursor, Devin, Codex, and Factory each make part of the shape obvious. Run-based APIs are good. Session attribution is good. Tags are good. Resumability is good. Explicit cloud-task context is good. We borrowed the parts that make an API more operable.

We did not try to copy everything. Devin’s broader session model includes service-user impersonation, secrets, knowledge, playbooks, and detailed session insights. That is a real product surface, but it also drags security, RBAC, tenancy, and enterprise admin consequences with it. We are not pretending those concerns disappear because a field looks easy to add. The current Critique update stays inside a narrower contract we can defend: repo-scoped execution with clearer lifecycle and provenance.

What this update adds now vs what stays for a later layer

| Area | Now | Later, if it earns its complexity |
| --- | --- | --- |
| Run control | Lifecycle fields, SSE status payloads, clean fallback behavior | Queue policies, richer scheduling, retries beyond current delivery model |
| Attribution | Title, tags, metadata, deterministic intent classification | Cross-org impersonation, service-user identity layers, tenant policy inheritance |
| Context | Repository, prompt, safety policy, publish mode, follow-up turns | Knowledge packs, secret catalogs, playbooks, wider preconfigured capability bundles |
| Governance | Coding Agent API for writing; Review runs and Merge Gate API for judging | Tighter policy coupling across the full writer-judge loop |
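Since status payloads now arrive over SSE, a client has to split the stream into events before it can read lifecycle fields. The sketch below parses the standard `text/event-stream` framing; the `status` event name and the sample payload are assumptions, and the actual event names are whatever the Platform OpenAPI schemas define.

```python
import json
from typing import Any, List, Tuple


def _field_value(line: str, name: str) -> str:
    """Return the value after 'name:', dropping one optional leading space."""
    value = line[len(name) + 1:]
    return value[1:] if value.startswith(" ") else value


def parse_sse(text: str) -> List[Tuple[str, Any]]:
    """Parse text/event-stream content into (event name, JSON payload) pairs."""
    events: List[Tuple[str, Any]] = []
    event_name, data_lines = "message", []
    for line in text.splitlines():
        if line == "":
            # A blank line dispatches the event collected so far.
            if data_lines:
                events.append((event_name, json.loads("\n".join(data_lines))))
            event_name, data_lines = "message", []
        elif line.startswith(":"):
            continue  # Comment line, often used as a keepalive.
        elif line.startswith("event:"):
            event_name = _field_value(line, "event")
        elif line.startswith("data:"):
            data_lines.append(_field_value(line, "data"))
    return events


# Hypothetical stream content; the event name "status" is an assumption.
sample = (
    "event: status\n"
    'data: {"lifecycle": "idle", "terminal": false}\n'
    "\n"
    ": keepalive\n"
    "\n"
)
events = parse_sse(sample)
print(events)
```

An event that is not terminated by a blank line is dropped, which matches how the SSE format treats an incomplete trailing event.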

The bigger point: writer APIs are not judge APIs

A lot of the cloud-agent market still talks as if the same system should both write the code and certify that the code should ship. Sometimes that is fine for low-risk work. It is not a strong default for production engineering teams.

Critique’s position remains the same. The Coding Agent API is the writer surface. It clones a repo, executes in a sandbox, and can open a PR. Review runs, Change Passports, and the Merge Gate API are the judge surface. The reason this update matters is that better lifecycle and provenance on the writer side make the handoff to the judge side much cleaner.

Critique overlaps with the products named above only partly. The overlap is real at the HTTP contract layer, but Critique is still narrower than a full session platform or IDE replacement. The stronger position is cloud repo execution plus a clear path into merge-grade review.

The update did not need a new data model. The run contract became richer without introducing a separate CodingAgentRun table. Attribution is persisted through Builder job events and surfaced through the API response.

Deterministic intent classification exists because operators and dashboards need a cheap first pass at “what kind of task was this?” before they open the full transcript. It is intentionally modest, not a claim of deep semantic understanding.

The browser UI and the HTTP API share one runtime because the runtime should not fork just because the entry point changes. Browser UI and HTTP automation can share OpenCode + E2B while exposing different control surfaces on top.

Use the API like infrastructure

Generate a crt_ key, wire one concrete workflow first, and treat lifecycle, tags, metadata, and nextActions as the beginning of your agent control plane — not as decorative response fields.


Primary sources

Critique Coding Agent API docs: current REST and SSE surface
Critique Coding Agent API page: examples, key flow, and positioning
Persistent sessions essay: warm-session model on the same run id
Cursor Cloud Agents API: run-based cloud-agent contract
Devin API overview: service users, org API, session attribution
Devin create session: tags, resumable sessions, knowledge, secrets, playbooks
Devin session insights: analysis and session reporting
OpenAI Codex GitHub integration: cloud task from pull-request context
Factory docs: exec, cloud templates, agent readiness, automation surfaces
