15 min read · Critique

Cloud Coding Agents: What Changed, What Stayed the Same, and Where Remedy Fits

The model is not the whole story anymore. The real stack is model loop, sandbox, git state, validation, and a control plane that decides what “correct” means.

Cloud coding agents · Remedy · BYOA

The agent no longer lives on your laptop.
The real question is who owns the control plane.

GitHub, Cursor, Jules, Devin, OpenCode, E2B, and Vercel Sandbox are all converging on the same primitives: model loop, isolated execution, git state, and asynchronous delivery. Remedy fits that stack differently: not as a general “AI employee,” but as a review-attached execution layer with BYOA support.

Execution shape
  • Review verdict
  • Remedy blueprint
  • Sandbox execution
  • Branch / PR handoff

Models we expose in Critique
  • Z.ai: GLM-5
  • Kimi: Kimi-K2.5
  • Minimax: MiniMax-M2.7
  • Google: Gemini-3.1-Pro

The review lead, specialist, and Remedy execution roles can each be routed to a different model. BYOA (bring your own agent) means teams can keep their preferred agent runtime while Critique owns the review signal and fix blueprint.

A cloud coding agent is not magic. Strip away the product gloss and most of the category is the same machine: the model plans, tools read and write state, execution happens in somebody else’s sandbox, git and validation provide feedback, and the end result is delivered as a branch, a PR, or a recorded attempt. The interesting competition is not over whether this pattern exists. It is over who owns the control plane and what trust boundary they choose.

That distinction matters because the same runtime ingredients can produce very different products. GitHub uses Actions-adjacent environments and platform-native permissions. Cursor uses editor-driven cloud agents and branch workflows. Autonomous “software engineer” products frame the whole experience as open-ended delegation. Infrastructure vendors like E2B and Vercel Sandbox sell the execution substrate itself, while OpenCode sells a headless agent engine that other products can drive. Remedy fits none of those exactly. It is closer to a review-triggered execution layer than a general-purpose AI employee.

PART ONE

What A Cloud Coding Agent Actually Is

Generic cloud agent loop
  • Plan task
  • Read repo + issue context
  • Run tools in isolated compute
  • Validate with tests / build
  • Open branch or PR

Remedy loop
  • Review findings exist
  • Compile Remedy blueprint
  • Run OpenCode / E2B execution
  • Validate bounded fix attempt
  • Record result + hand back

Model loop
Planning, editing, retrying, and selecting tools.
Sandbox
Container, VM, microVM, or managed cloud dev box isolated from your laptop.
Git truth
Branches, tests, builds, and commits decide whether the run was real.
Async control
You delegate now and reconcile later through a run record, branch, or PR.
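
The four pieces above (model loop, sandbox, git truth, async control) can be sketched as one small function. This is a minimal illustration, not any vendor's implementation: `RunRecord`, `run_agent`, and the `plan`, `execute`, and `validate` callables are hypothetical names invented for this example, and the stubs in the usage section stand in for a real model, sandbox, and test suite.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RunRecord:
    """Hypothetical run record: what the caller reconciles against later."""
    task: str
    attempts: int = 0
    validated: bool = False
    log: List[str] = field(default_factory=list)


def run_agent(
    task: str,
    plan: Callable[[str, List[str]], str],
    execute: Callable[[str], str],
    validate: Callable[[], bool],
    max_attempts: int = 3,
) -> RunRecord:
    """Plan, act, validate, and stop once validated or out of attempts."""
    record = RunRecord(task=task)
    while record.attempts < max_attempts and not record.validated:
        record.attempts += 1
        step = plan(task, record.log)      # model loop: choose the next action
        record.log.append(execute(step))   # sandbox: run it in isolated compute
        record.validated = validate()      # git truth: tests and build decide
    return record                          # async control: hand back a record


if __name__ == "__main__":
    # Stub validation that fails once, then passes.
    outcomes = iter([False, True])
    record = run_agent(
        "fix failing test",
        plan=lambda task, log: f"attempt {len(log) + 1}: {task}",
        execute=lambda step: f"ran {step}",
        validate=lambda: next(outcomes),
    )
    print(record.attempts, record.validated)  # prints: 2 True
```

The design choice worth noticing is that the loop returns a record rather than a result. Whoever delegated the task reads the record later, which is the asynchronous half of the pattern.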

PART TWO

A Practical Map Of The Market

Control planes
  • GitHub-native agents: policy, identity, and workflow depth.
  • IDE-born agents like Cursor: editor as delegation surface.
  • Session-first engineer products: long-horizon task ownership.
  • Headless agent servers like OpenCode: automation engine rather than glossy UI.
Infrastructure primitives
  • Sandbox providers like E2B for on-demand isolated Linux execution.
  • MicroVM infrastructure like Vercel Sandbox for untrusted code.
  • Git credentials, package registries, and network policy.
  • Validation surface: tests, build, lint, deploy checks.

Those layers get bundled differently. GitHub’s value is native workflow gravity. Cursor’s value is authoring velocity with cloud continuation. Devin-style products sell the dream of a session that keeps going. E2B and Vercel Sandbox sell the compute substrate. OpenCode sells a headless agent engine you can drive from another product. Critique does not need to replace all of them. It composes the pieces that matter for review and bounded execution.

PART THREE

The Design Tensions Everyone Hits

Broad delegation

Great when the task starts from human intent and you want the agent to discover everything else. Harder to govern, audit, and bound across retries.

Core tension

Trust, cost, scope, and auditability all fight each other. The more open-ended the task, the more you need policy, checkpoints, and explainable validation.

Bounded execution

Stronger when intent is already structured. You trade some flexibility for determinism, replayability, and a cleaner paper trail about why the agent touched the code at all.

This is why “agent mode everywhere” is not automatically “safe auto-merge everywhere.” Every serious team eventually asks the same questions: what files can it touch, how many times can it retry, what secrets does the sandbox see, how do we price model plus compute plus re-review, and can we explain the change in a way a reviewer will trust? Those questions are not vendor-specific. They are the category.
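
Those questions can be written down as data rather than left as tribal knowledge. The sketch below assumes a hypothetical `ExecutionPolicy` object and `violations` helper; neither is a Critique or vendor API, and the field names are illustrative only. It covers four of the questions: file scope, retry cap, secret exposure, and total cost.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class ExecutionPolicy:
    """Hypothetical policy object; field names are illustrative only."""
    allowed_paths: Tuple[str, ...]    # glob patterns the agent may edit
    max_retries: int                  # cap on fix attempts per run
    allowed_secrets: Tuple[str, ...]  # secret names the sandbox may see
    budget_usd: float                 # ceiling for model + compute + re-review


def violations(
    policy: ExecutionPolicy,
    touched_files: Sequence[str],
    retries_used: int,
    secrets_requested: Sequence[str],
    cost_usd: float,
) -> List[str]:
    """Return human-readable reasons a run falls outside the policy."""
    problems: List[str] = []
    for path in touched_files:
        # fnmatchcase's "*" also matches "/", so "src/*" covers nested files.
        if not any(fnmatchcase(path, pattern) for pattern in policy.allowed_paths):
            problems.append(f"file outside scope: {path}")
    if retries_used > policy.max_retries:
        problems.append(f"retries {retries_used} exceed cap {policy.max_retries}")
    for name in secrets_requested:
        if name not in policy.allowed_secrets:
            problems.append(f"secret not allowed: {name}")
    if cost_usd > policy.budget_usd:
        problems.append(f"cost {cost_usd:.2f} exceeds budget {policy.budget_usd:.2f}")
    return problems
```

An empty list means the run stayed inside its bounds. A non-empty list doubles as the explanation a reviewer would want to see before trusting the change.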

PART FOUR

Where Remedy Fits

Remedy is not trying to win the same primary interface as a general cloud coding agent. In Critique, review is the upstream control plane. Findings, severity, file anchors, and confidence become structured inputs. Remedy compiles those inputs into a blueprint, persists a RemedyRun and RemedyAttempt, and executes through OpenCode today, with E2B-backed isolation available when configured. That puts Remedy on the same infrastructure plane as the broader market while keeping a different product contract.

ReviewRun
The upstream object that anchors what needs fixing and why.
Blueprint
Structured handoff instead of a loose conversational prompt.
OpenCode
Current headless execution engine for Remedy runs.
E2B
Optional sandbox substrate when isolated cloud execution is enabled.
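
To make "structured handoff instead of a loose conversational prompt" concrete, here is one way the compile step could look. This is a sketch under stated assumptions, not Remedy's actual schema: `Finding`, `Blueprint`, `compile_blueprint`, the severity labels, and the default thresholds are all hypothetical. Only the input fields (severity, file anchor, confidence) come from the description above.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical severity scale; the real labels may differ.
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2, "critical": 3}


@dataclass(frozen=True)
class Finding:
    """Illustrative review finding with a file anchor and a confidence score."""
    id: str
    severity: str
    file: str
    line: int
    confidence: float
    message: str


@dataclass(frozen=True)
class Blueprint:
    """Illustrative bounded work order that traces back to a review run."""
    review_run_id: str
    goals: Tuple[str, ...]
    allowed_files: Tuple[str, ...]
    finding_ids: Tuple[str, ...]


def compile_blueprint(
    review_run_id: str,
    findings: List[Finding],
    min_severity: str = "medium",
    min_confidence: float = 0.7,
) -> Blueprint:
    """Keep findings above both thresholds and scope edits to their files."""
    floor = SEVERITY_ORDER[min_severity]
    selected = [
        f for f in findings
        if SEVERITY_ORDER[f.severity] >= floor and f.confidence >= min_confidence
    ]
    # Most severe first; the sort is stable, so ties keep review order.
    selected.sort(key=lambda f: SEVERITY_ORDER[f.severity], reverse=True)
    return Blueprint(
        review_run_id=review_run_id,
        goals=tuple(f"{f.file}:{f.line} {f.message}" for f in selected),
        allowed_files=tuple(sorted({f.file for f in selected})),
        finding_ids=tuple(f.id for f in selected),
    )
```

Because the blueprint carries the review run ID and the finding IDs, every later edit can be traced back to the reason it was requested, which is the property a free-form prompt does not give you.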

PART FIVE

General Cloud Agents Versus Remedy

Trigger
  • General cloud agent: starts from human language, such as fix this issue, investigate CI, implement this spec, or respond to a work item.
  • Remedy: starts from a review artifact, meaning structured findings, bounded goals, and a policy-aware execution request.

Execution surface
  • General cloud agent: usually owns the whole session, context chase, and branch workflow inside the vendor’s cloud environment.
  • Remedy: uses OpenCode plus optional E2B isolation as a controlled worker, preserving traceability back to the original review run.

That distinction changes who should buy what. If your team wants open-ended delegation from the IDE or a GitHub issue, cloud agents are the natural category. If your team already believes quality is a pipeline and wants findings to become bounded fix attempts with explicit validation, Remedy is the sharper tool.

PART SIX

The Model Layer Matters Less Than The Routing Layer

The market still loves to argue about which foundation model “wins,” but production systems rarely work that way. The durable advantage comes from separating roles. Stronger synthesis models can act as review lead. Faster or cheaper models can power specialists. Execution models can be different again. Critique already exposes that kind of separation across review and Remedy, which is why model diversity is a feature, not a branding exercise.

Examples of model roles teams can route in Critique

These are not benchmark scores. They show the kind of portfolio thinking that matters more than one universal “best model.”

“Fit” here is editorial framing for role suitability in the stack, not vendor-verified performance data.
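
The role separation described in this section reduces to a routing table with per-team overrides. The sketch below is hypothetical: `DEFAULT_ROUTES` and `route` are invented names, and the role-to-model pairing is an arbitrary placeholder using the model names mentioned in this essay, not a recommendation or a Critique default.

```python
from typing import Dict, Optional

# Placeholder assignment for illustration only; swap in whatever your team prefers.
DEFAULT_ROUTES: Dict[str, str] = {
    "review_lead": "Gemini-3.1-Pro",
    "specialist": "Kimi-K2.5",
    "remedy_execution": "GLM-5",
}


def route(role: str, overrides: Optional[Dict[str, str]] = None) -> str:
    """Return the model for a role, letting per-team overrides win."""
    routes = {**DEFAULT_ROUTES, **(overrides or {})}
    if role not in routes:
        raise KeyError(f"no model routed for role {role!r}")
    return routes[role]


if __name__ == "__main__":
    print(route("specialist"))                                   # Kimi-K2.5
    print(route("specialist", {"specialist": "MiniMax-M2.7"}))   # MiniMax-M2.7
```

Keeping the table separate from the code that calls models means a team can change which model plays which role without touching review or execution logic.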

CLOSING

Pick The Abstraction That Matches Your Org

Cloud coding agents are converging on the same primitives: isolated compute, tool loops, git, validation, and asynchronous execution. The real differentiation is control plane and trust boundary. GitHub, Cursor, Jules, Devin, OpenCode, E2B, and Vercel Sandbox all tell different stories with the same underlying ingredients.

Remedy belongs in that conversation, but not as a generic “AI staff engineer in the cloud.” It is a review-attached execution layer with exportable blueprints, bounded fix loops, BYOA support, and infrastructure choices that align with how serious teams already govern code change. Judge it alongside PR review automation and policy-bound fix pipelines, not only against open-ended delegation products.

Critique turns review into a control plane.

Connect GitHub, tune review policy, route models like GLM-5, Kimi-K2.5, MiniMax-M2.7, and Gemini-3.1-Pro, then use Remedy or BYOA execution to move from findings to bounded fixes.
