16 min read · Critique

Is critique.sh Safe for Private Repositories?

A code-level privacy and security review of GitHub App access, E2B sandbox teardown, persisted artifacts, chat retention, and provider logging boundaries.

4 E2B execution paths reviewed: deterministic analysis, sandbox-native review, Remedy, and Builder.
0 GitHub installation tokens found persisted in the database; they are resolved just in time for sandbox work.
AES-256-GCM: encryption used for repository secrets and saved user OpenRouter keys at rest.
finally: every E2B path reviewed tears down the sandbox through a finally block that calls sandbox.kill().

Read the governing docs

Start with the legal pages, not a slogan.

If you are evaluating Critique for a real team, read the product policy pages alongside the implementation. The product already has a dedicated Privacy Policy and Terms surface. Use those as the contract layer, then use the architecture below as the engineering layer.

The real question is operational: what exactly gets access, where does the code travel, what persists after the run, and who can see the result? Private-repo trust is never a one-word answer. It is a chain of control decisions.

This codebase gives a fairly inspectable chain. GitHub sends signed webhook deliveries. Critique verifies the signature before accepting them. Review work runs asynchronously through QStash. Repository access in automation uses GitHub App installation tokens. Sandbox-backed review and Remedy work run in E2B and are explicitly killed at the end of the run. Secrets for sandbox injection are encrypted at rest. Those are real, code-backed safety properties.

GitHub sees Critique as a GitHub App that reads repository and pull-request context, then writes back review state. In code, that means installation-scoped Octokit clients from lib/github/app.ts, review check runs opened and completed in lib/github/service.ts, and PR reviews or fallback issue comments published from the same service module.

Concretely, GitHub sees check runs, review summaries, inline comments when they map cleanly to changed right-side diff lines, and any optional Remedy pushes that are explicitly allowed by policy. Critique is not hiding in the repo as a credential helper or local IDE plugin. It sits next to GitHub as an app-level actor and writes through GitHub APIs.

That is an important trust property. The thing writing to your pull request is the GitHub App identity, not an opaque browser automation trick and not a long-lived personal access token hardcoded in the repo.
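To make "inline comments when they map cleanly to changed right-side diff lines" concrete, here is a minimal sketch of one way to decide whether a finding can be posted inline. It is not taken from the repo: the function name addedRightSideLines is hypothetical, and it assumes a single-file unified diff patch of the kind GitHub returns per changed file, starting at the first @@ hunk header.

```typescript
// Hypothetical helper, not the repo's implementation.
// Returns the new-file (right-side) line numbers that a single-file
// unified diff patch adds.
export function addedRightSideLines(patch: string): Set<number> {
  const added = new Set<number>();
  let right = 0; // current line number in the new version of the file
  let inHunk = false;

  for (const line of patch.split("\n")) {
    // Hunk header, e.g. "@@ -10,2 +11,3 @@"; the "+11" is the right-side start.
    const header = /^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@/.exec(line);
    if (header) {
      right = Number(header[1]);
      inHunk = true;
      continue;
    }
    if (!inHunk) continue;

    if (line.startsWith("+")) {
      added.add(right); // added line: exists only on the right side
      right += 1;
    } else if (line.startsWith(" ")) {
      right += 1; // context line: exists on both sides
    }
    // "-" lines and "\ No newline at end of file" do not advance the right side.
  }
  return added;
}
```

Under this assumption, a finding whose line number is in the returned set could be posted as an inline comment, and anything else would fall back to the review body or an issue comment.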

Users see the parts of the system that are meant to be inspectable: the PR check named Critique / Review, the review body, inline review comments where line mapping succeeds, and the in-app review-run detail pages. The introduction page in this repo describes that split directly: GitHub gets the check and review surface; the app keeps users, entitlements, review runs, chat sessions, and policies.

Inside the product, users can also see usage dashboards, live review progress, Builder activity, and legal pages. That is useful for trust, but it also means the product is intentionally not a zero-record system. There is a control plane, and control planes store state.

This is where precision matters most. The webhook route under app/api/github/webhooks/route.ts does not just verify the signature and move on. It persists a DeliveryEvent row with the webhook payload, delivery id, event type, action, installation id, and repository id. The introduction page even says these deliveries are stored so support can trace what happened. That is deliberate retention, not accidental logging.

The Prisma schema shows the broader persisted model. DeliveryEvent stores webhook payload JSON. ReviewRun stores PR-level run state. Finding stores normalized findings. ReviewArtifact stores the evidence pack, normalized findings, annotations, and lead summary. UsageEvent and OpenRouterCall store usage, token, and cost telemetry. BuilderJob and BuilderJobEvent store job prompts, summaries, and structured activity. ChatSession, ChatMessage, and ChatRun store authenticated dashboard chat state.

There is an additional storage layer many buyers miss: repository indexing. The indexing path persists repo snapshots, file metadata, graph nodes, and raw CodeChunk.content rows keyed by commit SHA. That means Critique is capable of durable repository-content storage for retrieval and repo intelligence. If your standard is “code may be processed but never stored,” this repo does not meet that standard.

So if someone asks, “does Critique keep absolutely nothing?”, the code-level answer is no. It keeps real product records. The more accurate answer is that it keeps operational review state, artifacts, and telemetry that are part of the product itself, while still trying to keep execution environments isolated and short-lived.

Webhook ingress is one of the cleaner parts of the system. lib/github/webhooks.ts computes an HMAC with the configured webhook secret and compares it using timingSafeEqual. app/api/github/webhooks/route.ts rejects missing headers, rejects invalid signatures, parses JSON, dedupes repeated deliveries, stores the delivery, then pushes background work onto QStash. That means GitHub is not allowed to drive the expensive review flow directly inside the inbound request path.

That separation matters. It reduces the amount of privileged review work happening during a webhook request, and it makes replay, retries, and processing state auditable rather than hidden in transient server logs.
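The ingress sequence described above (reject missing headers, verify the HMAC, parse, dedupe, store, enqueue) can be sketched as follows. This is an illustration under stated assumptions, not the code in lib/github/webhooks.ts or the route handler: verifySignature, handleWebhook, the WebhookDeps interface, and the HTTP status codes are all hypothetical. The header names are the ones GitHub sends with webhook deliveries.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper. GitHub sends the signature as "sha256=<hex>"
// in the X-Hub-Signature-256 header.
export function verifySignature(rawBody: string, signature: string, secret: string): boolean {
  const expected =
    "sha256=" + createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
  const a = Buffer.from(signature, "utf8");
  const b = Buffer.from(expected, "utf8");
  // timingSafeEqual throws on unequal lengths, so compare lengths first.
  return a.length === b.length && timingSafeEqual(a, b);
}

// Hypothetical stand-ins for the database and the queue publisher.
export interface WebhookDeps {
  secret: string;
  alreadySeen(deliveryId: string): Promise<boolean>;
  storeDelivery(row: { deliveryId: string; event: string; payload: unknown }): Promise<void>;
  enqueue(deliveryId: string): Promise<void>;
}

// Header keys are expected in lowercase. Status codes are assumptions.
export async function handleWebhook(
  headers: Map<string, string>,
  rawBody: string,
  deps: WebhookDeps,
): Promise<{ status: number; note: string }> {
  const deliveryId = headers.get("x-github-delivery");
  const event = headers.get("x-github-event");
  const signature = headers.get("x-hub-signature-256");
  if (!deliveryId || !event || !signature) {
    return { status: 400, note: "missing headers" };
  }
  if (!verifySignature(rawBody, signature, deps.secret)) {
    return { status: 401, note: "invalid signature" };
  }

  let payload: unknown;
  try {
    payload = JSON.parse(rawBody);
  } catch {
    return { status: 400, note: "invalid JSON" };
  }

  // Repeated deliveries are acknowledged but not processed twice.
  if (await deps.alreadySeen(deliveryId)) {
    return { status: 200, note: "duplicate" };
  }

  // Persist first, then hand off; no review work runs in the request path.
  await deps.storeDelivery({ deliveryId, event, payload });
  await deps.enqueue(deliveryId);
  return { status: 202, note: "queued" };
}
```

The signature is checked against the raw request body before any JSON parsing, so unauthenticated input never reaches the parser or the database in this sketch.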

Repository access in automated flows is anchored on the GitHub App installation token model. lib/sandbox/e2b.ts resolves an installation token by authenticating the Octokit app client as the installation. Sandbox-backed review, Builder, and Remedy then use that token to clone the repository or talk back to GitHub where needed.

That is materially better than a design that depends on a developer dropping a personal token into a settings panel for all background work. The trust story here is app-scoped automation, not human-scoped impersonation by default.

There are multiple E2B-backed paths in this repo, and each creates a sandbox per execution run. Review collector runs can create a sandbox for deterministic evidence gathering. Sandbox-native OpenCode review creates one sandbox for the review run, reuses it for the follow-up turns in that same review session, then kills it. Remedy creates a fresh sandbox for each remedy execution attempt. Builder creates a fresh sandbox per builder job.

Inside those sandboxes, Critique writes local control files such as prompts, schema files, bootstrap scripts, OpenCode config, and output files. It clones the target repository into /workspace, checks out the PR head SHA, optionally injects repo-specific secrets as environment variables, starts a private OpenCode server where needed, and reads the resulting output artifact back into the main application.

That “reads the artifact back” detail is exactly why “no trace exists anywhere” would be inaccurate. The runtime is ephemeral, but the result is intentionally imported into the product as a review artifact, usage ledger rows, board entries, or Builder events.

For the review-agent path, the sandbox is effectively gone before the review is posted. lib/review/agent/e2b-opencode.ts reads the output, usage messages, and diagnostics it needs, then kills the sandbox in a finally block. The review pipeline publishes the completed check run and GitHub review only after that function returns. So the sandbox is not left running while the product is posting the final review back to GitHub.

What I would not say is “the sandbox is killed before the output is ever read,” because that is not what the code does. The application must read the output artifact before it can synthesize and publish the review. The precise claim the code supports is stronger in a different way: the sandbox is killed before the final review publication step, not kept alive after the output has been harvested.
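The ordering described here (read the output, kill the sandbox, then publish) follows from the try/finally shape. The sketch below shows that shape with a hypothetical SandboxLike interface and a hypothetical runInSandbox wrapper; it does not use the E2B SDK and is not the code in lib/review/agent/e2b-opencode.ts.

```typescript
// Hypothetical minimal interface; a real sandbox object has more methods.
export interface SandboxLike {
  readFile(path: string): Promise<string>;
  kill(): Promise<void>;
}

// Hypothetical wrapper: create a sandbox, run the work, always tear down.
export async function runInSandbox<T>(
  create: () => Promise<SandboxLike>,
  work: (sandbox: SandboxLike) => Promise<T>,
): Promise<T> {
  const sandbox = await create();
  try {
    // Everything the caller needs must be harvested inside this block.
    return await work(sandbox);
  } finally {
    // Runs whether work() resolved or threw, and completes
    // before the caller sees the result or the error.
    await sandbox.kill();
  }
}
```

Because the finally block is awaited before runInSandbox settles, any publish step the caller performs afterwards necessarily happens once the sandbox has been killed, including when the work throws.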

Repository secrets and user OpenRouter keys are not stored in plaintext in the database. lib/secrets/crypto.ts uses AES-256-GCM with a 32-byte base64-encoded key from CRITIQUE_SECRETS_ENCRYPTION_KEY. lib/secrets/repository-secrets.ts and lib/secrets/user-openrouter-key.ts encrypt values before persistence and only decrypt them server-side when the sandbox runtime needs them.

That is the right direction for at-rest handling. It does not mean secrets never enter process memory; they obviously do when a sandbox is being prepared. But it does mean this repo is not storing those values as casual plaintext columns.
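For readers who want to see what AES-256-GCM at-rest handling with a 32-byte base64 key looks like, here is a minimal sketch using Node's built-in crypto module. The function names and the stored layout (IV, then auth tag, then ciphertext, base64-encoded together) are assumptions for illustration; the article does not show the format lib/secrets/crypto.ts actually uses.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

const IV_BYTES = 12; // 96-bit nonce, the size recommended for GCM
const TAG_BYTES = 16; // default GCM authentication tag length

function loadKey(keyB64: string): Buffer {
  const key = Buffer.from(keyB64, "base64");
  if (key.length !== 32) throw new Error("key must decode to 32 bytes");
  return key;
}

// Hypothetical helper: returns base64(iv || tag || ciphertext).
export function encryptSecret(plaintext: string, keyB64: string): string {
  const iv = randomBytes(IV_BYTES); // fresh nonce for every encryption
  const cipher = createCipheriv("aes-256-gcm", loadKey(keyB64), iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return Buffer.concat([iv, tag, ciphertext]).toString("base64");
}

// Hypothetical helper: throws if the key is wrong or the data was modified.
export function decryptSecret(envelope: string, keyB64: string): string {
  const raw = Buffer.from(envelope, "base64");
  if (raw.length < IV_BYTES + TAG_BYTES) throw new Error("malformed envelope");
  const iv = raw.subarray(0, IV_BYTES);
  const tag = raw.subarray(IV_BYTES, IV_BYTES + TAG_BYTES);
  const ciphertext = raw.subarray(IV_BYTES + TAG_BYTES);
  const decipher = createDecipheriv("aes-256-gcm", loadKey(keyB64), iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

GCM is an authenticated mode, so a modified database value fails at decipher.final() rather than decrypting to garbage. In a deployment like the one described, the key argument would come from CRITIQUE_SECRETS_ENCRYPTION_KEY.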

The repo has two different stories on chat retention, and mixing them up causes confusion. The legal-page assistant is documented in the Privacy Policy as a separate stateless flow that Critique does not store in account chat history. That is a narrow claim about that widget only.

Authenticated dashboard chat is different. The Prisma schema has ChatSession, ChatMessage, ChatRun, and ChatRunEvent, and the Privacy Policy text explicitly says in-product AI chat may be stored according to workspace chat settings. So a blanket sentence like “chats are never saved” would be false for the product as a whole. It is true only for specific surfaces such as the legal-page assistant.

This codebase cannot prove a universal “the AI provider takes zero logs” claim. What it can prove is that Critique itself stores usage and cost telemetry, and that some flows intentionally write usage summaries or OpenRouter call ledger rows. It can also prove that the legal copy treats provider-side logging as a separate question and points readers to provider privacy notices.

That distinction matters. Provider-side zero-retention or logging limits are contractual and vendor-configuration questions, not something a public repo can establish by rhetoric alone. If you need that guarantee for a regulated environment, the right move is to tie product claims to the provider’s privacy terms, plan settings, and data-processing commitments — not to a marketing sentence saying “trust us.”

There is also an implementation nuance worth saying plainly: some direct OpenRouter calls set data_collection: deny, but I did not find an equivalent explicit deny flag in the sandbox OpenCode provider config. That does not prove provider logging is happening, but it does mean the repo is not enforcing a single uniform provider-side no-logging posture across every execution surface.
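One way to close that gap is to route every OpenRouter request through a single builder that always sets the opt-out. The sketch below is an assumption about how that could look, not code from the repo: buildChatRequest and the type names are hypothetical, while provider.data_collection with the values "allow" and "deny" is the provider-preference field OpenRouter documents for chat completion requests.

```typescript
type Role = "system" | "user" | "assistant";

interface Message {
  role: Role;
  content: string;
}

interface OpenRouterChatRequest {
  model: string;
  messages: Message[];
  // Provider routing preferences; "deny" restricts routing to providers
  // that do not collect request data.
  provider: { data_collection: "allow" | "deny" };
}

// Hypothetical shared builder: callers cannot forget the opt-out
// because they never construct the provider field themselves.
export function buildChatRequest(model: string, messages: Message[]): OpenRouterChatRequest {
  return {
    model,
    messages,
    provider: { data_collection: "deny" },
  };
}
```

This only governs requests the application builds itself. An agent running inside the sandbox with its own provider config would need the equivalent setting there, and neither changes what a provider's terms actually commit to.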

As an engineering read, yes, this looks like a product that is trying to be safe in the places that matter most for a GitHub App: signature verification, least-privilege token model, explicit async job boundaries, isolated runtime execution, encrypted secret storage, and teardown of the sandbox after the run. That is a serious foundation, not hand-waving.

But safe does not mean invisible, and ephemeral does not mean zero-retention. Critique is a real control plane, and real control planes keep records. The strongest trust posture is not to pretend otherwise. It is to say exactly what is stored, exactly what is ephemeral, exactly what GitHub can see, and exactly where provider-side commitments must come from policy rather than code.

That is the posture this repo can defend: bounded access, explicit runtime cleanup, and inspectable persistence boundaries.

Pilot the safe way

Install Critique on one busy repository first. Keep the check optional, inspect the review artifacts, and promote it only after your team trusts the signal.

The code shows strong safety controls for private-repo review: signed GitHub webhooks, GitHub App installation tokens, encrypted at-rest secrets, isolated E2B sandboxes, and explicit sandbox teardown. It is better described as scoped and bounded than as zero-retention.
Critique does store code in some product paths. Review artifacts can include patches and code context, and repository indexing stores CodeChunk.content rows keyed by snapshot and commit SHA. Critique is not only processing code transiently.
The legal-page assistant is documented as a stateless page-specific flow. Authenticated dashboard chat is different: the schema includes ChatSession, ChatMessage, ChatRun, and ChatRunEvent, so product chat can be stored according to settings.
The E2B-backed review, analysis, Remedy, and Builder paths use finally blocks that call sandbox.kill(). The review-agent path reads the needed output and usage data, kills the sandbox, then the main pipeline publishes the final GitHub review.
I did not find GitHub installation tokens persisted in the database. The sandbox helper resolves installation tokens just in time for GitHub App automation and sandbox cloning.
Provider-side logging is not uniformly disabled in code. Some direct OpenRouter calls set data_collection: deny, but the sandbox OpenCode provider config does not show the same uniform opt-out. Provider-side zero logging is a policy and configuration claim, not something this repo alone can prove.