Model update · May 29, 2026 · 22 min read · Critique

DeepSeek & MiMo Are 0.5 Credits Forever — Plus Opus 4.8, Qwen3.7-Max, and 2026's Cheapest Frontier Review Stack

Permanent catalog cuts on DeepSeek V4 and MiMo, plus Opus 4.8, Ring-2.6-1T, Gemini 3.5 Flash, Grok Build, and Step 3.7 Flash — with benchmarks that explain why the price war finally matters for PR review.

0.5 cr

Permanent floor: DeepSeek V4 Flash & MiMo v2.5

1 cr

Permanent floor: DeepSeek V4 Pro & MiMo v2.5 Pro

Up to 99%

Max MiMo API cut: Xiaomi permanent repricing (May 27)

+100

One-time reader credits at the end of this essay

If you have been waiting for “good enough” review to cost less than coffee, this is the release. The mental model we want you to leave with is anchoring in reverse: flagship models still exist for the PR where one missed security bug costs more than a month of credits — but the median PR should not touch them. The median PR should ride open-weight lanes that now score within striking distance of Sonnet-class SWE-bench Verified numbers at 1/20th the credit burn.

Launch window — treasury pricing
Half-price DeepSeek V4 lanes for both Flash and Pro.
Anthropic's Sonnet and Opus tiers still excel on frontier agentic dashboards, and Critique bundles those models everywhere you expect them, but DeepSeek sits in a different place on the value curve. When the watchdog and specialist graph can call Flash on a cadence measured in tens of seconds, predictable credit floors matter as much as raw Elo.
Provider	Model	Floor	Previous floor	Ends
DeepSeek	DeepSeek V4 Flash	0.5 cr	1 cr	No expiry (permanent)
DeepSeek	DeepSeek V4 Pro	1 cr	3 cr	No expiry (permanent)
Xiaomi MiMo	MiMo v2.5	0.5 cr	1.5 cr	No expiry (permanent)
Xiaomi MiMo	MiMo v2.5 Pro	1 cr	3 cr	No expiry (permanent)
Ant Group	Ling-2.6-Flash	0.5 cr	1 cr	No expiry (permanent)

Why MiMo landed at 0.5 credits the same week Xiaomi rewrote API pricing

This is not Critique inventing a promo lane in isolation. On May 27, Xiaomi permanently renovated the entire MiMo-V2.5 pricing system: flat per-token rates with no more input-length multipliers, cuts of up to 99% versus the old 256K–1M tiers, Token Plan quotas reset and expanded 5–8×, and a public write-up on how they kept costs down — SWA with SGLang HiCache shrinking KV-cache churn, better expert-parallel bucketing, and higher cache hit rates on long agent runs. Read their announcement in Primary sources below; our 0.5cr / 1cr MiMo floors are the downstream bet that those economics should be the default for PR review, not a weekend experiment.


New & repriced models in this drop

Provider icons via LobeHub. Credit floors are per review slice — depth and specialists still multiply total burn.

13 models · 88.6% top benchmark score · 0.5 cr lowest floor

Model	Benchmark	Score / note	Credit floor
Claude Opus 4.8	SWE-Bench Verified	88.6%	37 cr
DeepSeek V4 Pro (Max)	SWE-Bench Verified	80.6%	1 cr
Qwen3.7-Max	SWE-Bench Verified	80.4%	6 cr
DeepSeek V4 Flash (Max)	SWE-Bench Verified	79.0%	0.5 cr
MiMo v2.5 Pro	SWE-Bench Verified	78.9%	1 cr
Ring-2.6-1T	SWE-Bench Verified	74.0%	1.5 cr
Ling-2.6-Flash	SWE-Bench Verified	61.2%	0.5 cr
StepFun 3.7 Flash	SWE-Bench Pro	56.3%	1 cr
MiniMax M2.7	SWE-Bench Pro	56.2%	1.5 cr
MiMo v2.5	SWE-Bench Pro	56.1%	0.5 cr
Gemini 3.5 Flash	SWE-Bench Pro	55.1%	10 cr
Grok Build 0.1	No score listed	Agentic coding (xAI)	3 cr
Opus 4.8 (Fast)	No score listed	Same weights, 2× credits	74 cr

SWE-bench scores reflect best observed performance on the toughest real-world coding tasks.

All scores are relative.

DeepSeek V4: flagship open weights at volume pricing

DeepSeek’s V4 family was already the rational default for teams that wanted MoE scale and a million-token context class without renting Claude for every specialist pass. What changed is the price story. V4 Flash at 0.5 credits is not “cheap for a demo.” At roughly 1M input + 150k output per credit unit, a deep 5M-token review pass on Flash can land near 2.5 credits — less than a single old-system unit on a mid-tier model. V4 Pro at 1 credit is the open-weight lead for messy PRs: Artificial Analysis quotes GDPval-AA near 1554 Elo on the Pro Max reasoning profile, leading the open-weights pack on agentic work tasks.
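The 2.5-credit figure above can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming cost scales linearly with input tokens at roughly 1M input tokens per credit unit and that the floor is the price of one unit. The function name `estimate_credits` is hypothetical, and the sketch deliberately ignores output tokens, depth, and specialist multipliers, so treat it as a back-of-envelope check rather than Critique's billing logic.

```python
# Hypothetical estimator, not Critique's billing code.
# Assumption: one credit unit covers ~1M input tokens and cost scales
# linearly with the number of units at the model's credit floor.
TOKENS_PER_UNIT = 1_000_000


def estimate_credits(input_tokens: int, floor_credits: float) -> float:
    """Estimate credits for one review pass from input tokens alone."""
    if input_tokens < 0 or floor_credits < 0:
        raise ValueError("tokens and floor must be non-negative")
    units = input_tokens / TOKENS_PER_UNIT
    return units * floor_credits


if __name__ == "__main__":
    # A 5M-token pass on DeepSeek V4 Flash (0.5 cr floor).
    print(estimate_credits(5_000_000, 0.5))  # 2.5
    # The same pass on DeepSeek V4 Pro (1 cr floor).
    print(estimate_credits(5_000_000, 1.0))  # 5.0
```

Real invoices will differ once depth and specialist fan-out are applied, since those multiply total burn on top of the floor.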

SWE-bench Verified — open-weight value stack

Higher is better. Vendor-reported scores; harnesses differ across labs.

Model	SWE-bench Verified (% resolved)
Claude Opus 4.8	88.6%
DeepSeek V4 Pro	80.6%
Qwen3.7-Max	80.4%
DeepSeek V4 Flash	79.0%
MiMo v2.5 Pro	78.9%
Ring-2.6-1T	74.0%
Ling-2.6-Flash	61.2%

Sources: Anthropic Opus 4.8 launch, DeepSeek Hugging Face cards, Qwen3.7 agent blog, Xiaomi MiMo Pro card, InclusionAI Ring/Ling HF evals.
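One way to read the value stack is to ask which model is cheapest once a score bar is set. The sketch below pairs the vendor-reported SWE-bench Verified scores listed here with the credit floors from this essay's catalog list. The helper name `cheapest_at_or_above` and the tie-break rule (lower floor first, then higher score) are illustrative assumptions, and because harnesses differ across labs, the output is a rough guide rather than a ranking.

```python
from typing import Optional

# (model, SWE-bench Verified %, credit floor) as quoted in this essay.
CATALOG = [
    ("Claude Opus 4.8", 88.6, 37.0),
    ("DeepSeek V4 Pro", 80.6, 1.0),
    ("Qwen3.7-Max", 80.4, 6.0),
    ("DeepSeek V4 Flash", 79.0, 0.5),
    ("MiMo v2.5 Pro", 78.9, 1.0),
    ("Ring-2.6-1T", 74.0, 1.5),
    ("Ling-2.6-Flash", 61.2, 0.5),
]


def cheapest_at_or_above(min_score: float) -> Optional[str]:
    """Return the lowest-floor model whose score meets the bar, if any."""
    eligible = [row for row in CATALOG if row[1] >= min_score]
    if not eligible:
        return None
    # Hypothetical tie-break: lowest floor first, then highest score.
    best = min(eligible, key=lambda row: (row[2], -row[1]))
    return best[0]


if __name__ == "__main__":
    for bar in (75.0, 80.0, 85.0):
        print(bar, cheapest_at_or_above(bar))
```

With these figures, Opus 4.8 only becomes the answer once the bar rises above what the open-weight lanes report.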

Price-to-performance
DeepSeek V4 Pro vs Claude Sonnet 4.6 on Critique credits
Sonnet is still excellent. The buying question is whether its GDPval-AA edge is worth 22× the floor on every lead pass, given that V4 Pro already posts the higher vendor-reported SWE-bench Verified score.

Metric	DeepSeek V4 Pro	Claude Sonnet 4.6
critique.sh credit floor	1 cr	22 cr
SWE-bench Verified (vendor)	80.6%	79.6%
GDPval-AA Elo (AA, Apr 2026)	1554	~1600 class
Best for	Default lead on cost-sensitive repos	Policy-mandated Anthropic lane
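To see what the 22× gap means over a month, compare an all-Sonnet policy with one that escalates only a minority of lead passes. This is a rough sketch using the 1 cr and 22 cr floors from the comparison; the pass counts (400 per month, 40 escalated) are made-up inputs, `blended_burn` is a hypothetical helper, and depth or specialist multipliers are left out.

```python
# Illustrative arithmetic only; the pass counts are made-up inputs.
V4_PRO_FLOOR = 1.0    # credits per lead pass (floor quoted in this essay)
SONNET_FLOOR = 22.0   # credits per lead pass (floor quoted in this essay)


def blended_burn(default_passes: int, escalated_passes: int,
                 default_floor: float = V4_PRO_FLOOR,
                 escalated_floor: float = SONNET_FLOOR) -> float:
    """Floor-level credits when most passes ride the default lane."""
    if default_passes < 0 or escalated_passes < 0:
        raise ValueError("pass counts must be non-negative")
    return default_passes * default_floor + escalated_passes * escalated_floor


if __name__ == "__main__":
    all_sonnet = blended_burn(0, 400)    # every lead pass on Sonnet 4.6
    mostly_pro = blended_burn(360, 40)   # 10% of passes escalated
    print(all_sonnet, mostly_pro)        # 8800.0 1240.0
```

Under these assumed volumes, routing only the risky tenth of PRs to Sonnet cuts floor-level burn from 8,800 to 1,240 credits.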

Xiaomi MiMo: the other half of the 0.5cr revolution

MiMo v2.5 at 0.5 credits is the parallel bet to Gemma and Ling in our volume tier. Xiaomi positions the Flash lane for high-throughput agent loops, and the tech report cites 73.4% on SWE-bench Verified for the Flash profile while keeping active parameters small enough to run wide specialist fan-out. MiMo v2.5 Pro at 1 credit is the escalation lane inside the same family: 78.9% SWE-V in the official card, with Terminal-Bench 2.0 in the high 60s.

The vendor-side story matters too. In their May 2026 price-adjustment post, Xiaomi says they “permanently renovate the entire model pricing system,” drop context-length surcharges, and fund the cut with real inference engineering rather than a time-boxed coupon. Critique passes that through as permanent catalog pricing so the median PR does not need a flagship model to get a serious second opinion. If your team has filed these models under “cheap Chinese models = toy reviewers,” update the bucket: the scores have crossed the line where reviewing only a selected subset of PRs stops making economic sense.

Xiaomi vs frontier tax
MiMo v2.5 Pro vs GPT-5.4 Mini
Both are “serious enough” for many PRs. One costs 1 credit; the other costs 6.

Metric	MiMo v2.5 Pro	GPT-5.4 Mini
critique.sh floor	1 cr	6 cr
SWE-bench Verified	78.9%	73.0% (Vals on 5.4 mini)
Context class	1M (vendor)	128K–1M (route-dependent)
Vendor pricing story	Flat 1M-context API (May 27)	Usage-tier mini model

Frontier additions — Opus 4.8, Qwen3.7-Max, Gemini 3.5 Flash

Claude Opus 4.7 leaves the catalog; Opus 4.8 takes its Ultra slot at 37 credits with a 1M-token context window and Anthropic’s published 88.6% SWE-bench Verified. Opus 4.8 (Fast) uses the same weights at 74 credits — double the floor for teams that buy latency, not capability. Qwen3.6-Max-Preview retires in favor of Qwen3.7-Max at 6 credits: Alibaba’s agent blog cites 80.4% SWE-V and 60.6% SWE-Pro, a cleaner mid-flagship than the old 8cr preview lane. Gemini 3.5 Flash lands at 10 credits as Google’s “near-Pro coding at Flash economics” bet — 55.1% SWE-Pro public, 76.2% Terminal-Bench 2.1 in DeepMind materials.

Agent & coding lanes
New specialist-sized models worth routing
Not every model should be your lead. These are the slots we expect in specialist grids and Remedy picks.

Model (floor)	Detail	Why it exists on Critique
Grok Build 0.1 (3 cr)	xAI coding agent	Fast tool-use model for interactive fix loops; pairs with Remedy when you want xAI flavor.
Ring-2.6-1T (1.5 cr)	InclusionAI 63B active MoE	SWE-V 74% at Ring pricing; a thinking model for tool-heavy agents without Qwen flagship cost.
StepFun 3.7 Flash (1 cr)	196B MoE, 11B active	Replaces 3.5 Flash; native multimodal + 256K context for repos with UI screenshots in PRs.
MiniMax M2.7 (1.5 cr)	Was 2 cr	M2.5 removed; M2.7 is the MiniMax lane now at a lower permanent credit floor.

How to rebuild your policy stack Monday morning

Routing checklist
1. Default lead for volume?
   deepseek/deepseek-v4-pro at 1 cr, or deepseek/deepseek-v4-flash at 0.5 cr if PRs are small.
2. Default specialist grid?
   Mix ling-2.6-flash, mimo-v2.5, deepseek-v4-flash, ring-2.6-1t; all are sub-2cr before depth multipliers.
3. When to escalate to Opus 4.8?
   Auth, billing, migrations, or incident-linked PRs. Use Opus 4.8 Fast only when wall-clock dominates invoice.
4. Remedy default?
   Still Qwen3.7 Plus (free model cost) for lint-level fixes; escalate to MiMo Pro or V4 Pro when validation fails.
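The lead-model part of the checklist can be expressed as a small routing function. This is a sketch under stated assumptions, not Critique's policy engine: the `PullRequest` shape, the label names in `ESCALATION_LABELS`, and the 200-changed-line cutoff for a "small" PR are all hypothetical, since the essay does not define them. Only the model IDs come from this essay.

```python
from dataclasses import dataclass, field

# Hypothetical labels marking a PR as high-risk; adjust to your repo.
ESCALATION_LABELS = {"auth", "billing", "migration", "incident"}
# Hypothetical cutoff for a "small" PR; the essay does not define one.
SMALL_PR_MAX_CHANGED_LINES = 200


@dataclass
class PullRequest:
    changed_lines: int
    labels: set[str] = field(default_factory=set)


def pick_lead_model(pr: PullRequest) -> str:
    """Choose a lead model ID following the routing checklist."""
    # Step 3: risky changes escalate to the flagship lane.
    if pr.labels & ESCALATION_LABELS:
        return "anthropic/claude-opus-4.8"
    # Step 1: small PRs ride the 0.5 cr Flash lane.
    if pr.changed_lines <= SMALL_PR_MAX_CHANGED_LINES:
        return "deepseek/deepseek-v4-flash"
    # Step 1: everything else defaults to the 1 cr Pro lane.
    return "deepseek/deepseek-v4-pro"


if __name__ == "__main__":
    print(pick_lead_model(PullRequest(40)))
    print(pick_lead_model(PullRequest(900)))
    print(pick_lead_model(PullRequest(40, {"billing"})))
```

Checking escalation labels before size keeps a two-line change to billing code on the flagship lane, which matches the essay's point that the flagship is for PRs where one missed bug is expensive.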

What we removed or aliased

Existing repo policies keep working — IDs map forward.

Old ID	New target
minimax/minimax-m2.5	minimax/minimax-m2.7
stepfun/step-3.5-flash	stepfun/step-3.7-flash
qwen/qwen3.6-max-preview	qwen/qwen3.7-max
anthropic/claude-opus-4.7	anthropic/claude-opus-4.8
:nitro suffixes	Stripped for billing; legacy speed suffixes no longer apply
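If you keep model IDs in your own tooling, the forward mapping above can be mirrored with a small lookup. This is a minimal sketch: the alias pairs are copied from the table, while the function name `resolve_model_id` and the choice to strip the `:nitro` suffix before the alias lookup are assumptions about ordering, not documented Critique behavior.

```python
# Alias pairs copied from the table in this essay.
LEGACY_ALIASES = {
    "minimax/minimax-m2.5": "minimax/minimax-m2.7",
    "stepfun/step-3.5-flash": "stepfun/step-3.7-flash",
    "qwen/qwen3.6-max-preview": "qwen/qwen3.7-max",
    "anthropic/claude-opus-4.7": "anthropic/claude-opus-4.8",
}
NITRO_SUFFIX = ":nitro"


def resolve_model_id(model_id: str) -> str:
    """Map a saved model ID to its current catalog target."""
    base = model_id.strip()
    # Assumption: the legacy speed suffix is dropped before alias lookup.
    if base.endswith(NITRO_SUFFIX):
        base = base[: -len(NITRO_SUFFIX)]
    # Unknown IDs pass through unchanged.
    return LEGACY_ALIASES.get(base, base)


if __name__ == "__main__":
    print(resolve_model_id("anthropic/claude-opus-4.7:nitro"))
    print(resolve_model_id("deepseek/deepseek-v4-pro"))
```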

Frequently asked

Yes. Critique set permanent credit floors — not a promo window. DeepSeek V4 Flash and MiMo v2.5 bill at 0.5 credits; DeepSeek V4 Pro and MiMo v2.5 Pro bill at 1 credit. The public /models page and this essay reflect the same numbers as billing.

Permanent floor cuts on DeepSeek V4 and MiMo lanes, plus new models: Claude Opus 4.8 (+ Fast), Qwen3.7-Max, Gemini 3.5 Flash, Grok Build 0.1, Ring-2.6-1T, StepFun 3.7 Flash (replacing 3.5), and MiniMax M2.7 at a lower floor. Legacy IDs alias forward so saved repo policies keep working.

Use deepseek/deepseek-v4-flash or xiaomi/mimo-v2.5 at 0.5 credits for small diffs; deepseek/deepseek-v4-pro or mimo-v2.5-pro at 1 credit for heavier lanes. Reserve Opus 4.8 for auth, billing, migrations, or incident-linked changes. The decision checklist earlier in this essay has a full routing table.

No. Reader credit bonuses are paused. Chat remains free for signed-in users, while review, Remedy, Builder, and agent runs use paid credits or eligible billing.


Primary sources

Xiaomi MiMo-V2.5 permanent API price adjustment

Flat 1M-context rates, up to 99% cuts, Token Plan reset, inference optimizations
