DeepSeek & MiMo Are 0.5 Credits Forever — Plus Opus 4.8, Qwen3.7-Max, and 2026's Cheapest Frontier Review Stack
Permanent catalog cuts on DeepSeek V4 and MiMo, plus Opus 4.8, Ring-2.6-1T, Gemini 3.5 Flash, Grok Build, and Step 3.7 Flash — with benchmarks that explain why the price war finally matters for PR review.
If you have been waiting for “good enough” review to cost less than coffee, this is the release. The mental model we want you to leave with is anchoring in reverse: flagship models still exist for the PR where one missed security bug costs more than a month of credits — but the median PR should not touch them. The median PR should ride open-weight lanes that now score within striking distance of Sonnet-class SWE-bench Verified numbers at 1/20th the credit burn.
Half-price DeepSeek V4 lanes for both Flash and Pro.
Anthropic's Sonnet and Opus tiers still excel on frontier agentic dashboards, and Critique bundles those models everywhere you expect them, but DeepSeek sits in a different place on the value curve. When the watchdog and specialist graph can call Flash on a cadence measured in tens of seconds, predictable credit floors matter as much as raw Elo.
Why MiMo landed at 0.5 credits the same week Xiaomi rewrote API pricing
This is not Critique inventing a promo lane in isolation. On May 27, Xiaomi permanently renovated the entire MiMo-V2.5 pricing system: flat per-token rates with no more input-length multipliers, cuts of up to 99% versus the old 256K–1M tiers, Token Plan quotas reset and expanded 5–8×, and a public write-up on how they kept costs down — SWA with SGLang HiCache shrinking KV-cache churn, better expert-parallel bucketing, and higher cache hit rates on long agent runs. Read their announcement in Primary sources below; our 0.5cr / 1cr MiMo floors are the downstream bet that those economics should be the default for PR review, not a weekend experiment.
New & repriced models in this drop
Provider icons via LobeHub. Credit floors are per review slice — depth and specialists still multiply total burn.
| Model | Metric | Value |
|---|---|---|
| Claude Opus 4.8 | SWE-bench | 88.6% |
| DeepSeek V4 Pro (Max) | SWE-bench | 80.6% |
| Qwen3.7-Max | SWE-bench | 80.4% |
| DeepSeek V4 Flash (Max) | SWE-bench | 79.0% |
| MiMo v2.5 Pro | SWE-bench | 78.9% |
| Ring-2.6-1T | SWE-bench | 74.0% |
| Ling-2.6-Flash | SWE-bench | 61.2% |
| StepFun 3.7 Flash | SWE-bench | 56.3% |
| MiniMax M2.7 | SWE-bench | 56.2% |
| MiMo v2.5 | SWE-bench | 56.1% |
| Gemini 3.5 Flash | SWE-bench | 55.1% |
| Grok Build 0.1 | Benchmark | Agentic coding (xAI) |
| Opus 4.8 (Fast) | Benchmark | Same weights, 2× credits |
SWE-bench scores reflect best observed performance on the toughest real-world coding tasks.
All scores are relative.
DeepSeek V4: flagship open weights at volume pricing
DeepSeek’s V4 family was already the rational default for teams that wanted MoE scale and a million-token context class without renting Claude for every specialist pass. What changed is the price story. V4 Flash at 0.5 credits is not “cheap for a demo.” At roughly 1M input + 150k output per credit unit, a deep 5M-token review pass on Flash can land near 2.5 credits (about five units at 0.5 credits each), less than a single old-system unit on a mid-tier model. V4 Pro at 1 credit is the open-weight lead for messy PRs: Artificial Analysis quotes GDPval-AA near 1554 Elo on the Pro Max reasoning profile, leading the open-weights pack on agentic work tasks.
Higher is better. Vendor-reported scores; harnesses differ across labs.
Sources: Anthropic Opus 4.8 launch, DeepSeek Hugging Face cards, Qwen3.7 agent blog, Xiaomi MiMo Pro card, InclusionAI Ring/Ling HF evals.
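To make the Flash arithmetic above concrete, here is a minimal Python sketch. It assumes that burn scales linearly with input tokens at one credit unit per 1M tokens, and it ignores the 150k output allowance as well as depth and specialist multipliers. The function name `estimate_credits` and the linear model are illustrative assumptions, not Critique's billing formula.

```python
# Assumption: one credit unit covers roughly 1M input tokens (per the text above).
TOKENS_PER_CREDIT_UNIT = 1_000_000

# Credit floors quoted in this post for the two DeepSeek V4 lanes.
FLOORS = {
    "deepseek/deepseek-v4-flash": 0.5,
    "deepseek/deepseek-v4-pro": 1.0,
}


def estimate_credits(model_id: str, input_tokens: int) -> float:
    """Rough estimate: floor x number of 1M-token units (hypothetical linear model)."""
    units = input_tokens / TOKENS_PER_CREDIT_UNIT
    return FLOORS[model_id] * units


if __name__ == "__main__":
    for model in FLOORS:
        print(model, estimate_credits(model, 5_000_000))
```

Under these assumptions the 5M-token pass comes out at 2.5 credits on Flash and 5 credits on Pro, before any multipliers.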
DeepSeek V4 Pro vs Claude Sonnet 4.6 on Critique credits
Sonnet is still excellent. The buying question is whether the last few SWE-V points are worth 22× the floor on every lead pass.
Xiaomi MiMo: the other half of the 0.5cr revolution
MiMo v2.5 at 0.5 credits is the parallel bet to Gemma and Ling in our volume tier: Xiaomi positions the Flash lane for high-throughput agent loops, and the tech report cites 73.4% on SWE-bench Verified for the Flash profile while keeping active parameters small enough to run wide specialist fan-out. MiMo v2.5 Pro at 1 credit is the escalation lane inside the same family: 78.9% SWE-V in the official card, with Terminal-Bench 2.0 in the high 60s.
The vendor-side story matters too. In their May 2026 price-adjustment post, Xiaomi says they “permanently renovate the entire model pricing system,” drop context-length surcharges, and fund the cut with real inference engineering rather than a time-boxed coupon. Critique passes that through as permanent catalog pricing so the median PR does not need a flagship model to get a serious second opinion. If your team has mentally filed these models under “cheap Chinese models = toy reviewers,” update the bucket: the scores have crossed the line where reviewing only a selection of PRs becomes irrational.
MiMo v2.5 Pro vs GPT-5.4 Mini
Both are “serious enough” for many PRs. One costs 1 credit; the other costs 6.
Frontier additions — Opus 4.8, Qwen3.7-Max, Gemini 3.5 Flash
Claude Opus 4.7 leaves the catalog; Opus 4.8 takes its Ultra slot at 37 credits with a 1M-token context window and Anthropic’s published 88.6% SWE-bench Verified. Opus 4.8 (Fast) uses the same weights at 74 credits — double the floor for teams that buy latency, not capability. Qwen3.6-Max-Preview retires in favor of Qwen3.7-Max at 6 credits: Alibaba’s agent blog cites 80.4% SWE-V and 60.6% SWE-Pro, a cleaner mid-flagship than the old 8cr preview lane. Gemini 3.5 Flash lands at 10 credits as Google’s “near-Pro coding at Flash economics” bet — 55.1% SWE-Pro public, 76.2% Terminal-Bench 2.1 in DeepMind materials.
New specialist-sized models worth routing
Not every model should be your lead. These are the slots we expect in specialist grids and Remedy picks.
How to rebuild your policy stack Monday morning
1. Default lead for volume? deepseek/deepseek-v4-pro at 1 cr, or deepseek/deepseek-v4-flash at 0.5 cr if PRs are small.
2. Default specialist grid? Mix ling-2.6-flash, mimo-v2.5, deepseek-v4-flash, ring-2.6-1t, all sub-2cr before depth multipliers.
3. When to escalate to Opus 4.8? Auth, billing, migrations, or incident-linked PRs. Use Opus 4.8 Fast only when wall-clock dominates invoice.
4. Remedy default? Still Qwen3.6 Plus (free model cost) for lint-level fixes; escalate to MiMo Pro or V4 Pro when validation fails.
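The lead-selection rules in the list above could be sketched as a small routing function. This is an illustration only: the `PullRequest` shape, the `pick_lead` name, the label strings, and the 200-line "small PR" threshold are hypothetical, and the Fast lane is modeled as a flag because this post does not give a separate model ID for it.

```python
from dataclasses import dataclass, field
from typing import NamedTuple

# Hypothetical label names for the escalation cases named in the list.
ESCALATION_LABELS = {"auth", "billing", "migrations", "incident"}
# Hypothetical cutoff for "PRs are small"; tune for your repo.
SMALL_PR_MAX_LINES = 200


class LeadChoice(NamedTuple):
    model: str
    fast: bool  # True means "use the Fast lane" (2x credits, same weights)


@dataclass
class PullRequest:
    changed_lines: int
    labels: set = field(default_factory=set)
    latency_critical: bool = False


def pick_lead(pr: PullRequest) -> LeadChoice:
    """Choose a lead reviewer model following the four rules in the list."""
    if pr.labels & ESCALATION_LABELS:
        # Escalate risky PRs; pay for Fast only when wall-clock dominates.
        return LeadChoice("anthropic/claude-opus-4.8", pr.latency_critical)
    if pr.changed_lines <= SMALL_PR_MAX_LINES:
        return LeadChoice("deepseek/deepseek-v4-flash", False)
    return LeadChoice("deepseek/deepseek-v4-pro", False)
```

Keeping the escalation check first means a small PR that touches billing still goes to the flagship lane, which matches the intent of reserving Opus for the PRs where a missed bug is expensive.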
Existing repo policies keep working — IDs map forward.
| Old ID | New target |
|---|---|
| minimax/minimax-m2.5 | minimax/minimax-m2.7 |
| stepfun/step-3.5-flash | stepfun/step-3.7-flash |
| qwen/qwen3.6-max-preview | qwen/qwen3.7-max |
| anthropic/claude-opus-4.7 | anthropic/claude-opus-4.8 |
| :nitro suffixes | Stripped for billing; legacy speed suffixes no longer apply |
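If you keep model IDs in your own config or scripts, the forward mapping in the table can be mirrored locally. The sketch below is a hypothetical helper (`migrate_model_id` is not a Critique API), and it assumes the `:nitro` suffix is stripped before the ID lookup.

```python
# Old ID -> new target, copied from the table above.
ID_MIGRATIONS = {
    "minimax/minimax-m2.5": "minimax/minimax-m2.7",
    "stepfun/step-3.5-flash": "stepfun/step-3.7-flash",
    "qwen/qwen3.6-max-preview": "qwen/qwen3.7-max",
    "anthropic/claude-opus-4.7": "anthropic/claude-opus-4.8",
}

LEGACY_SUFFIX = ":nitro"


def migrate_model_id(model_id: str) -> str:
    """Strip the legacy speed suffix, then map retired IDs to their successors."""
    if model_id.endswith(LEGACY_SUFFIX):
        model_id = model_id[: -len(LEGACY_SUFFIX)]
    # IDs that are not in the table pass through unchanged.
    return ID_MIGRATIONS.get(model_id, model_id)


if __name__ == "__main__":
    print(migrate_model_id("minimax/minimax-m2.5:nitro"))
```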
Read the drop, claim the credits
+100 credits for reading this drop
Signed-in Critique users get a one-time bonus of 100 credits for reading this spring catalog essay. Use them to A/B DeepSeek and MiMo lanes against your current lead. The floors are permanent; this bonus is our way of paying for your experiment time.
Open the model guide
Every floor, plan gate, and Ultra slot lives on the public models page — same numbers as billing.