Short answer (July 2026): Claude Sonnet 5 is Anthropic's new mid-tier model, positioned as a cheaper way to run agents — near-Opus quality on coding and agentic tasks at Sonnet prices. It leads GPT-5.5 on SWE-bench Pro (63.2% vs 58.6%), costs half as much on output ($15 vs $30 per million tokens, with a $2/$10 intro rate through August 31, 2026), and — crucially — you can actually use it today, while GPT-5.6 is still locked in a limited preview.
- Best for: cost-efficient coding agents and high-volume agentic work
- Headline win: SWE-bench Pro 63.2% (ahead of GPT-5.5)
- Price: $3/$15 per M tokens ($2/$10 intro through Aug 31, 2026)
- Availability edge: generally available now, unlike GPT-5.6
At a glance
| Attribute | Claude Sonnet 5 |
|---|---|
| Context window | 1M tokens |
| Max output | 128K tokens |
| Pricing | $3 input / $15 output per M ($2/$10 intro through 2026-08-31) |
| SWE-bench Pro | 63.2% |
| Thinking | Adaptive, on by default |
| Effort levels | low / medium / high / xhigh / max |
| Vision | High-resolution (up to 2576px long edge) |
Why "cheaper way to run agents" is the whole story
Agents are token-hungry: they loop, call tools, read results, and think between steps. At scale, the model's output price can dominate your bill. Sonnet 5's $15-per-million output rate is exactly half GPT-5.5's $30, and the introductory $2/$10 rate (through August 31, 2026) makes it cheaper still.
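To make the pricing arithmetic concrete, here is a minimal sketch in Python. The prices come from this article; the workload (10,000 runs at 20K input and 8K output tokens each) is a made-up assumption, and the helper names are illustrative only. Because only GPT-5.5's output rate is quoted here, the cross-model comparison is limited to output cost.

```python
# Prices in USD per million tokens, as quoted in this article.
SONNET5_STANDARD = {"input": 3.00, "output": 15.00}
SONNET5_INTRO = {"input": 2.00, "output": 10.00}  # intro rate through 2026-08-31
GPT55_OUTPUT_PRICE = 30.00  # only the output rate is quoted in the article


def run_cost(input_tokens: int, output_tokens: int, prices: dict) -> float:
    """Cost in USD of one run at the given per-million-token prices."""
    total = input_tokens * prices["input"] + output_tokens * prices["output"]
    return total / 1_000_000


def output_cost(output_tokens: int, price_per_million: float) -> float:
    """Cost in USD of the output tokens alone."""
    return output_tokens * price_per_million / 1_000_000


# Hypothetical workload (an assumption, not a measurement).
RUNS = 10_000
IN_TOK, OUT_TOK = 20_000, 8_000

standard_total = RUNS * run_cost(IN_TOK, OUT_TOK, SONNET5_STANDARD)
intro_total = RUNS * run_cost(IN_TOK, OUT_TOK, SONNET5_INTRO)
sonnet_output_total = RUNS * output_cost(OUT_TOK, SONNET5_STANDARD["output"])
gpt55_output_total = RUNS * output_cost(OUT_TOK, GPT55_OUTPUT_PRICE)

print(f"Sonnet 5, standard rate: ${standard_total:,.2f}")
print(f"Sonnet 5, intro rate:    ${intro_total:,.2f}")
print(f"Output only, Sonnet 5:   ${sonnet_output_total:,.2f}")
print(f"Output only, GPT-5.5:    ${gpt55_output_total:,.2f}")
```

Swap in your own token counts per run; the ratio between input and output tokens decides how much of the bill the output price actually drives.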
Pair that with availability. GPT-5.6 launched in late June 2026 but only as a restricted preview — most developers can't build on it yet. Sonnet 5 is generally available right now. Being the capable model people can actually deploy, at a low price, is a real competitive advantage that benchmarks alone don't capture.
The coding benchmarks
Leadership is benchmark-dependent — this isn't a clean sweep:
| Benchmark | Claude Sonnet 5 | GPT-5.5 |
|---|---|---|
| SWE-bench Pro (agentic coding) | 63.2% | 58.6% |
| Terminal-Bench 2.1 | 80.4% | 83.4% |
Sonnet 5 wins on SWE-bench Pro; GPT-5.5 edges it on Terminal-Bench 2.1. For reference, Anthropic's flagship Opus 4.8 scores around 69.2% on agentic coding — so Sonnet 5 closes much of the gap to the top tier at a fraction of the cost. Developers are treating this as a cost-efficiency vs specialized-performance trade-off, not a knockout.
What's new under the hood
- Adaptive thinking on by default. Omit the `thinking` parameter and Sonnet 5 runs adaptive thinking (Sonnet 4.6 ran thinking-off by default). Answers improve, but budget `max_tokens` for the added thinking spend.
- New tokenizer (~30% more tokens). The same text tokenizes to roughly 30% more tokens than on Sonnet 4.6. Per-token pricing is unchanged, but re-baseline cost and `max_tokens`: a limit tuned for 4.6 can now truncate.
- `xhigh` effort. The first Sonnet-tier model with the `xhigh` effort level, the recommended setting for the hardest coding and agentic tasks.
- High-resolution vision. Up to 2576px on the long edge, useful for screenshots, diagrams, and document understanding.
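
The re-baselining advice above can be sketched as a small helper. This is a rough illustration, not an official migration tool: the 1.30 factor is the article's approximate figure, the 128K cap is the max output size from the table above, and the function names and the thinking headroom value are hypothetical. Measure real token counts on your own prompts before settling on a limit.

```python
import math

# Article's rough figure: the same text yields ~30% more tokens than on Sonnet 4.6.
TOKENIZER_INFLATION = 1.30
# Max output size quoted in the "At a glance" table.
MAX_OUTPUT_TOKENS = 128_000


def rebaseline_max_tokens(old_limit: int, thinking_headroom: int = 0) -> int:
    """Scale a max_tokens limit tuned for Sonnet 4.6.

    Multiplies by the tokenizer inflation factor, adds optional headroom
    for thinking tokens (a value you choose), and caps at the max output size.
    """
    scaled = math.ceil(old_limit * TOKENIZER_INFLATION)
    return min(scaled + thinking_headroom, MAX_OUTPUT_TOKENS)


def rebaseline_cost(old_cost_usd: float) -> float:
    """Per-token price is unchanged, so cost for the same text scales with token count."""
    return old_cost_usd * TOKENIZER_INFLATION


# A limit of 4,096 tuned for Sonnet 4.6, with a hypothetical 2,000 tokens of thinking headroom.
new_limit = rebaseline_max_tokens(4_096, thinking_headroom=2_000)
print(new_limit)
```

Treat the result as a starting point: adaptive thinking spend varies by task, so the headroom is something to tune against observed truncation rather than a fixed constant.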
Which should you choose?
- High-volume or cost-sensitive agents → Sonnet 5 (and lean on the $2/$10 intro pricing).
- Absolute top coding capability, cost no object → Claude Opus 4.8 (or Claude Fable 5, back online after its export-control suspension).
- You need Terminal-Bench-style task performance → evaluate GPT-5.5 on your workload.
- You need something usable today → Sonnet 5, while GPT-5.6 is still gated.
Frequently asked questions
How much does Claude Sonnet 5 cost?
$3 input / $15 output per million tokens, with an introductory rate of $2/$10 per million through August 31, 2026 — half GPT-5.5's output price.
Is Claude Sonnet 5 better than GPT-5.5 for coding?
On SWE-bench Pro, yes (63.2% vs 58.6%). GPT-5.5 edges it on Terminal-Bench 2.1 (83.4% vs 80.4%). The better choice depends on which benchmark better reflects your workload, and Sonnet 5 has the lower output price either way.
Does Claude Sonnet 5 support a 1M-token context window?
Yes — 1M tokens of context and up to 128K tokens of output.
Should I switch from Sonnet 4.6 to Sonnet 5?
For coding and agentic work, likely yes — but note the new tokenizer produces ~30% more tokens for the same text, so re-baseline your max_tokens and cost estimates, and expect adaptive thinking on by default.
Conclusion
Sonnet 5's pitch is simple and sharp: near-flagship agentic coding, at half the output price, available right now. While the frontier fights over benchmark decimals behind preview gates, Anthropic shipped the model most teams can afford to actually run in production. Are you moving your agents to Sonnet 5? Tell us why (or why not) below.