# Claude Sonnet 4.6: Opus-Level Intelligence at Sonnet Price

**Plutonous** | February 18, 2026 | 13 min read


Tags: Claude Sonnet 4.6, Anthropic, Computer Use, AI Coding, Claude Code, Context Window, Agentic AI

---

**TL;DR:** Claude Sonnet 4.6, released February 17, 2026, is now the default model on claude.ai and Claude Cowork - and it just made Opus-class performance available at Sonnet pricing. Users in early access preferred Sonnet 4.6 over Sonnet 4.5 **70% of the time** and over Opus 4.5 **59% of the time**<sup><a href="#source-1">[1]</a></sup>. It ships with a **1M token context window** in beta, leads all Sonnet models on OSWorld computer use, and developed a novel business strategy on Vending-Bench Arena that no prior model had tried. Pricing stays flat at **$3/$15 per million tokens**. This is the biggest gap between capability and price that Anthropic has ever released.

There's a story Anthropic has been telling for two years: that intelligence should cascade down the model family, not stay locked at the frontier tier. Every Sonnet release is a test of whether that story is true. Claude Sonnet 4.6 is the first time it feels genuinely proven.

Twelve days ago, Opus 4.6 launched as the most capable model in Anthropic's history - the model that dethroned incumbents on agentic benchmarks, rewrote enterprise market share numbers, and commanded $5/$25 per million tokens as the price of admission. Today, Sonnet 4.6 does a significant portion of what Opus does, at 60% less cost, and it's now free for every user on every plan.

That's not a product update. That's a statement about where the frontier actually lives.

> **Why This Matters Now**
>
> Sonnet 4.6 is now the **default model** across all Claude plans, Claude Cowork, Claude Code, and the API - including the free tier. Anthropic upgraded free users to Sonnet 4.6 and added file creation, connectors, skills, and compaction to that tier simultaneously[1]. The gap between "free Claude" and "Claude that matters" just got a lot smaller.


## The Preference Numbers That Should Worry Anthropic's Pricing Team

Anthropic ran head-to-head preference evaluations in Claude Code - one of the most demanding real-world environments for a model, where errors compound over long sessions and instruction following is tested repeatedly across a single context.

The results are stark.

- **70%**: Preferred over Sonnet 4.5
- **59%**: Preferred over Opus 4.5
- **1M tokens**: Context Window
- **$3/$15**: Pricing
- **16 months**: OSWorld improvement
- **Opus-level**: Prompt injection resistance


A 59% preference rate over Opus 4.5 means that for a majority of coding tasks, users actively chose the cheaper model when given both options blind. This isn't just a benchmark win. It's users voting with their attention on real work.

The specific complaints about Sonnet 4.5 that Sonnet 4.6 addresses are worth reading closely: overengineering, laziness, false claims of task completion, hallucinations in long sessions, poor instruction following on multi-step tasks<sup><a href="#source-1">[1]</a></sup>. These aren't edge case failures. They're the core failure modes of every LLM that gets deployed in production. Sonnet 4.6 apparently fixed most of them. At Sonnet pricing.


"Performance that would have previously required reaching for an Opus-class model is now available with Sonnet 4.6."
- Anthropic, February 17, 2026


## Computer Use: From Experimental to Actually Useful

In October 2024, Anthropic launched computer use and called it "still experimental - at times cumbersome and error-prone." That was honest. It was also a 16-month countdown to what Sonnet 4.6 delivers today.

The OSWorld benchmark tests models on real software - Chrome, LibreOffice, VS Code, and more - running on a simulated computer. No special APIs. No purpose-built connectors. The model sees a screen and interacts with it the way a person would: clicking a mouse, typing on a keyboard.


The practical milestone isn't the benchmark number - it's the report from early users. They're seeing **human-level capability** on tasks like navigating complex spreadsheets and filling out multi-step web forms across multiple browser tabs<sup><a href="#source-1">[1]</a></sup>. That's the inflection point. Not "better than before." Human-level on specific, economically valuable categories of tasks.

The prompt injection improvement is equally significant. Malicious content embedded in webpages - the core attack vector for any computer-using AI - is now handled at Opus 4.6-level resistance. A model that can use computers but can be hijacked by any website it visits is not a deployable product. Sonnet 4.6 closes that gap.

**Opus-level** — Prompt injection resistance vs. Sonnet 4.5


## The Vending-Bench Strategy That No Model Had Tried Before

Vending-Bench Arena is a long-horizon planning benchmark that puts models in charge of a simulated business over time, with direct competition between AI models measured by profitability. It's the closest thing to a real-world test of strategic reasoning that exists in the benchmark ecosystem.

Sonnet 4.6 didn't just win. It developed a strategy that no prior model had used.

Most models optimize for short-term profit from the start - a reasonable heuristic when you don't know how long the game runs. Sonnet 4.6 took a different approach: it invested aggressively in capacity for the first ten simulated months, spending significantly more than its competitors, absorbing a profitability deficit - and then pivoted sharply to maximize returns in the final stretch<sup><a href="#source-1">[1]</a></sup>.

The timing of that pivot was what won it. Not just the strategy, but knowing when to switch.


"Sonnet 4.6 invested heavily in capacity for the first ten simulated months, spending significantly more than its competitors, and then pivoted sharply to focus on profitability in the final stretch. The timing of this pivot helped it finish well ahead of the competition."
- Anthropic, February 17, 2026


This matters beyond the benchmark. It suggests the 1M context window isn't just storage - Sonnet 4.6 appears to use long context to reason more effectively about sequences of decisions over time. That's the behavior enterprises actually need from agentic models: not just completing one step well, but managing a multi-phase plan coherently across an entire session.

## Benchmarks: Where Sonnet 4.6 Sits in the Current Landscape


## The 1M Context Window Is Bigger Than It Sounds

Every recent frontier model advertises a large context window. The number that actually matters is retrieval accuracy within that context - whether the model can actually find and use information buried deep in a long document.

Opus 4.6 showed that Anthropic can build a model that scores 76% on MRCR v2 at 1M tokens, while a competitor's 2M-token model scored 26.3% on the same test. Sonnet 4.6 brings the same 1M context window to a model that costs 40% less.

What does 1M tokens actually hold? Anthropic puts it plainly: entire codebases, lengthy contracts, or dozens of research papers in a single request<sup><a href="#source-1">[1]</a></sup>. For engineering teams, that means asking questions about a full repository without chunking. For legal and financial teams, that means feeding an entire contract alongside precedents without losing context between documents.

The Vending-Bench result suggests this isn't just theoretical - the long context appears to enable qualitatively different reasoning about long-horizon tasks, not just longer storage of facts.

## What's New on the Platform

Sonnet 4.6 ships with a full set of platform updates that extend beyond the model itself:

**Claude Developer Platform:**
- Extended thinking and adaptive thinking both supported
- Context compaction in beta: automatically summarizes older context as conversations approach limits - effective context length extends beyond 1M for long-running sessions

**API tools (now GA):**
- Code execution
- Memory
- Programmatic tool calling
- Tool search
- Tool use examples
- Web search and fetch now auto-write and execute code to filter and process results - keeping only relevant content in context, improving both response quality and token efficiency

**Claude in Excel:**
- Now supports MCP connectors: S&P Global, LSEG, Daloopa, PitchBook, Moody's, FactSet
- If you've set up MCP connectors in Claude.ai, they work automatically in Excel
- Available on Pro, Max, Team, and Enterprise plans

## Where Opus 4.6 Still Wins

Anthropic is explicit about this, which is refreshing. Sonnet 4.6 is not a replacement for Opus 4.6 across the board. The recommendation holds for specific task categories:

- **Deepest reasoning** - problems where getting it exactly right is paramount, not just approximately right
- **Codebase refactoring** - large, interconnected changes where a single error invalidates a full session's work
- **Coordinating multiple agents** - Opus 4.6's Agent Teams capability, where multiple Claude instances collaborate autonomously, remains in a separate tier
- **Highest-stakes decisions** - any task where the cost of failure exceeds the cost of the premium model

For everything else - the large middle of the enterprise AI workload - Sonnet 4.6 is now the answer.

## The Broader Shift This Represents

Two years ago, the AI model stack worked like enterprise software: expensive frontier at the top, a capability cliff below it, and a huge premium for the best version. Anthropic has spent those two years systematically collapsing that cliff.

Sonnet 4.6 is the clearest evidence yet that it's working. A 59% preference rate over a model that was Anthropic's frontier just three months ago isn't incremental progress. It's what happens when the learnings from building Opus filter down into the next tier faster than any prior model generation cycle has managed.


For most teams that defaulted to Opus for serious work: you now have to justify that choice again. The old heuristic - "use Opus when it matters, Sonnet when it doesn't" - no longer maps cleanly to the capability gap between them.

That's a genuinely unusual position for Anthropic's own pricing to be in. And it's almost certainly intentional.

---

*Source: [LLM Rumors](https://www.llmrumors.com/news/claude-sonnet-46-opus-level-intelligence-sonnet-price)*