1M Context in Claude - a Hands-On Breakdown

Sergey Golubev 2026-03-14 3 min read

Anthropic shipped 1M context for Opus 4.6 and Sonnet 4.6. It's on by default, with no long-context surcharge, and available now in Claude Code on Max, Team, and Enterprise plans.

Why 200K wasn’t enough

With 200K, you actually get ~167K usable tokens (per claudefa.st estimates). That's fine for a small project: one service, a couple of files, no gymnastics.

The real problem was long sessions. Compaction kicks in at ~83.5% window fill (claudefa.st data). During active work, that’s every 20-30 minutes. Details vanish. You re-explain context from scratch.
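The arithmetic behind those numbers is easy to sketch. This is a back-of-the-envelope calculation using the claudefa.st ~83.5% fill threshold, not an official formula:

```python
# Rough estimate: tokens you can accumulate before auto-compaction triggers,
# using the ~83.5% window-fill threshold reported by claudefa.st.
COMPACTION_THRESHOLD = 0.835

def usable_tokens(window: int, threshold: float = COMPACTION_THRESHOLD) -> int:
    """Approximate usable tokens before compaction kicks in."""
    return int(window * threshold)

print(usable_tokens(200_000))    # → 167000, matching the ~167K figure above
print(usable_tokens(1_000_000))  # → 835000, roughly the ~830K working tokens at 1M
```

The same fraction at 1M lands near the ~830K working-token estimate quoted later in this post.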

Classic scenario: Claude Code burns 100K+ tokens just searching through Datadog, Braintrust, and databases. Then compaction hits, the details are gone, and you're debugging in circles.

What actually changed

Entire monorepo at once. Per claudefa.st, ~830K working tokens available out of 1M. Full codebase + docs + tests, all at once.

Long sessions without memory loss. 15% fewer compaction events (claudefa.st data). Compaction is token-based, not time-based - you can leave a session overnight and the agent remembers everything in the morning.

Large PR reviews. Adhyyan Sekhsaria from Cognition described the problem: “Large diffs didn’t fit in 200K… the agent would split context and lose cross-file dependencies.” With 1M - single pass. You see how a change in one file breaks an interface in another.

Documents. 600-page PDFs instead of 100. Five versions of a 100-page agreement in one session.

Nuances

MRCR v2 (long-context retrieval benchmark) - 78.3%. Best among frontier models, but not 100%. Not sure yet how critical that is in daily work - still testing.

Pricing: Opus $5/$25 per million input/output tokens, Sonnet $3/$15, with no long-context surcharge. I went with Opus for complex tasks and Sonnet for routine work - the price difference is ~1.7x, but Opus makes fewer mistakes on long context.

Izzy Miller, AI Engineer at Hex, spotted an interesting paradox: “Raised the window from 200K to 500K - the agent became more efficient and uses fewer total tokens.”
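To make the price difference concrete, here is a small cost sketch at the published per-million-token rates. The token counts are invented for illustration:

```python
# Per-million-token prices from the post: Opus $5 in / $25 out, Sonnet $3 in / $15 out.
PRICES = {"opus": (5.00, 25.00), "sonnet": (3.00, 15.00)}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at (input_rate, output_rate) per 1M tokens."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Hypothetical request: an 800K-token context with a 4K-token reply.
print(round(request_cost("opus", 800_000, 4_000), 2))    # → 4.1
print(round(request_cost("sonnet", 800_000, 4_000), 2))  # → 2.46
```

For context-heavy requests the ratio stays close to the ~1.7x difference in the input rates, since output tokens are a small fraction of the bill.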

One limitation: 1M isn’t available in claude.ai chats yet. API and Claude Code only. Cursor hasn’t updated either.

1M was already available in beta - I tested it, and it worked well on my tasks. But beta had a long-context surcharge and required a beta header. GA removed both - now it's just the default.

What I took away

The main thing: I no longer have to be so conservative with context or stress about which files to include. You can just throw more in. But that doesn’t mean you should burn through the full million - keeping context controlled still matters.

Sources

  1. 1M Context GA - Anthropic Blog
  2. ClaudeFast Guide: 1M Context
  3. InfoQ: Opus 4.6 Context Compaction
  4. Reddit Discussion
  5. Siskar Analysis: 1 Trillion Token Context