Claude Opus 4.6: The Model That Tanked the Stock Market and Now Wants Your Job
Something happened this week that we haven’t seen before: an AI model update triggered a genuine stock market panic. Not a one-day scare like DeepSeek. A $285 billion wipeout in market cap in a single session. Thomson Reuters dropped 15.8%. LegalZoom fell 19.7%. RELX, parent company of LexisNexis, shed 14%. Salesforce, FactSet, Workday — all deep red.
The trigger wasn’t Opus 4.6. It was what came right before: Claude Cowork plugins.
And now, with Opus 4.6 fresh off the press, the question is no longer whether AI will change knowledge work. It’s when and how much.
Let’s break it down.
What is Claude Opus 4.6?
Claude Opus 4.6 is Anthropic’s most advanced model, released on February 5, 2026. It’s a direct upgrade from Opus 4.5 (shipped last November) and belongs to the Claude 4.5 family, which also includes Sonnet 4.5 and Haiku 4.5.
It’s not a new model built from scratch. Same brain, better tuned. It thinks longer before answering, plans multi-step tasks better, and most importantly: it doesn’t lose the thread in massive conversations.
Key improvements
1 million token context (beta). This is new for the Opus tier. For perspective: 1 million tokens is roughly 750,000 words — about 10 full novels in a single conversation. The previous model topped out at 200K. This means you can feed it an entire codebase, or a full legal dossier, and it’ll process it without losing track.
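Those figures follow from the usual rule of thumb of roughly 0.75 English words per token (about 1.3 tokens per word). A quick back-of-the-envelope check, assuming that ratio (actual tokenization varies by text and tokenizer):

```python
# Rule of thumb: ~0.75 English words per token. This is an approximation,
# not the model's actual tokenizer; real counts vary with the text.
WORDS_PER_TOKEN = 0.75

def approx_words(tokens: int) -> int:
    """Estimate how many words fit in a given token budget."""
    return int(tokens * WORDS_PER_TOKEN)

def approx_tokens(words: int) -> int:
    """Estimate how many tokens a given word count consumes."""
    return round(words / WORDS_PER_TOKEN)

print(approx_words(1_000_000))  # -> 750000 words in a 1M-token context
print(approx_tokens(80_000))    # an ~80k-word novel costs roughly 107k tokens
```

At ~80,000 words per novel, a 1M-token window holds about nine of them, which is where the "about 10 full novels" figure comes from.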
And it’s not just about “accepting” more text. On MRCR v2 — a benchmark that measures the ability to retrieve information buried deep in massive contexts — Opus 4.6 scores 76%. Sonnet 4.5 scores 18.5%. This isn’t incremental improvement; it’s a qualitative leap in how much context the model can actually use without degrading.
Better at code, more agentic. The model leads Terminal-Bench 2.0 (agentic coding benchmark), is better at reviewing and debugging its own code, and sustains quality over longer work sessions. SentinelOne reported it handled a million-line codebase migration “like a senior engineer”, planning the strategy and adapting on the fly.
Better at knowledge work. On GDPval-AA — a benchmark measuring real-world performance in finance, legal, and other domains — Opus 4.6 beats OpenAI’s GPT-5.2 by 144 Elo points and its own predecessor Opus 4.5 by 190 points. It’s also the top model on BrowseComp, which evaluates the ability to find hard-to-locate information on the web.
Deeper reasoning. The model thinks more. Sometimes too much — Anthropic recommends lowering the effort setting (/effort) to medium if it takes too long on simple questions. But for complex problems, this makes a real difference. It leads Humanity’s Last Exam, a multidisciplinary test designed to evaluate expert-level reasoning.
Vulnerability discovery. Before launch, Anthropic’s red team gave it security analysis tools and turned it loose on open-source code. It found over 500 previously unknown zero-day vulnerabilities, validated by external researchers. Bugs in GhostScript, OpenSC, CGIF — stuff that traditional fuzzers missed. This is a double-edged sword: incredible for defenders, concerning if it falls into the wrong hands.
What’s changed in the user experience
This is where things get interesting for those of us who use Claude daily. Because Opus 4.6 isn’t just a better model — it ships with interface changes and new tools that fundamentally change how you work with it.
Message composition: the editable canvas
When you ask Claude to draft an email, Slack message, or any text, it no longer just dumps it into the chat. It presents it in an editable canvas — you can modify it directly before copying or sending. For emails, there’s a button to open it straight in Gmail.
This seems minor, but it changes the workflow. Before: Claude writes → you copy → paste into Gmail → edit. Now: Claude writes → you edit in place → send. One less step, less friction.
It can also generate multiple versions of the same message with different approaches. Need help with a tricky email? It’ll give you a direct version and a diplomatic one, each labeled. You pick whichever fits.
Interactive questions instead of text walls
Another new feature: when Claude needs more context or wants to help you decide, instead of writing a paragraph with options, it shows an interactive modal with questions and clickable choices. If you’re choosing between options, it presents them as buttons instead of a numbered list.
This speeds up back-and-forth conversations significantly. Instead of “type 1, 2, or 3”, you click and move on.
Native file creation
Claude could already create documents, but now it’s considerably more capable: it generates Word (.docx), PowerPoint (.pptx), Excel (.xlsx), PDFs, and executable code. For those of us working with data, being able to say “build me an Excel with this structure” or “put together a PowerPoint with these numbers” and get a downloadable file is a major step forward.
Speaking of PowerPoint: Claude in PowerPoint launches in preview. You can work directly inside PowerPoint with Claude as a sidebar assistant, creating and editing slides without leaving the application. It reads your layouts, fonts, and templates, generating content that respects your corporate design.
Agent Teams
This is Claude Code only for now, but it’s significant: instead of having a single agent working sequentially, you can spin up a team of agents that split tasks and work in parallel. One on the frontend, another on the API, another on the migration. Coordinating with each other.
Scott White, Anthropic’s Head of Product, compares it to having a team of competent humans working for you. The difference is these ones don’t take coffee breaks or get distracted by Slack.
Compaction and adaptive thinking
Two more technical but relevant improvements:
Compaction: The model can summarize its own context to keep working on long tasks without hitting token limits. Think of it as the model taking notes on what it’s already done so it doesn’t have to re-read everything.
Adaptive thinking: The model automatically adjusts how much it “thinks” based on task complexity. For something simple, it’s fast. For something complex, it takes its time. This balances cost, latency, and quality without you having to configure anything.
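Neither mechanism is something you implement yourself, but the idea behind compaction is easy to sketch. A toy illustration, not Anthropic's actual implementation — `summarize` here is a hypothetical stand-in for asking the model to condense its own earlier turns:

```python
def summarize(messages: list[dict]) -> dict:
    """Hypothetical stand-in for a model call that condenses old turns.
    In the real system, the model itself produces this summary."""
    return {"role": "system",
            "content": f"[summary of {len(messages)} earlier messages]"}

def compact(history: list[dict], max_messages: int = 50,
            keep_recent: int = 10) -> list[dict]:
    """Once the history grows past a threshold, replace the oldest turns
    with a single summary while keeping the most recent turns verbatim,
    so the agent can keep working without hitting context limits."""
    if len(history) <= max_messages:
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(old)] + recent

# A 100-message session compacts to 1 summary + the 10 latest turns.
session = [{"role": "user", "content": f"step {i}"} for i in range(100)]
print(len(compact(session)))  # -> 11
```

The design point is the same one the article describes: the notes replace the raw transcript, so the effective context stays bounded no matter how long the task runs.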
Cowork and the “SaaSpocalypse”
But none of this caused the stock market panic. What did was Cowork plugins.
Cowork launched on January 12 as a preview. The concept is simple: you give Claude access to a folder on your desktop and it reads, edits, and creates files. Like an intern with superpowers. On January 30, Anthropic added industry-specific plugins: legal, sales, finance, marketing, data, customer support.
The legal plugin is what broke everything. It can review contracts, triage NDAs, track compliance. All configurable for your organization. Anthropic is clear that results should be reviewed by licensed attorneys. But Wall Street didn’t stop to read the fine print.
On February 3, the market cratered. Thomson Reuters lost 15.8% in a single day — its worst drop ever. LegalZoom fell 19.7%. RELX shed 14%. A software ETF had its worst session since April. Bloomberg reports $285 billion evaporated. They’re already calling it the “SaaSpocalypse”.
The market’s logic: if a general-purpose AI model can do what specialized software costing thousands per year does, why pay for that software?
Jensen Huang from Nvidia calls it “illogical”. A JP Morgan analyst calls it “a logical leap”. Others point out that when DeepSeek shook the market, Nvidia lost $600B and then recovered to reach $5 trillion.
But there’s something different this time: it’s not just hype. Cowork plugins do real things, right now, for $100/month. A junior lawyer costs considerably more than that.
Pricing and plans
Model pricing hasn’t changed: $5 per million input tokens, $25 output on the API. Prompt caching saves up to 90%, batch processing 50%.
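Using those list prices, here is a rough per-request cost calculator. The only inputs are the figures above ($5/M input, $25/M output, up to 90% savings on cached input reads); real bills may differ with cache-write surcharges and tier details not covered here:

```python
INPUT_PER_M = 5.00           # USD per million input tokens
OUTPUT_PER_M = 25.00         # USD per million output tokens
CACHE_READ_DISCOUNT = 0.90   # prompt caching: up to 90% off cached input

def request_cost(input_tokens: int, output_tokens: int,
                 cached_input_tokens: int = 0) -> float:
    """Estimate the cost of one API call, treating cached input tokens
    as billed at 10% of the normal input rate."""
    fresh = input_tokens - cached_input_tokens
    cost = fresh / 1e6 * INPUT_PER_M
    cost += cached_input_tokens / 1e6 * INPUT_PER_M * (1 - CACHE_READ_DISCOUNT)
    cost += output_tokens / 1e6 * OUTPUT_PER_M
    return round(cost, 4)

# Feeding a full 1M-token context with a 2K-token answer:
print(request_cost(1_000_000, 2_000))           # -> 5.05
# Same request with 900K of those tokens served from cache:
print(request_cost(1_000_000, 2_000, 900_000))  # -> 1.0
```

The second number is why caching matters at this context size: reusing a large dossier or codebase across turns cuts the dominant input cost by an order of magnitude.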
For consumers:
- Free: Access to Sonnet 4.5 with strict limits. Around 30-100 messages per day.
- Pro ($20/mo): Access to all models (Opus 4.6, Sonnet 4.5, Haiku 4.5), Google Workspace, Claude Code, 5x more usage than free.
- Max 5x ($100/mo): 5 times more usage than Pro. Cowork included.
- Max 20x ($200/mo): 20 times more. For people who use Claude all day.
- Team ($25-30/seat/mo): For teams of 5+. Centralized admin, shared projects.
- Team Premium ($150/seat/mo): Includes Claude Code.
- Enterprise: Custom pricing. SSO, SCIM, audit logs, 500K context, compliance.
The API model string is claude-opus-4-6. Also available on AWS Bedrock, Google Vertex AI, and Microsoft Foundry.
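That model string slots into the standard Anthropic Messages API. A minimal sketch of the request body — the model ID comes from the article, the rest is the documented Messages API shape (`model`, `max_tokens`, `messages`); you would send this as JSON to `POST https://api.anthropic.com/v1/messages` with your own API key header:

```python
import json

# Request body for the Anthropic Messages API; send with any HTTP
# client plus an x-api-key header and anthropic-version header.
payload = {
    "model": "claude-opus-4-6",
    "max_tokens": 1024,
    "messages": [
        {"role": "user",
         "content": "Summarize this legal dossier in five bullet points."}
    ],
}

print(json.dumps(payload, indent=2))
```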
My take: what actually matters
I’ve been using Claude for months to write this blog, to work with data, to code. I’ve watched the evolution from Sonnet 3.5 to here. And here’s what I see:
The leap isn’t in the benchmarks. It’s in consistency. In being able to hand it a complex task and having it finish without micromanagement. In maintaining context during long sessions without degrading. In generating files that are actually usable, not “almost right”.
The editable message canvas and question modals are UX details that seem minor but change how you work. Less copy-pasting, less back-and-forth, more flow.
Agent teams are the future, though right now they’re in preview and code-only. When this reaches general knowledge work, the impact will be massive.
And the “SaaSpocalypse”… Look, the market overreacts. Always has. But the direction is clear: general-purpose models are eating into specialized software territory. Not tomorrow, not everything, but steadily. If your business sells software that does one specific thing an LLM can already do, you have a problem. Not today. But in 2-3 years.
What I find most revealing is the vulnerability discovery. Over 500 zero-days found out of the box, without specific instructions. That’s not a model answering questions. That’s a model investigating, reasoning, and discovering things that humans with traditional tools missed. For better and for worse.
Is Opus 4.6 worth it?
If you use Claude for serious work — code, data analysis, documentation, research — yes. It’s better than Opus 4.5 in everything that matters. The 1M token context is a game-changer for large codebases or lengthy documents.
If you’re a casual user, Sonnet 4.5 is still excellent and faster. Opus 4.6 thinks more, which means it sometimes takes longer. For simple questions, it’s like bringing a bazooka to a knife fight.
If you’re a developer, the API pricing is competitive and the new features (compaction, effort control, adaptive thinking) give you more control over the cost-quality tradeoff.
And if you’re a specialized software company selling subscriptions for thousands per year… it might be a good time to rethink your business model.
February 2026
This article was written with Claude Opus 4.6. Yes, the very model I’m writing about. No, I didn’t ask it to toot its own horn. Data comes from Anthropic, CNBC, TechCrunch, Bloomberg, Axios, CNN, VentureBeat, and Fast Company. The opinions are mine.