The Lie You'll Hear First
"Just use [Model X] for everything."
You'll hear this from Twitter threads, YouTube thumbnails, and AI newsletters. It's wrong. Not because Model X is bad, but because "everything" is the problem.
An operator who sends morning summaries, runs parallel coding agents, replies to customers on Slack, triages Sentry alerts, and writes marketing copy doesn't have one job. They have five jobs with completely different performance, cost, and latency requirements. Routing all of them through one model is like hiring a surgeon to also answer phones, stock shelves, and mop floors.
The first serious thing you learn as an operator: route by job, not by loyalty.
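"Route by job" is mechanical enough to sketch in code. A minimal example, assuming a handful of job types and lane names; the specific model assignments here are illustrative placeholders, not recommendations from this chapter:

```python
# Illustrative job-to-lane router. Each job type gets the model lane
# suited to its performance/cost/latency profile; lane names are
# hypothetical shorthand, not real API model IDs.
ROUTES = {
    "morning_summary": "haiku-lane",    # cheap, fast summarization
    "coding_agent":    "opus-lane",     # deep reasoning, architecture
    "customer_reply":  "sonnet-lane",   # balanced speed and quality
    "alert_triage":    "flash-lane",    # high volume, low latency
    "marketing_copy":  "mini-lane",     # throughput at low cost
}

def route(job_type: str) -> str:
    """Pick a lane by job; fall back to a balanced default lane."""
    return ROUTES.get(job_type, "sonnet-lane")
```

The point of the default branch is the same as the surgeon analogy: unknown work goes to a balanced generalist lane, not the most expensive specialist.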
The Models That Actually Matter Right Now
These are the frontier models available via API as of April 2026, with real pricing pulled from official docs. Not vibes, not benchmarks someone screenshotted — the numbers you'll see on your invoice.
Anthropic Claude
| Model | Input / Output (per MTok) | Context | Max Output | Best For |
|---|---|---|---|---|
| Claude Opus 4.6 | $5 / $25 | 1M tokens | 128K | Complex agentic work, deep reasoning, coding architecture |
| Claude Sonnet 4.6 | $3 / $15 | 1M tokens | 64K | Best speed/intelligence balance, daily operator work |
| Claude Haiku 4.5 | $1 / $5 | 200K tokens | 64K | Fast cheap work, summaries, classification |
Opus 4.6 is Anthropic's strongest broadly available model. Sonnet 4.6's training data goes through January 2026 — it actually knows more recent events than Opus. Both support extended thinking and adaptive thinking.
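To make the table concrete, here is a minimal per-call cost sketch using the per-MTok rates above. The model keys are shorthand for this example, not official API identifiers:

```python
# (input $/MTok, output $/MTok) from the pricing table above.
CLAUDE_PRICES = {
    "opus-4.6":   (5.00, 25.00),
    "sonnet-4.6": (3.00, 15.00),
    "haiku-4.5":  (1.00, 5.00),
}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one call: tokens / 1e6 times the per-MTok rate."""
    inp, out = CLAUDE_PRICES[model]
    return input_tokens / 1e6 * inp + output_tokens / 1e6 * out
```

A 10K-token-in / 1K-token-out morning summary costs $0.015 on Haiku versus $0.075 on Opus. Five times the price for the same summary is the whole argument for routing that job to the cheap lane.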
OpenAI GPT
| Model | Input / Output (per MTok) | Context | Best For |
|---|---|---|---|
| GPT-5.4 | $2.50 / $15 (short) · $5 / $22.50 (long) | — | Complex reasoning, coding, flagship tasks |
| GPT-5.4 mini | $0.75 / $4.50 | 400K tokens | High-throughput work at lower cost |
| GPT-5.4 nano | $0.20 / $1.25 | — | Edge/embedded, cheapest possible |
| GPT-5.4 pro | $30 / $180 | — | Maximum capability, premium pricing |
GPT-5.4 was released on March 5, 2026. OpenAI's pricing page separates short-context and long-context rates, and cached input can be much cheaper. For coding, treat openai-codex/gpt-5.4 as a separate lane instead of assuming the direct API and Codex subscription behave the same.
Google Gemini
| Model | Input / Output (per MTok) | Context | Best For |
|---|---|---|---|
| Gemini 3.1 Pro preview | $2 / $12 (≤200K) · $4 / $18 (>200K) | 1M tokens | Strongest Google reasoning, multimodal |
| Gemini 2.5 Pro | $1.25 / $10 (≤200K) | 1M tokens | Proven, stable, good value |
| Gemini 2.5 Flash | $0.30 / $2.50 | 1M tokens | Fast daily work, great cost ratio |
| Gemini 2.5 Flash-Lite | $0.10 / $0.40 | — | Cheapest serious model on the market |
Gemini is natively multimodal — text, images, video, audio, and code in the same model. If your workflow involves screenshots, image generation, or video understanding, Google should be your media lane. It also supports context caching, but the official docs price that by model, token count, and storage duration, not one universal savings number.
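The tiered Gemini 3.1 Pro rates in the table can be sketched as a cost function. One assumption to flag: this applies the >200K rates to both input and output once the prompt crosses the threshold, which is how the table reads, but verify against the official pricing page before budgeting on it:

```python
def gemini_31_pro_cost(input_tokens: int, output_tokens: int) -> float:
    """Per-call cost using the tiered rates from the table above.
    Assumes the >200K-input tier raises both input and output rates."""
    if input_tokens <= 200_000:
        inp, out = 2.00, 12.00   # $/MTok, prompt <= 200K tokens
    else:
        inp, out = 4.00, 18.00   # $/MTok, prompt > 200K tokens
    return input_tokens / 1e6 * inp + output_tokens / 1e6 * out
```

Crossing the 200K line doesn't just double the marginal input rate; it also raises the output rate, so a long-context call costs disproportionately more than a short one.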
Also in the Full Chapter
The full playbook chapter also covers: benchmark caveats, the Grok 4.20 vs Grok 4.1 Fast pricing split, DeepSeek's public V3.2 line, and the full Routing Table for default ops, coding, bulk work, media, and budget lanes.
What Actually Happened With Anthropic and OpenClaw
This matters because it changed how every serious operator thinks about billing-path risk and fallback planning.
On April 4, 2026, Anthropic emailed Claude subscribers that effective immediately, they would "no longer be able to use your Claude subscription limits for third-party harnesses including OpenClaw." Instead, this path would require "Extra Usage" — pay-as-you-go billing separate from the subscription.
Boris Cherny, Anthropic's head of Claude Code, stated on X that Anthropic's "subscriptions weren't built for the usage patterns of these third-party tools." OpenClaw creator Peter Steinberger said he and OpenClaw board member Dave Morin "tried to talk sense into Anthropic" but could only delay the pricing change by a week.
What Anthropic did NOT do:
- They did not ban OpenClaw outright
- They did not block the Anthropic API key path
- Claude models are still fully available through the standard Anthropic API
What broke was the assumption that a consumer Claude subscription could power production-grade agent workflows through third-party tools indefinitely. That assumption was always fragile. On April 4, it became officially unsupported without Extra Usage.
The lesson every operator should internalize: If your whole stack depends on one vendor's consumer subscription remaining compatible with your third-party tooling, you don't have a production setup. You have a convenience that's borrowing time. Use the documented, explicit path.
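The fallback discipline that lesson implies can be sketched in a few lines. This is a minimal shape, not a production client: `call_anthropic` and `call_openai` are hypothetical stand-ins for real SDK calls, and the first one simulates an outage so the chain is visible:

```python
# Minimal provider fallback chain. The provider functions below are
# hypothetical placeholders for real SDK calls on documented API paths.
def call_anthropic(prompt: str) -> str:
    raise ConnectionError("primary provider unavailable")  # simulated outage

def call_openai(prompt: str) -> str:
    return f"[fallback reply to: {prompt}]"

def complete(prompt: str) -> str:
    """Try each provider in order; raise only if every one fails."""
    last_error = None
    for provider in (call_anthropic, call_openai):
        try:
            return provider(prompt)
        except Exception as err:
            last_error = err
    raise RuntimeError("all providers failed") from last_error
```

The chain only works if every link is a documented, explicitly billed path. A fallback that routes through someone's consumer subscription is just a second convenience borrowing time.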
The full chapter includes: The Routing Table: What to Use for What, The Fallback Rule: Why You Need More Than One Provider, Five Rules That Survive Model Churn, and a benchmark section that only quotes numbers when the vendor actually published them.