The Lie You'll Hear First
"Just use [Model X] for everything."
You'll hear this from Twitter threads, YouTube thumbnails, and AI newsletters. It's wrong. Not because Model X is bad, but because "everything" is the problem.
An operator who sends morning summaries, runs parallel coding agents, replies to customers on Slack, triages Sentry alerts, and writes marketing copy doesn't have one job. They have five jobs with completely different performance, cost, and latency requirements. Routing all of them through one model is like hiring a surgeon to also answer phones, stock shelves, and mop floors.
The first serious thing you learn as an operator: route by job, not by loyalty.
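"Route by job" is mechanical enough to sketch in code. A minimal example, assuming a handful of job types and lane names; the specific model assignments here are illustrative placeholders, not recommendations from this chapter:

```python
# Illustrative job-to-lane router. Each job type gets the model lane
# suited to its performance/cost/latency profile; lane names are
# hypothetical shorthand, not real API model IDs.
ROUTES = {
    "morning_summary": "haiku-lane",    # cheap, fast summarization
    "coding_agent":    "opus-lane",     # deep reasoning, architecture
    "customer_reply":  "sonnet-lane",   # balanced speed and quality
    "alert_triage":    "flash-lane",    # high volume, low latency
    "marketing_copy":  "mini-lane",     # throughput at low cost
}

def route(job_type: str) -> str:
    """Pick a lane by job; fall back to a balanced default lane."""
    return ROUTES.get(job_type, "sonnet-lane")
```

The point of the default branch is the same as the surgeon analogy: unknown work goes to a balanced generalist lane, not the most expensive specialist.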
The Models That Actually Matter Right Now
These are the frontier models available via API as of April 2026, with real pricing pulled from official docs. Not vibes, not benchmarks someone screenshotted — the numbers you'll see on your invoice.
Anthropic Claude
| Model | Input / Output (per MTok) | Context | Max Output | Best For |
|---|---|---|---|---|
| Claude Opus 4.6 | $5 / $25 | 1M tokens | 128K | Complex agentic work, deep reasoning, coding architecture |
| Claude Sonnet 4.6 | $3 / $15 | 1M tokens | 64K | Best speed/intelligence balance, daily operator work |
| Claude Haiku 4.5 | $1 / $5 | 200K tokens | 64K | Fast cheap work, summaries, classification |
Opus 4.6 is Anthropic's strongest broadly available model. Sonnet 4.6's training data goes through January 2026 — it actually knows more recent events than Opus. Both support extended thinking and adaptive thinking.
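To make the table concrete, here is a minimal per-call cost sketch using the per-MTok rates above. The model keys are shorthand for this example, not official API identifiers:

```python
# (input $/MTok, output $/MTok) from the pricing table above.
CLAUDE_PRICES = {
    "opus-4.6":   (5.00, 25.00),
    "sonnet-4.6": (3.00, 15.00),
    "haiku-4.5":  (1.00, 5.00),
}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one call: tokens / 1e6 times the per-MTok rate."""
    inp, out = CLAUDE_PRICES[model]
    return input_tokens / 1e6 * inp + output_tokens / 1e6 * out
```

A 10K-token-in / 1K-token-out morning summary costs $0.015 on Haiku versus $0.075 on Opus. Five times the price for the same summary is the whole argument for routing that job to the cheap lane.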
OpenAI GPT
| Model | Input / Output (per MTok) | Context | Best For |
|---|---|---|---|
| GPT-5.4 | $2.50 / $15 (short) · $5 / $22.50 (long) | — | Complex reasoning, coding, flagship tasks |
| GPT-5.4 mini | $0.75 / $4.50 | 400K tokens | High-throughput work at lower cost |
| GPT-5.4 nano | $0.20 / $1.25 | — | Edge/embedded, cheapest possible |
| GPT-5.4 pro | $30 / $180 | — | Maximum capability, premium pricing |
GPT-5.4 was released on March 5, 2026. OpenAI's pricing page separates short-context and long-context rates, and cached input can be much cheaper. For coding, treat openai-codex/gpt-5.4 as a separate lane instead of assuming the direct API and Codex subscription behave the same.
Google Gemini
| Model | Input / Output (per MTok) | Context | Best For |
|---|---|---|---|
| Gemini 3.1 Pro preview | $2 / $12 (≤200K) · $4 / $18 (>200K) | 1M tokens | Strongest Google reasoning, multimodal |
| Gemini 2.5 Pro | $1.25 / $10 (≤200K) | 1M tokens | Proven, stable, good value |
| Gemini 2.5 Flash | $0.30 / $2.50 | 1M tokens | Fast daily work, great cost ratio |
| Gemini 2.5 Flash-Lite | $0.10 / $0.40 | — | Cheapest serious model on the market |
Gemini is natively multimodal — text, images, video, audio, and code in the same model. If your workflow involves screenshots, image generation, or video understanding, Google should be your media lane. It also supports context caching, but the official docs price that by model, token count, and storage duration, not one universal savings number.
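The tiered Gemini 3.1 Pro rates in the table can be sketched as a cost function. One assumption to flag: this applies the >200K rates to both input and output once the prompt crosses the threshold, which is how the table reads, but verify against the official pricing page before budgeting on it:

```python
def gemini_31_pro_cost(input_tokens: int, output_tokens: int) -> float:
    """Per-call cost using the tiered rates from the table above.
    Assumes the >200K-input tier raises both input and output rates."""
    if input_tokens <= 200_000:
        inp, out = 2.00, 12.00   # $/MTok, prompt <= 200K tokens
    else:
        inp, out = 4.00, 18.00   # $/MTok, prompt > 200K tokens
    return input_tokens / 1e6 * inp + output_tokens / 1e6 * out
```

Crossing the 200K line doesn't just double the marginal input rate; it also raises the output rate, so a long-context call costs disproportionately more than a short one.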
Also in the Full Chapter
The full playbook chapter also covers: benchmark caveats, the Grok 4.20 vs Grok 4.1 Fast pricing split, DeepSeek's public V3.2 line, and the full Routing Table for default ops, coding, bulk work, media, and budget lanes.
What Actually Happened With Anthropic and OpenClaw
This matters because it changed how every serious operator thinks about billing-path risk and fallback planning.
On April 4, 2026, Anthropic emailed Claude subscribers that effective immediately, they would "no longer be able to use your Claude subscription limits for third-party harnesses including OpenClaw." Instead, this path would require "Extra Usage" — pay-as-you-go billing separate from the subscription.
Boris Cherny, Anthropic's head of Claude Code, stated on X that Anthropic's "subscriptions weren't built for the usage patterns of these third-party tools." OpenClaw creator Peter Steinberger said he and OpenClaw board member Dave Morin "tried to talk sense into Anthropic" but could only delay the pricing change by a week.
What Anthropic did NOT do:
- They did not ban OpenClaw outright
- They did not block the Anthropic API key path
- Claude models are still fully available through the standard Anthropic API
What broke was the assumption that a consumer Claude subscription could power production-grade agent workflows through third-party tools indefinitely. That assumption was always fragile. On April 4, it became officially unsupported without Extra Usage.
The lesson every operator should internalize: If your whole stack depends on one vendor's consumer subscription remaining compatible with your third-party tooling, you don't have a production setup. You have a convenience that's borrowing time. Use the documented, explicit path.
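The fallback discipline that lesson implies can be sketched in a few lines. This is a minimal shape, not a production client: `call_anthropic` and `call_openai` are hypothetical stand-ins for real SDK calls, and the first one simulates an outage so the chain is visible:

```python
# Minimal provider fallback chain. The provider functions below are
# hypothetical placeholders for real SDK calls on documented API paths.
def call_anthropic(prompt: str) -> str:
    raise ConnectionError("primary provider unavailable")  # simulated outage

def call_openai(prompt: str) -> str:
    return f"[fallback reply to: {prompt}]"

def complete(prompt: str) -> str:
    """Try each provider in order; raise only if every one fails."""
    last_error = None
    for provider in (call_anthropic, call_openai):
        try:
            return provider(prompt)
        except Exception as err:
            last_error = err
    raise RuntimeError("all providers failed") from last_error
```

The chain only works if every link is a documented, explicitly billed path. A fallback that routes through someone's consumer subscription is just a second convenience borrowing time.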
The full chapter includes: The Routing Table: What to Use for What, The Fallback Rule: Why You Need More Than One Provider, Five Rules That Survive Model Churn, and a benchmark section that only quotes numbers when the vendor actually published them.