Claude Opus 4.6 Fast

Anthropic

The fast-inference variant of Claude Opus 4.6, delivering the same model quality with up to 2.5x higher output throughput (tokens per second) for latency-sensitive agentic workflows. It uses the same weights as Opus 4.6 and has the same capabilities, the same 1M-token context window, and the same 128K-token maximum output; only the inference configuration differs. The speed gain applies to output tokens per second, not to time to first token. Pricing is 6x the standard Opus 4.6 rates, and prompt cache and data residency multipliers stack on top of that. Not available through the Batch API or Priority Tier. Superseded by Opus 4.7.
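Because the speedup applies to output throughput rather than time to first token, the end-to-end gain depends on response length. The sketch below illustrates this under a simple assumed latency model (total time = time to first token + output tokens / output rate). The function names and the baseline figures (1.0 s to first token, 50 tokens per second) are hypothetical placeholders, not measured values, and 2.5x is the stated upper bound rather than a guaranteed rate.

```python
def response_time(output_tokens: int, ttft_s: float, tokens_per_s: float) -> float:
    """Estimated wall-clock seconds: time to first token plus generation time."""
    return ttft_s + output_tokens / tokens_per_s


def end_to_end_speedup(
    output_tokens: int,
    ttft_s: float,
    base_tokens_per_s: float,
    output_speedup: float = 2.5,  # upper bound quoted above; real gains may be lower
) -> float:
    """Ratio of standard to fast response time when only the output rate improves."""
    standard = response_time(output_tokens, ttft_s, base_tokens_per_s)
    fast = response_time(output_tokens, ttft_s, base_tokens_per_s * output_speedup)
    return standard / fast


if __name__ == "__main__":
    # Hypothetical baseline: 1.0 s to first token, 50 output tokens per second.
    for n in (100, 10_000):
        print(n, f"{end_to_end_speedup(n, ttft_s=1.0, base_tokens_per_s=50.0):.2f}")
    # prints "100 1.67" then "10000 2.48"
```

Under these assumptions a short reply gains far less than 2.5x because the unchanged time to first token dominates, while long generations approach the full output-rate speedup.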

Capabilities

Thinking

Tool Use

Image Input

PDF Input

Technical Specifications

Context Window

1,000,000 tokens

Max Output

128,000 tokens

Pricing

Token Costs (per 1M tokens)

Cache Miss Input

$30

Non-Reasoning Output

$150

Cache Read Input

$3

Cache Write Input

$37.50
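As a worked example of the token rates listed above, the following sketch estimates the cost of a single request from its token counts. The function and dictionary names are hypothetical, the "output" entry uses the listed Non-Reasoning Output rate, and data residency multipliers and tool charges are not modeled.

```python
# Listed rates in USD per 1M tokens, copied from the table above.
RATES_PER_MTOK = {
    "cache_miss_input": 30.00,
    "cache_read_input": 3.00,
    "cache_write_input": 37.50,
    "output": 150.00,  # listed as "Non-Reasoning Output"
}


def request_cost(
    cache_miss_input: int = 0,
    cache_read_input: int = 0,
    cache_write_input: int = 0,
    output: int = 0,
) -> float:
    """Estimated USD cost of one request, given token counts per category."""
    tokens = {
        "cache_miss_input": cache_miss_input,
        "cache_read_input": cache_read_input,
        "cache_write_input": cache_write_input,
        "output": output,
    }
    return sum(RATES_PER_MTOK[kind] * n for kind, n in tokens.items()) / 1_000_000


if __name__ == "__main__":
    # 200K uncached input, 800K cache-read input, 20K output tokens.
    cost = request_cost(cache_miss_input=200_000, cache_read_input=800_000, output=20_000)
    print(f"${cost:.2f}")  # prints "$11.40"
```

In this example the 800K cached tokens cost $2.40, where the same tokens at the cache-miss rate would cost $24.00, so cache hits matter a great deal at these rates. The listed cache read and cache write rates work out to 0.1x and 1.25x of the cache-miss input rate.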

Tool Costs (per 1K calls)

Web Search

$10

Code Execution

$0.0694

Legacy

Made legacy on

Reason

Superseded by Opus 4.7, which offers stronger agentic coding, adaptive-only thinking, and high-resolution vision.

Recommended Replacement

Claude Opus 4.7