The fast-inference variant of Claude Opus 4.6, delivering identical model quality with output generation up to 2.5x faster (measured in output tokens per second) for latency-sensitive agentic workflows. It has the same weights, capabilities, 1M context window, and 128K max output as Opus 4.6, but runs on a faster inference configuration. Priced at 6x standard Opus 4.6 rates; prompt cache and data residency multipliers stack on top. The speed gain applies to output tokens per second, not time to first token. Not available with the Batch API or Priority Tier. Superseded by Opus 4.7.
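Because the speedup applies to output tokens per second and not to time to first token, the end-to-end gain depends on how long the response is. The sketch below illustrates this with a simple latency model (time to first token plus generation time). The function names, the baseline throughput of 40 tokens per second, and the 2-second time to first token are hypothetical values chosen for illustration, not measurements; 2.5x is the upper bound quoted above, so real gains may be smaller.

```python
def response_time(output_tokens: int, tokens_per_second: float, ttft_seconds: float) -> float:
    """Simple latency model: time to first token plus time to stream the output."""
    return ttft_seconds + output_tokens / tokens_per_second


def fast_response_time(
    output_tokens: int,
    base_tokens_per_second: float,
    ttft_seconds: float,
    speedup: float = 2.5,  # "up to 2.5x" from the description; treated here as a best case
) -> float:
    """Same model, with only the output rate scaled; time to first token is unchanged."""
    return response_time(output_tokens, base_tokens_per_second * speedup, ttft_seconds)


# Hypothetical baseline: 40 output tokens/s and a 2 s time to first token.
BASE_TPS = 40.0
TTFT = 2.0

long_standard = response_time(4000, BASE_TPS, TTFT)    # 102.0 s
long_fast = fast_response_time(4000, BASE_TPS, TTFT)   # 42.0 s

short_standard = response_time(100, BASE_TPS, TTFT)    # 4.5 s
short_fast = fast_response_time(100, BASE_TPS, TTFT)   # 3.0 s

print(f"4000-token reply: {long_standard:.1f}s -> {long_fast:.1f}s")
print(f" 100-token reply: {short_standard:.1f}s -> {short_fast:.1f}s")
```

Under these assumed numbers the long reply improves by roughly 2.4x end to end, while the short reply improves by only 1.5x, because the fixed time to first token dominates short responses.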
Context window: 1,000,000 tokens
Max output: 128,000 tokens
$30
$150
$3
$37.50
$10
$0.0694
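The pricing rule in the description (6x standard Opus 4.6 rates, with other multipliers stacking on top) can be sketched as a small cost estimator. This is an illustration under stated assumptions, not an official pricing formula: it assumes "stack" means the multipliers are multiplied together, the function and parameter names are hypothetical, and the base rates of $5 and $25 per million tokens are placeholders chosen so that 6x yields $30 and $150, on the assumption that those figures in the listing are the input and output rates. The 1.1 data residency multiplier is likewise a made-up value.

```python
FAST_MULTIPLIER = 6.0  # "6x standard Opus 4.6 rates" from the description


def fast_request_cost(
    input_tokens: int,
    output_tokens: int,
    base_input_per_mtok: float,
    base_output_per_mtok: float,
    extra_multipliers: tuple[float, ...] = (),
) -> float:
    """Estimate a request's cost in dollars.

    Base rates are standard (non-fast) prices per million tokens. Extra
    multipliers (e.g. data residency) are assumed to compound multiplicatively
    with the 6x fast multiplier.
    """
    multiplier = FAST_MULTIPLIER
    for m in extra_multipliers:
        multiplier *= m

    input_cost = input_tokens / 1_000_000 * base_input_per_mtok * multiplier
    output_cost = output_tokens / 1_000_000 * base_output_per_mtok * multiplier
    return input_cost + output_cost


# Placeholder base rates (assumption): $5 input, $25 output per million tokens.
plain = fast_request_cost(200_000, 20_000, 5.0, 25.0)

# Same request with a hypothetical 1.1x data residency multiplier stacked on top.
with_residency = fast_request_cost(200_000, 20_000, 5.0, 25.0, extra_multipliers=(1.1,))

print(f"fast only:        ${plain:.2f}")
print(f"fast + residency: ${with_residency:.2f}")
```

Passing the base rates in as arguments keeps the estimator independent of any particular price sheet, so the placeholder values can be swapped for confirmed rates.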
Superseded by Opus 4.7 with stronger agentic coding, adaptive-only thinking, and high-resolution vision