MiniMax M2.5 Highspeed

MiniMax

The highspeed variant of MiniMax M2.5. It delivers the same state-of-the-art coding performance as M2.5 at faster inference speeds (about 100 tokens per second, versus about 60 tokens per second for the standard model), so full-stack development across all platforms keeps the same quality with lower latency. It has the same 200K-token shared context window and 131K-token maximum output as M2.5. Ideal for real-time coding assistance and latency-sensitive production deployments.
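
To put the throughput figures in concrete terms, the sketch below estimates how long a response of a given length would take to generate at each speed. It assumes the quoted rates are steady decode speeds and ignores network latency and time to first token; the function name and the 3,000-token response length are hypothetical, chosen only for illustration.

```python
# Approximate throughput figures quoted in the description (tokens per second).
HIGHSPEED_TPS = 100.0
STANDARD_TPS = 60.0


def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Estimate decode time only; network and time-to-first-token are ignored."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    response_tokens = 3000  # hypothetical response length
    print("highspeed:", generation_seconds(response_tokens, HIGHSPEED_TPS), "s")
    print("standard: ", generation_seconds(response_tokens, STANDARD_TPS), "s")
```

Under these assumptions a 3,000-token response takes roughly 30 seconds on the highspeed variant and roughly 50 seconds on the standard model.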

Capabilities

Thinking

Tool Use

Technical Specifications

Context Window

204,800 tokens

Max Output

131,072 tokens
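
Because the context window is described as shared, a request that reserves a large output budget leaves less room for the prompt. The sketch below works out that budget. It assumes that "shared" means prompt and output tokens count against the same 204,800-token window; the function name is hypothetical.

```python
# Limits taken from the specifications above.
CONTEXT_WINDOW = 204_800
MAX_OUTPUT = 131_072


def max_prompt_tokens(reserved_output: int) -> int:
    """Return the prompt budget left after reserving output tokens.

    Assumes prompt and output tokens share one context window.
    """
    if not 0 <= reserved_output <= MAX_OUTPUT:
        raise ValueError(f"reserved_output must be between 0 and {MAX_OUTPUT}")
    return CONTEXT_WINDOW - reserved_output


if __name__ == "__main__":
    # Reserving the full output limit leaves 73,728 tokens for the prompt.
    print(max_prompt_tokens(MAX_OUTPUT))
    # Reserving a modest 8,192 tokens leaves 196,608 for the prompt.
    print(max_prompt_tokens(8_192))
```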

Pricing

Token Costs (per 1M tokens)

Cache Miss Input

$0.60

Non-Reasoning Output

$2.40

Cache Read Input

$0.03

Cache Write Input

$0.375
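
As a rough illustration of how these rates combine, the sketch below estimates the token cost of a single request. It assumes each token is billed in exactly one category and that the categories simply add up; the actual billing rules may differ. The function name, dictionary keys, and example token counts are hypothetical.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the list above.
RATES_PER_MILLION = {
    "cache_miss_input": Decimal("0.60"),
    "cache_read_input": Decimal("0.03"),
    "cache_write_input": Decimal("0.375"),
    "output": Decimal("2.40"),  # listed as "Non-Reasoning Output"
}


def token_cost_usd(**token_counts: int) -> Decimal:
    """Sum the cost of each token category at its per-million rate."""
    total = Decimal(0)
    for category, count in token_counts.items():
        if category not in RATES_PER_MILLION:
            raise KeyError(f"unknown token category: {category}")
        if count < 0:
            raise ValueError("token counts must be non-negative")
        total += RATES_PER_MILLION[category] * count / Decimal(1_000_000)
    return total


if __name__ == "__main__":
    # Hypothetical request: 50,000 uncached input tokens and 2,000 output tokens.
    print(token_cost_usd(cache_miss_input=50_000, output=2_000))
```

Decimal arithmetic is used here so that small per-request amounts do not pick up floating-point rounding noise.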

Tool Costs (per 1K calls)

Web Search

$15

Code Execution

$0.19
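
Tool rates are quoted per 1,000 calls, so the per-call cost is the listed rate divided by 1,000. The sketch below applies that conversion. It assumes calls are billed linearly with no minimums or tiers; the function name and the example call counts are hypothetical.

```python
from decimal import Decimal

# USD per 1,000 calls, copied from the list above.
TOOL_RATES_PER_1K_CALLS = {
    "web_search": Decimal("15"),
    "code_execution": Decimal("0.19"),
}


def tool_cost_usd(web_search_calls: int = 0, code_execution_calls: int = 0) -> Decimal:
    """Estimate tool charges, assuming simple linear per-call billing."""
    if web_search_calls < 0 or code_execution_calls < 0:
        raise ValueError("call counts must be non-negative")
    calls = {
        "web_search": web_search_calls,
        "code_execution": code_execution_calls,
    }
    return sum(
        (TOOL_RATES_PER_1K_CALLS[name] * count / Decimal(1000)
         for name, count in calls.items()),
        Decimal(0),
    )


if __name__ == "__main__":
    # Hypothetical workload: 200 web searches and 500 code executions.
    print(tool_cost_usd(web_search_calls=200, code_execution_calls=500))
```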

Legacy

Made legacy on

Reason

Outdated model

Recommended Replacement

MiniMax M2.7 Highspeed