MiniMax M2.5 Highspeed

minimax

The high-speed variant of MiniMax M2.5. It delivers the same SOTA coding performance as standard M2.5 at faster inference speeds: about 100 tokens per second (tps) versus about 60 tps, roughly 1.7x the throughput. Quality for full-stack development across all platforms is unchanged, and it shares the same 200K context window and 131K max output as standard M2.5. Ideal for real-time coding assistance and latency-sensitive production deployments.
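
To make the throughput difference concrete, here is a minimal sketch that converts the approximate speeds quoted above into generation time for a given response length. The function name and the 3,000-token example are illustrative assumptions, and the estimate ignores network latency and time to first token, so treat it as a rough lower bound rather than a measured figure.

```python
# Approximate throughput figures taken from the description above.
HIGHSPEED_TPS = 100.0  # ~100 tokens per second (Highspeed)
STANDARD_TPS = 60.0    # ~60 tokens per second (standard M2.5)


def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Estimate decode time in seconds for a response of the given length.

    Hypothetical helper: ignores network latency, queueing, and
    time to first token, so real-world latency will be higher.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    tokens = 3000  # illustrative response length
    fast = generation_seconds(tokens, HIGHSPEED_TPS)
    slow = generation_seconds(tokens, STANDARD_TPS)
    print(f"Highspeed: {fast:.1f} s, standard: {slow:.1f} s, saved: {slow - fast:.1f} s")
```

The time saved scales linearly with output length, so the gap matters most for long generations such as multi-file code edits.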

Capabilities

Tool Use

Extended Thinking

Example Use Cases

Low-latency SOTA code generation

Real-time full-stack development

Latency-sensitive agentic workflows

Technical Specifications

Context Window: 204,800 tokens

Max Output: 131,072 tokens

Cache Miss Cost: $0.60 per 1M tokens

Non-Reasoning Cost: $2.40 per 1M tokens

Cache Read Cost: $0.03 per 1M tokens

Cache Write Cost: $0.375 per 1M tokens

Web Search Cost: $15 per 1K calls

Code Execution Cost: $0.19 per 1K calls
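
The rates listed above can be combined into a per-request cost estimate. The sketch below is illustrative only: it assumes that "Cache Miss Cost" applies to uncached input tokens and that "Non-Reasoning Cost" applies to output tokens, which the listing does not state explicitly. The `Rates` class and `estimate_cost` function are hypothetical names, not part of any provider SDK.

```python
from dataclasses import dataclass

TOKENS_PER_UNIT = 1_000_000  # token prices are quoted per 1M tokens
CALLS_PER_UNIT = 1_000       # tool prices are quoted per 1K calls


@dataclass(frozen=True)
class Rates:
    """Prices in USD, copied from the specifications above."""
    cache_miss_per_m: float = 0.60      # ASSUMPTION: uncached input tokens
    non_reasoning_per_m: float = 2.40   # ASSUMPTION: output tokens
    cache_read_per_m: float = 0.03
    cache_write_per_m: float = 0.375
    web_search_per_k: float = 15.0
    code_execution_per_k: float = 0.19


def estimate_cost(
    uncached_input_tokens: int = 0,
    output_tokens: int = 0,
    cache_read_tokens: int = 0,
    cache_write_tokens: int = 0,
    web_search_calls: int = 0,
    code_execution_calls: int = 0,
    rates: Rates = Rates(),
) -> float:
    """Return an estimated cost in USD under the assumptions noted above."""
    token_cost = (
        uncached_input_tokens * rates.cache_miss_per_m
        + output_tokens * rates.non_reasoning_per_m
        + cache_read_tokens * rates.cache_read_per_m
        + cache_write_tokens * rates.cache_write_per_m
    ) / TOKENS_PER_UNIT
    call_cost = (
        web_search_calls * rates.web_search_per_k
        + code_execution_calls * rates.code_execution_per_k
    ) / CALLS_PER_UNIT
    return token_cost + call_cost


if __name__ == "__main__":
    cost = estimate_cost(
        uncached_input_tokens=1_000_000,
        output_tokens=200_000,
        cache_read_tokens=500_000,
    )
    print(f"Estimated cost: ${cost:.3f}")
```

Under these assumptions, cached reads cost one twentieth of uncached input per token ($0.03 versus $0.60 per 1M), so workloads that reuse a long prompt prefix are dominated by output cost.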