MiniMax M2.1 Highspeed

minimax

The highspeed variant of MiniMax M2.1, delivering the same polyglot code mastery and precision refactoring at significantly faster inference speeds (~100 tokens per second vs. ~60 tokens per second for the standard variant). Ideal for latency-sensitive applications and real-time coding assistance. It has the same 200K shared context window (204,800 tokens) and 131K max output (131,072 tokens) as the standard variant. Superseded by MiniMax M2.5 Highspeed.
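As a rough illustration of what the quoted speeds and limits mean in practice, the sketch below estimates decode time at each throughput and the output budget left after a prompt. The function names are hypothetical, the throughput figures are the approximate ones quoted above, and the budget calculation assumes that "shared" means prompt and output tokens draw from the same 204,800-token window.

```python
# Figures quoted on this page; throughput values are approximate.
CONTEXT_WINDOW = 204_800   # tokens (assumed shared by prompt and output)
MAX_OUTPUT = 131_072       # tokens
HIGHSPEED_TPS = 100.0      # ~tokens per second, highspeed variant
STANDARD_TPS = 60.0        # ~tokens per second, standard variant


def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Rough decode time; ignores network latency and time to first token."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


def output_budget(prompt_tokens: int) -> int:
    """Largest output that fits, assuming prompt and output share the window."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(0, min(MAX_OUTPUT, CONTEXT_WINDOW - prompt_tokens))


if __name__ == "__main__":
    tokens = 3_000
    fast = generation_seconds(tokens, HIGHSPEED_TPS)
    slow = generation_seconds(tokens, STANDARD_TPS)
    print(f"{tokens} tokens: ~{fast:.0f}s highspeed vs ~{slow:.0f}s standard")
    print(f"Output budget after a 100,000-token prompt: {output_budget(100_000)}")
```

Real latency also depends on prompt processing and network time, so these numbers are a lower bound on wall-clock time rather than a prediction.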

Capabilities

Tool Use

Extended Thinking

Example Use Cases

Low-latency code generation

Real-time coding assistance

Latency-sensitive agentic workflows

Technical Specifications

Context Window

204,800 tokens

Max Output

131,072 tokens

Cache Miss Cost

$0.60 per 1M tokens

Non-Reasoning Cost

$2.40 per 1M tokens

Cache Read Cost

$0.03 per 1M tokens

Cache Write Cost

$0.375 per 1M tokens

Web Search Cost

$15 per 1K calls

Code Execution Cost

$0.19 per 1K calls
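The listed rates can be combined into a simple per-request cost estimate. The following is a minimal sketch, not an official billing formula: the function name and category keys are hypothetical and simply mirror the labels above, and how the provider assigns tokens to each category (for example, what counts as "Non-Reasoning") is not stated on this page.

```python
from typing import Mapping, Optional

# USD rates copied from the specifications listed above.
USD_PER_MILLION_TOKENS = {
    "cache_miss": 0.60,
    "non_reasoning": 2.40,
    "cache_read": 0.03,
    "cache_write": 0.375,
}
USD_PER_THOUSAND_CALLS = {
    "web_search": 15.00,
    "code_execution": 0.19,
}


def estimate_cost_usd(
    tokens: Mapping[str, int],
    calls: Optional[Mapping[str, int]] = None,
) -> float:
    """Sum token and tool-call charges; unknown categories raise KeyError."""
    total = 0.0
    for category, count in tokens.items():
        total += count / 1_000_000 * USD_PER_MILLION_TOKENS[category]
    for tool, count in (calls or {}).items():
        total += count / 1_000 * USD_PER_THOUSAND_CALLS[tool]
    return total


if __name__ == "__main__":
    cost = estimate_cost_usd(
        tokens={"cache_miss": 1_000_000, "non_reasoning": 500_000},
        calls={"web_search": 10},
    )
    print(f"Estimated cost: ${cost:.2f}")
```

Keeping the rates in plain dictionaries makes it easy to swap in updated prices, and the deliberate `KeyError` on unknown categories avoids silently ignoring a mistyped label.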

⚠️ Legacy

Made legacy on

Reason

Superseded by MiniMax M2.5 Highspeed

Recommended Replacement

MiniMax M2.5 Highspeed