The highspeed variant of MiniMax M2.1, delivering the same polyglot code mastery and precision refactoring at significantly faster inference speeds (~100 tokens per second vs. ~60 tokens per second for the standard variant). Ideal for latency-sensitive applications and real-time coding assistance. It has the same 200K context window and 131K max output as the standard variant. Superseded by MiniMax M2.5 Highspeed.
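To put the quoted throughput figures in perspective, here is a minimal sketch that estimates streaming time for a response of a given length. It assumes a steady decode rate and uses only the approximate ~100 and ~60 tokens-per-second figures from the description; the function and constant names are hypothetical, and real latency also depends on time to first token, network conditions, and load.

```python
# Rough decode-time estimate from the approximate throughput figures quoted above.
# Assumption: a constant decode rate; time to first token and network overhead are ignored.

def decode_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Return the time in seconds to stream `output_tokens` at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


HIGHSPEED_TPS = 100.0  # approximate figure from the description
STANDARD_TPS = 60.0    # approximate figure from the description

if __name__ == "__main__":
    # 131_072 is the stated maximum output length.
    for n in (500, 2_000, 131_072):
        fast = decode_seconds(n, HIGHSPEED_TPS)
        slow = decode_seconds(n, STANDARD_TPS)
        print(f"{n:>7} tokens: highspeed ~{fast:.1f}s, "
              f"standard ~{slow:.1f}s, saved ~{slow - fast:.1f}s")
```

Under these assumptions, a 2,000-token response streams in about 20 seconds rather than about 33 seconds.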
Need low-latency code generation
Real-time coding assistance
Latency-sensitive agentic workflows
Context window: 204,800 tokens
Max output: 131,072 tokens
$0.60 per 1M tokens
$2.40 per 1M tokens
$0.03 per 1M tokens
$0.375 per 1M tokens
$15 per 1K calls
$0.19 per 1K calls
Superseded by MiniMax M2.5 Highspeed