The highspeed variant of MiniMax M2.5 delivers the same SOTA coding performance at significantly faster inference speeds (~100 tokens per second vs. ~60 tps for the standard model). It offers the same quality as M2.5 for full-stack development across all platforms, with substantially lower latency. It has the same 200K context window and 131K max output as M2.5. Ideal for real-time coding assistance and latency-sensitive production deployments.
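To put the quoted throughput figures in concrete terms, here is a minimal sketch that estimates how long generating a response takes at each rate. It assumes the approximate ~100 tps and ~60 tps figures quoted above hold steady, and it counts decode time only (no network latency or time to first token). The function and constant names are illustrative and not part of any MiniMax API.

```python
# Rough decode-time comparison based on the approximate rates quoted above.
# Assumption: throughput is constant; real speeds vary with load and prompt size.

HIGHSPEED_TPS = 100.0  # approximate, highspeed variant
STANDARD_TPS = 60.0    # approximate, standard M2.5


def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Estimate seconds to generate output_tokens at a fixed decode rate."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


def compare(output_tokens: int) -> dict:
    """Return estimated decode times for both variants and the time saved."""
    fast = generation_seconds(output_tokens, HIGHSPEED_TPS)
    slow = generation_seconds(output_tokens, STANDARD_TPS)
    return {"highspeed_s": fast, "standard_s": slow, "saved_s": slow - fast}


if __name__ == "__main__":
    for tokens in (500, 3000, 20000):
        result = compare(tokens)
        print(
            f"{tokens:>6} tokens: "
            f"highspeed ~{result['highspeed_s']:.0f}s, "
            f"standard ~{result['standard_s']:.0f}s, "
            f"saved ~{result['saved_s']:.0f}s"
        )
```

Under these assumptions, a 3,000-token response takes roughly 30 seconds on the highspeed variant versus roughly 50 seconds on the standard one, and the gap widens linearly with output length.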
Need low-latency SOTA code generation
Real-time full-stack development
Latency-sensitive agentic workflows
Context window: 204,800 tokens
Max output: 131,072 tokens
$0.60 per 1M tokens
$2.40 per 1M tokens
$0.03 per 1M tokens
$0.375 per 1M tokens
$15 per 1K calls
$0.19 per 1K calls