A Mixture-of-Experts model with 230B total parameters, of which only 10B are activated per inference, keeping compute cost low relative to its size. Built for agentic use, it supports function calling, advanced reasoning, and real-time streaming. With a 200K shared context window and 128K max output (including chain-of-thought), it handles large contexts for coding and agentic work. Superseded by MiniMax M2.1, which improves coding and refactoring capabilities.
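The interaction between the context window and the output cap can be sketched with a little arithmetic. The snippet below is a minimal illustration, not part of any official SDK: it assumes "shared" means that prompt tokens and output tokens draw from the same 204,800-token window, and the function name `max_output_budget` is hypothetical.

```python
# Limits as listed on this page; the "shared window" interpretation is an assumption.
CONTEXT_WINDOW = 204_800   # total tokens available to prompt plus output
MAX_OUTPUT = 128_000       # output cap, including chain-of-thought tokens


def max_output_budget(prompt_tokens: int) -> int:
    """Return the largest output budget that still fits in the shared window."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # The budget is capped by the output limit and can never go below zero.
    return max(0, min(MAX_OUTPUT, remaining))


print(max_output_budget(50_000))    # 128000 (output cap is the binding limit)
print(max_output_budget(150_000))   # 54800  (window is the binding limit)
print(max_output_budget(210_000))   # 0      (prompt alone exceeds the window)
```

Under this reading, the full 128K output is only available while the prompt stays at or below 76,800 tokens; beyond that, every extra prompt token reduces the room left for output.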
Agentic workflows with function calling
Coding at scale with large context
Need an efficient model with advanced reasoning
Context window: 204,800 tokens
Max output: 128,000 tokens
$0.30
$1.20
$0.03
$0.375
$15
$0.19