MiniMax M2.7 Highspeed

MiniMax

The highspeed variant of MiniMax M2.7. It delivers the same agentic and coding performance as the standard variant at faster inference speeds (~100 tokens per second vs. ~60 tokens per second for standard). Unlike previous generations, M2.7 Highspeed matches the standard variant on pricing while delivering lower latency. It has the same 200K context window and 131K max output as the standard variant.
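
As a rough illustration of what the quoted throughput figures mean in practice, the sketch below estimates pure generation time for a given number of output tokens. The function name is hypothetical, the rates are the approximate figures quoted above, and the estimate ignores time to first token, network overhead, and load-dependent variation.

```python
# Approximate decode rates quoted on this page (tokens per second).
HIGHSPEED_TPS = 100.0
STANDARD_TPS = 60.0


def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Estimate generation time as output_tokens / tokens_per_second.

    Hypothetical helper: covers steady-state decoding only, not
    time to first token or any queueing delay.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    tokens = 6_000
    fast = generation_seconds(tokens, HIGHSPEED_TPS)
    slow = generation_seconds(tokens, STANDARD_TPS)
    print(f"{tokens} tokens: ~{fast:.0f}s highspeed vs ~{slow:.0f}s standard")
```

Under these assumptions, a response takes about 60% as long to generate on the highspeed variant, since 60 / 100 = 0.6.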

Capabilities

Thinking

Tool Use

Example Use Cases

Low-latency agentic workflows

Real-time coding assistance

Latency-sensitive production deployments

Technical Specifications

Context Window: 204,800 tokens

Max Output: 131,072 tokens
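
The two limits above can be combined to estimate how many output tokens a request can still ask for. This is a minimal sketch that assumes prompt and output tokens share the single 204,800-token window, which this page does not state explicitly; the function name is hypothetical.

```python
# Limits listed under Technical Specifications.
CONTEXT_WINDOW = 204_800
MAX_OUTPUT = 131_072


def max_output_budget(prompt_tokens: int) -> int:
    """Return the largest output length that fits alongside the prompt.

    Assumption: prompt and output tokens count against the same
    context window. The result is also capped at MAX_OUTPUT.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    return max(0, min(MAX_OUTPUT, remaining))


if __name__ == "__main__":
    for prompt in (10_000, 100_000, 204_800):
        print(prompt, "->", max_output_budget(prompt))
```

If that assumption holds, the full 131,072-token output is only available when the prompt is at most 73,728 tokens (204,800 minus 131,072).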

Pricing

Token Costs (per 1M tokens)

Cache Miss Input: $0.30

Non-Reasoning Output: $1.20

Cache Read Input: $0.06

Cache Write Input: $0.375

Tool Costs (per 1K calls)

Web Search: $15

Code Execution: $0.19
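
To show how the listed rates combine, the sketch below estimates the cost of a workload from token and tool-call counts. The function and parameter names are hypothetical, and it assumes each category is billed linearly at the listed rate with no minimums or rounding. The page lists only a non-reasoning output rate, so the sketch makes no claim about how reasoning tokens are billed.

```python
from decimal import Decimal

# Listed rates in USD per 1M tokens.
TOKEN_RATES = {
    "cache_miss_input": Decimal("0.30"),
    "non_reasoning_output": Decimal("1.20"),
    "cache_read_input": Decimal("0.06"),
    "cache_write_input": Decimal("0.375"),
}

# Listed rates in USD per 1K tool calls.
TOOL_RATES = {
    "web_search": Decimal("15"),
    "code_execution": Decimal("0.19"),
}


def estimate_cost(tokens: dict[str, int], tool_calls: dict[str, int]) -> Decimal:
    """Sum linear costs across token categories and tool calls.

    Assumption: billing is strictly proportional to usage.
    Unknown category names raise KeyError rather than being ignored.
    """
    total = Decimal("0")
    for name, count in tokens.items():
        total += TOKEN_RATES[name] * count / Decimal(1_000_000)
    for name, count in tool_calls.items():
        total += TOOL_RATES[name] * count / Decimal(1_000)
    return total


if __name__ == "__main__":
    cost = estimate_cost(
        tokens={"cache_miss_input": 2_000_000, "non_reasoning_output": 500_000},
        tool_calls={"web_search": 10},
    )
    print(f"Estimated cost: ${cost}")
```

Decimal is used instead of float so that small per-call amounts add up without binary rounding error.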