MiMo-V2.5

Xiaomi

MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference cost, while surpassing MiMo-V2-Omni in multimodal perception across image and video understanding.

Capabilities

Thinking

Tool Use

Image Input

Technical Specifications

Context Window

1,048,576 tokens

Max Output

131,072 tokens
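
As a rough illustration of how these two limits might be applied before sending a request, the sketch below checks a request against both. The helper name `check_request` is hypothetical, and the sketch conservatively assumes that prompt and output tokens share the context window, which this page does not state.

```python
# Limits copied from the Technical Specifications above.
CONTEXT_WINDOW_TOKENS = 1_048_576  # 2**20
MAX_OUTPUT_TOKENS = 131_072        # 2**17


def check_request(prompt_tokens: int, requested_output_tokens: int) -> list[str]:
    """Hypothetical helper: return a list of limit violations (empty if none).

    Assumption (not confirmed by the page): prompt and output tokens
    both count against the context window.
    """
    problems = []
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        problems.append("requested output exceeds max output")
    if prompt_tokens + requested_output_tokens > CONTEXT_WINDOW_TOKENS:
        problems.append("prompt plus output exceeds context window")
    return problems


if __name__ == "__main__":
    # A 1,000,000-token prompt leaves room for at most 48,576 output tokens
    # under the shared-window assumption.
    print(check_request(1_000_000, 40_000))
    print(check_request(1_000_000, 131_072))
```

Token counts would come from whatever tokenizer the serving API uses, so treat any locally computed count as an estimate.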

Pricing

Token Costs (per 1M tokens)

Cache Miss Input

$0.40

Non-Reasoning Output

$2

Cache Read Input

$0.08
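
To make the per-1M-token rates concrete, here is a minimal sketch of the arithmetic for a single request. The helper name `estimate_token_cost` and the example token counts are illustrative assumptions. Only the three rates listed above are covered; a rate for reasoning output is not given on this page, so it is left out.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the Token Costs list above.
PRICE_PER_MILLION = {
    "cache_miss_input": Decimal("0.40"),
    "cache_read_input": Decimal("0.08"),
    "non_reasoning_output": Decimal("2"),
}


def estimate_token_cost(cache_miss_input: int = 0,
                        cache_read_input: int = 0,
                        non_reasoning_output: int = 0) -> Decimal:
    """Hypothetical helper: estimated USD cost for the given token counts."""
    counts = {
        "cache_miss_input": cache_miss_input,
        "cache_read_input": cache_read_input,
        "non_reasoning_output": non_reasoning_output,
    }
    # Multiply each count by its rate, then convert from per-1M to per-token.
    total = sum(PRICE_PER_MILLION[name] * n for name, n in counts.items())
    return total / Decimal(1_000_000)


if __name__ == "__main__":
    # 200k uncached input + 800k cached input + 50k output tokens:
    # 0.08 + 0.064 + 0.10 = 0.244 USD
    print(estimate_token_cost(200_000, 800_000, 50_000))
```

`Decimal` is used instead of `float` so that small per-token amounts add up without rounding drift.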

Tool Costs (per 1K calls)

Web Search

$15

Code Execution

$0.19
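
Tool usage is billed per 1K calls, so the cost of a session scales with the number of calls divided by 1,000. The sketch below works through that arithmetic; the helper name `estimate_tool_cost` and the call counts are illustrative assumptions, and it assumes calls are billed pro rata rather than in blocks of 1,000.

```python
from decimal import Decimal

# USD per 1K calls, copied from the Tool Costs list above.
PRICE_PER_THOUSAND_CALLS = {
    "web_search": Decimal("15"),
    "code_execution": Decimal("0.19"),
}


def estimate_tool_cost(web_search_calls: int = 0,
                       code_execution_calls: int = 0) -> Decimal:
    """Hypothetical helper: estimated USD cost, assuming pro rata billing."""
    total = (PRICE_PER_THOUSAND_CALLS["web_search"] * web_search_calls
             + PRICE_PER_THOUSAND_CALLS["code_execution"] * code_execution_calls)
    return total / Decimal(1000)


if __name__ == "__main__":
    # 40 web searches + 250 code executions:
    # 40 * 15 / 1000 + 250 * 0.19 / 1000 = 0.60 + 0.0475 = 0.6475 USD
    print(estimate_tool_cost(web_search_calls=40, code_execution_calls=250))
```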