Pixtral 12B

Mistral

A compact 12B-parameter multimodal model that handles image understanding alongside text.

Capabilities

Tool Use

Image Input

PDF Input

Example Use Cases

Image understanding at low cost

Multimodal tasks under budget constraints

Lightweight vision and text tasks

Technical Specifications

Context Window

128,000 tokens

Max Output

128,000 tokens

Pricing

Token Costs (per 1M tokens)

Cache Miss Input

$0.15

Non-Reasoning Output

$0.15
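
As a rough illustration of how the per-1M-token rates translate into a per-request cost, here is a minimal Python sketch. The helper name `estimate_token_cost` is hypothetical, and the sketch assumes billing is strictly linear in token count with no rounding, minimum charge, or cached-input discount; how image or PDF inputs are counted as tokens is not stated here and is not modeled.

```python
from decimal import Decimal

# Rates copied from the listing above, in USD per 1M tokens.
INPUT_RATE_PER_MILLION = Decimal("0.15")   # cache miss input
OUTPUT_RATE_PER_MILLION = Decimal("0.15")  # non-reasoning output
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_token_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: estimated USD cost of one request.

    Assumes purely linear billing at the listed rates.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_RATE_PER_MILLION / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_RATE_PER_MILLION / TOKENS_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a 100,000-token prompt with a 2,000-token reply.
    print(f"${estimate_token_cost(100_000, 2_000)}")
```

Decimal arithmetic is used instead of floats so that small per-request amounts add up without binary rounding drift when summed over many calls.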

Tool Costs (per 1K calls)

Web Search

$15

Code Execution

$0.19
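
Tool charges are quoted per 1K calls, so the per-call cost is the listed figure divided by 1,000. The sketch below shows that arithmetic; `estimate_tool_cost` is a hypothetical helper, and it assumes each call is billed pro rata with no bundling or minimum charge, which the listing does not confirm.

```python
from decimal import Decimal

# Rates copied from the listing above, in USD per 1K calls.
WEB_SEARCH_PER_THOUSAND = Decimal("15")
CODE_EXECUTION_PER_THOUSAND = Decimal("0.19")
CALLS_PER_THOUSAND = Decimal(1000)


def estimate_tool_cost(web_searches: int, code_executions: int) -> Decimal:
    """Hypothetical helper: estimated USD cost of tool calls.

    Assumes each call is billed pro rata at the listed per-1K rate.
    """
    if web_searches < 0 or code_executions < 0:
        raise ValueError("call counts must be non-negative")
    total_per_thousand = (
        Decimal(web_searches) * WEB_SEARCH_PER_THOUSAND
        + Decimal(code_executions) * CODE_EXECUTION_PER_THOUSAND
    )
    return total_per_thousand / CALLS_PER_THOUSAND


if __name__ == "__main__":
    # Example: 10 web searches and 100 code executions.
    print(f"${estimate_tool_cost(10, 100)}")
```

At these rates a single web search costs $0.015, the same as 100,000 input tokens, so tool calls can dominate the bill for short prompts.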

Legacy

Made legacy on

Reason

Untested

Recommended Replacement

Qwen3.5 Plus