Voxtral Small

Mistral

A small audio understanding model released in July 2025

Capabilities

Tool Use

PDF Input

Example Use Cases

Audio understanding with tool use

Speech-to-text with instructions

Audio analysis and processing

Technical Specifications

Context Window: 32,768 tokens

Max Output: 32,768 tokens

Cache Miss Cost: $0.10 per 1M tokens

Non-Reasoning Cost: $0.30 per 1M tokens

Web Search Cost: $15 per 1K calls

Code Execution Cost: $0.19 per 1K calls
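
The rates above can be combined into a rough per-request estimate. The following is a minimal sketch, not an official pricing calculator. It assumes that "Cache Miss Cost" is the rate for uncached input tokens, that "Non-Reasoning Cost" is the rate for output tokens, and that the context window covers input and output together; none of these is confirmed by the listing. The function name `estimate_cost_usd` and its parameters are hypothetical.

```python
# Rates copied from the specifications above.
# How each rate maps onto a token type is an ASSUMPTION, not stated in the listing.
CACHE_MISS_USD_PER_1M = 0.10      # assumed: uncached input tokens
NON_REASONING_USD_PER_1M = 0.30   # assumed: output tokens
WEB_SEARCH_USD_PER_1K = 15.00     # per 1,000 web search calls
CODE_EXEC_USD_PER_1K = 0.19       # per 1,000 code execution calls

CONTEXT_WINDOW_TOKENS = 32_768
MAX_OUTPUT_TOKENS = 32_768


def estimate_cost_usd(input_tokens, output_tokens,
                      web_search_calls=0, code_exec_calls=0):
    """Return an approximate request cost in USD (hypothetical helper)."""
    counts = (input_tokens, output_tokens, web_search_calls, code_exec_calls)
    if any(n < 0 for n in counts):
        raise ValueError("counts must be non-negative")
    if output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError("output exceeds the listed max output")
    # ASSUMPTION: input and output share the same context window.
    if input_tokens + output_tokens > CONTEXT_WINDOW_TOKENS:
        raise ValueError("request exceeds the listed context window")

    token_cost = (input_tokens * CACHE_MISS_USD_PER_1M
                  + output_tokens * NON_REASONING_USD_PER_1M) / 1_000_000
    call_cost = (web_search_calls * WEB_SEARCH_USD_PER_1K
                 + code_exec_calls * CODE_EXEC_USD_PER_1K) / 1_000
    return token_cost + call_cost
```

Under these assumptions, a request with 20,000 input tokens and 2,000 output tokens comes to roughly $0.0026, and each web search call adds $0.015.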

⚠️ Legacy

Made legacy on

Reason: Untested

Recommended Replacement: Qwen3.5 Plus