Qwen3 VL 32B Instruct

Alibaba

The largest dense model in the Qwen3-VL series. This non-reasoning (Instruct) version delivers overall performance second only to Qwen3-VL-235B-Instruct. It excels at document recognition and comprehension, shows strong spatial awareness and object identification, and achieves state-of-the-art performance in 2D visual detection and spatial reasoning. It is well suited to complex perception tasks across a wide range of general-purpose scenarios.

Capabilities

Tool Use

Image Input

Technical Specifications

Context Window

131,072 tokens

Max Output

32,768 tokens
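
The two limits above can be combined into a quick budget check before sending a request. The sketch below is illustrative only: it assumes that prompt tokens and generated tokens share the same 131,072-token window, which this listing does not state, and the function name `output_budget` is hypothetical rather than part of any API.

```python
# Limits copied from the specifications above.
CONTEXT_WINDOW = 131_072  # tokens
MAX_OUTPUT = 32_768       # tokens


def output_budget(prompt_tokens: int) -> int:
    """Return the largest output length that still fits.

    Assumption (not confirmed by the listing): the prompt and the
    generated output count against the same context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # Never exceed the per-response cap, and never go below zero.
    return max(0, min(MAX_OUTPUT, remaining))
```

Under that assumption, a 120,000-token prompt leaves room for 11,072 output tokens, while any prompt of 98,304 tokens or fewer can use the full 32,768-token output limit.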

Pricing

Token Costs (per 1M tokens)

Cache Miss Input

$0.16

Non-Reasoning Output

$0.64

Tool Costs (per 1K calls)

Web Search

$15

Code Execution

$0.19
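
As a rough illustration of how these rates add up, the sketch below estimates the cost of a request from its token counts and tool calls. The prices are the ones listed above; the function and constant names are hypothetical, and the calculation assumes every input token is billed at the cache-miss rate, since no cache-hit price is listed here.

```python
# Prices copied from the listing above, in USD.
INPUT_PER_M = 0.16       # cache-miss input, per 1M tokens
OUTPUT_PER_M = 0.64      # non-reasoning output, per 1M tokens
WEB_SEARCH_PER_K = 15.0  # web search, per 1K calls
CODE_EXEC_PER_K = 0.19   # code execution, per 1K calls


def estimate_cost(input_tokens: int, output_tokens: int,
                  web_searches: int = 0, code_executions: int = 0) -> float:
    """Estimate the cost in USD, assuming all input is a cache miss."""
    token_cost = (input_tokens / 1_000_000 * INPUT_PER_M
                  + output_tokens / 1_000_000 * OUTPUT_PER_M)
    tool_cost = (web_searches / 1_000 * WEB_SEARCH_PER_K
                 + code_executions / 1_000 * CODE_EXEC_PER_K)
    return token_cost + tool_cost


# Example: 50,000 input tokens, 2,000 output tokens, one web search.
cost = estimate_cost(50_000, 2_000, web_searches=1)
```

In this example the single web search ($0.015) costs more than the tokens ($0.00928), so tool usage can dominate the bill for short requests.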

Legacy

Made legacy on

Reason

32B VL; superseded by Qwen3 VL 235B

Recommended Replacement

Qwen3 Max