Qwen3 VL 30B A3B Thinking

Alibaba

The "Thinking" edition of the second-largest MoE model in the Qwen3-VL series offers fast responses, enhanced multimodal understanding and reasoning, visual agent capabilities, and ultra-long context support for inputs such as long videos and documents. It improves image and video comprehension, spatial perception, and object recognition so that it can handle complex real-world tasks.

Capabilities

Tool Use

Image Input

Extended Thinking

Example Use Cases

Visual reasoning on a budget

Multimodal agent task with thinking

Efficient image analysis with reasoning

Technical Specifications

Context Window

131,072 tokens

Max Output

32,768 tokens

Cache Miss Cost

$0.20 per 1M tokens

Non-Reasoning Cost

$2.40 per 1M tokens

Web Search Cost

$15 per 1K calls

Code Execution Cost

$0.19 per 1K calls
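
The figures above can be combined into a rough budgeting sketch. The helper names (`max_prompt_tokens`, `estimate_cost`) and the usage counts passed to them are hypothetical. The page does not say how each rate is metered or whether output tokens count against the context window, so both are treated here as assumptions: each rate is applied to a caller-supplied count, and prompt and output are assumed to share one window.

```python
# Figures copied from the specification list above.
CONTEXT_WINDOW = 131_072      # tokens
MAX_OUTPUT = 32_768           # tokens
CACHE_MISS_PER_M = 0.20       # USD per 1M tokens ("Cache Miss Cost")
NON_REASONING_PER_M = 2.40    # USD per 1M tokens ("Non-Reasoning Cost")
WEB_SEARCH_PER_K = 15.00      # USD per 1K calls
CODE_EXEC_PER_K = 0.19        # USD per 1K calls


def max_prompt_tokens(reserved_output: int = MAX_OUTPUT) -> int:
    """Largest prompt that still leaves `reserved_output` tokens free.

    Assumption: prompt and output tokens share one context window.
    """
    if not 0 <= reserved_output <= MAX_OUTPUT:
        raise ValueError("reserved_output must be between 0 and MAX_OUTPUT")
    return CONTEXT_WINDOW - reserved_output


def estimate_cost(
    cache_miss_tokens: int = 0,
    non_reasoning_tokens: int = 0,
    web_search_calls: int = 0,
    code_exec_calls: int = 0,
) -> float:
    """Sum the four listed rates over hypothetical usage counts, in USD.

    Assumption: each rate applies linearly to the count supplied by the
    caller; actual metering rules are not stated on the page.
    """
    return (
        cache_miss_tokens / 1_000_000 * CACHE_MISS_PER_M
        + non_reasoning_tokens / 1_000_000 * NON_REASONING_PER_M
        + web_search_calls / 1_000 * WEB_SEARCH_PER_K
        + code_exec_calls / 1_000 * CODE_EXEC_PER_K
    )
```

Under these assumptions, reserving the full 32,768-token output leaves 98,304 tokens for the prompt, and one million tokens at the first rate plus half a million at the second comes to about $1.40.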

⚠️ Legacy

Made legacy on

Reason

Untested

Recommended Replacement

Qwen3 Max