The "Thinking" edition of Qwen3-VL's second-largest MoE model offers fast responses, enhanced multimodal understanding and reasoning, visual agent capabilities, and ultra-long context support (e.g., for long videos and documents). It improves image and video comprehension, spatial perception, and object recognition so it can handle complex real-world tasks.
131,072 tokens
32,768 tokens
$0.20
$2.40
$15
$0.19
30B VL Thinking; superseded by Qwen3 VL 235B Thinking