The Qwen3.6 Plus series of native vision-language models performs on par with current state-of-the-art models and improves significantly on the 3.5 series in overall results. Code-related capabilities, including agentic coding, front-end programming, and vibe coding, are markedly stronger, as are general multimodal object recognition, OCR, and object localization.
Multimodal vision-language tasks with Alibaba
Agentic coding and front-end programming
Long-context multimodal understanding
1,000,000 tokens
65,536 tokens
$15
$0.19