The Qwen3 series of small visual understanding models integrates thinking and non-thinking modes. Compared with the October 15, 2025 snapshot, this version performs significantly better overall: general visual recognition and reasoning are stronger, and recognition accuracy is markedly higher in business scenarios such as security, in-store inspection, equipment monitoring, and photo-based problem solving. This version is a snapshot as of January 22, 2026.
Fast, cheap image understanding
High-volume visual tasks
Budget visual recognition or inspection
262,144 tokens
32,768 tokens
$0.05 per 1M tokens
$0.40 per 1M tokens
$15 per 1K calls
$0.19 per 1K calls
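As a rough illustration of how the per-token figures above translate into a per-request cost, here is a minimal Python sketch. It rests on several assumptions that the listing itself does not state: that $0.05 per 1M tokens is the input rate and $0.40 per 1M tokens is the output rate, that 262,144 tokens is the total context window shared by input and output, and that 32,768 tokens is the maximum output length. The function name `estimate_cost_usd` and the constants are hypothetical, and the per-call prices are left out because the listing does not say what they apply to.

```python
from decimal import Decimal

# Figures copied from the listing. Their roles are ASSUMPTIONS: the listing
# does not label which rate is input vs. output, or what each limit covers.
ASSUMED_INPUT_USD_PER_M = Decimal("0.05")   # assumed: input price per 1M tokens
ASSUMED_OUTPUT_USD_PER_M = Decimal("0.40")  # assumed: output price per 1M tokens
ASSUMED_CONTEXT_LIMIT = 262_144             # assumed: input + output token budget
ASSUMED_MAX_OUTPUT = 32_768                 # assumed: maximum output tokens


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the token cost of one request under the assumptions above."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if output_tokens > ASSUMED_MAX_OUTPUT:
        raise ValueError("output exceeds the assumed maximum output length")
    if input_tokens + output_tokens > ASSUMED_CONTEXT_LIMIT:
        raise ValueError("request exceeds the assumed context window")
    per_million = Decimal(1_000_000)
    return (
        input_tokens * ASSUMED_INPUT_USD_PER_M
        + output_tokens * ASSUMED_OUTPUT_USD_PER_M
    ) / per_million


if __name__ == "__main__":
    # 100,000 input tokens and 10,000 output tokens
    print(estimate_cost_usd(100_000, 10_000))  # 0.009
```

`Decimal` is used instead of `float` so that small per-request amounts add up exactly across many calls. Token counts for images depend on how the service tokenizes them, so in practice they should come from the usage figures the API returns rather than from a guess.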