Qwen3 VL Flash

Alibaba

Qwen3 VL Flash is a small visual understanding model in the Qwen3 series that supports both thinking and non-thinking modes. Compared with the October 15, 2025 snapshot, it delivers stronger general visual recognition and reasoning, and markedly higher recognition accuracy in business scenarios such as security, in-store inspection, equipment monitoring, and photo-based problem solving. This version is the snapshot of January 22, 2026.

Capabilities

Tool Use

Image Input

Technical Specifications

Context Window

262,144 tokens

Max Output

32,768 tokens

Pricing

Token Costs (per 1M tokens)

Cache Miss Input
≤ 32,000 input tokens: $0.05
≤ 128,000 input tokens: $0.075
> 128,000 input tokens: $0.12
Non-Reasoning Output
≤ 32,000 input tokens: $0.40
≤ 128,000 input tokens: $0.60
> 128,000 input tokens: $0.96
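
The tiered rates above can be turned into a per-request estimate. The following is a minimal sketch, not an official billing formula. It assumes that the tier is chosen by the request's input token count, that the whole request (input and output) is billed at that single tier's rates rather than marginally, and that no input tokens are served from cache. The names TIERS and estimate_cost are hypothetical.

```python
from decimal import Decimal

# (upper bound on input tokens, input USD per 1M, output USD per 1M).
# None means "no upper bound". Rates are copied from the pricing list above.
TIERS = [
    (32_000, Decimal("0.05"), Decimal("0.40")),
    (128_000, Decimal("0.075"), Decimal("0.60")),
    (None, Decimal("0.12"), Decimal("0.96")),
]

PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request.

    Assumption: the tier is selected by input_tokens and applies to both
    the input and the non-reasoning output of the same request.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    for upper, input_rate, output_rate in TIERS:
        if upper is None or input_tokens <= upper:
            total = input_tokens * input_rate + output_tokens * output_rate
            return total / PER_MILLION
    raise AssertionError("unreachable: the last tier has no upper bound")


if __name__ == "__main__":
    # 10,000 input tokens and 1,000 output tokens fall in the first tier.
    print(estimate_cost(10_000, 1_000))
```

Decimal is used instead of float so that small per-token amounts add up without rounding drift. If billing turns out to be marginal per tier, only the body of the loop would need to change.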

Tool Costs (per 1K calls)

Web Search

$15

Code Execution

$0.19
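
Because tool prices are quoted per 1,000 calls, a per-call figure needs a division by 1,000. A short sketch of that arithmetic follows. The dictionary keys and the tool_cost function are hypothetical names chosen for illustration, and the sketch assumes tool calls are billed linearly with no minimum or rounding.

```python
from decimal import Decimal

# USD per 1,000 calls, copied from the tool cost list above.
TOOL_PRICE_PER_1K_CALLS = {
    "web_search": Decimal("15"),
    "code_execution": Decimal("0.19"),
}


def tool_cost(calls: dict[str, int]) -> Decimal:
    """Sum the USD cost of tool calls, given a mapping of tool name to call count."""
    total = Decimal(0)
    for tool, count in calls.items():
        if count < 0:
            raise ValueError("call counts must be non-negative")
        # Price is per 1,000 calls, so scale by count / 1000.
        total += TOOL_PRICE_PER_1K_CALLS[tool] * count / 1000
    return total


if __name__ == "__main__":
    # 200 web searches plus 5,000 code executions.
    print(tool_cost({"web_search": 200, "code_execution": 5_000}))
```

At these rates a single web search costs $0.015, while a single code execution costs $0.00019, so web search dominates tool spend in most mixed workloads.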

Legacy

Made legacy on

Reason

Fast VL variant; superseded by Qwen3 VL 235B

Recommended Replacement

Qwen3 Max