Qwen Flash

Alibaba

The Qwen3 Flash model (snapshot 2025-07-28) combines thinking and non-thinking modes and can switch between them within a conversation. It is strong at complex reasoning and shows significant gains in instruction following and text comprehension. It supports a 1M-token context length, and pricing is tiered according to context usage.

Capabilities

Thinking

Tool Use

Technical Specifications

Context Window

1,000,000 tokens

Max Output

32,768 tokens

Pricing

Token Costs (per 1M tokens)

Cache Miss Input
≤ 256,000 input tokens: $0.05
> 256,000 input tokens: $0.25
Non-Reasoning Output
≤ 256,000 input tokens: $0.40
> 256,000 input tokens: $2.00
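
The tiered rates above can be turned into a rough per-request estimate. The sketch below is illustrative only: the function name estimate_token_cost is hypothetical, and it assumes that the tier is chosen by the request's input token count and that the whole request is then billed at that tier's rates. The listing does not spell out either point, and it gives no cache-hit or reasoning-output prices, so those are not covered.

```python
from decimal import Decimal

# Tier boundary and rates copied from the pricing listing (USD per 1M tokens).
TIER_BOUNDARY = 256_000
PER_MILLION = Decimal(1_000_000)

# (cache-miss input rate, non-reasoning output rate)
RATES_LOW = (Decimal("0.05"), Decimal("0.40"))   # input <= 256,000 tokens
RATES_HIGH = (Decimal("0.25"), Decimal("2"))     # input > 256,000 tokens


def estimate_token_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request.

    Assumption (not confirmed by the listing): the input token count
    selects the tier, and every token in the request uses that tier's rate.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens <= TIER_BOUNDARY:
        input_rate, output_rate = RATES_LOW
    else:
        input_rate, output_rate = RATES_HIGH
    return (input_tokens * input_rate + output_tokens * output_rate) / PER_MILLION


if __name__ == "__main__":
    print(estimate_token_cost(100_000, 2_000))  # lower tier
    print(estimate_token_cost(300_000, 2_000))  # upper tier
```

Decimal is used instead of float so that small per-token amounts do not pick up binary rounding noise when summed over many requests.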

Tool Costs (per 1K calls)

Web Search

$15

Code Execution

$0.19
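
Tool prices are quoted per 1,000 calls, so a per-call estimate is a simple proportion. The sketch below assumes calls are pro-rated linearly (the listing does not say whether billing is rounded to blocks of 1,000); the dictionary keys and the function name estimate_tool_cost are hypothetical labels, not API identifiers.

```python
from decimal import Decimal

# Prices copied from the listing, in USD per 1,000 calls.
# The keys are illustrative labels, not official tool identifiers.
TOOL_PRICE_PER_1K = {
    "web_search": Decimal("15"),
    "code_execution": Decimal("0.19"),
}


def estimate_tool_cost(tool: str, calls: int) -> Decimal:
    """Estimate the USD cost of `calls` invocations, assuming linear pro-rating."""
    if calls < 0:
        raise ValueError("calls must be non-negative")
    return TOOL_PRICE_PER_1K[tool] * calls / 1000


if __name__ == "__main__":
    print(estimate_tool_cost("web_search", 200))
    print(estimate_tool_cost("code_execution", 1_000))
```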

Legacy

Made legacy on

Reason

Old proprietary model; superseded by Qwen3 Max

Recommended Replacement

Qwen3 Max