Qwen Flash

Alibaba

The Qwen3 Flash model (snapshot 2025-07-28) combines thinking and non-thinking modes and can switch between them within a conversation. It excels at complex reasoning and shows significant gains in instruction following and text comprehension. It supports a 1M-token context window and uses tiered pricing, with rates that depend on how much context a request uses.

Capabilities

Tool Use

Extended Thinking

Example Use Cases

Fast, low-cost responses from an Alibaba model

Very long context with an Alibaba model

Simple to moderate tasks with an Alibaba model

Technical Specifications

Context Window

1,000,000 tokens

Max Output

32,768 tokens

Cache Miss Cost

$0.05 per 1M tokens

Non-Reasoning Cost

$0.40 per 1M tokens

Web Search Cost

$15 per 1K calls

Code Execution Cost

$0.19 per 1K calls
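
As a rough illustration of how the figures above combine, the following Python sketch estimates the cost of a single request. It rests on several assumptions that the listing does not confirm: that "Cache Miss Cost" is the rate for uncached input tokens, that "Non-Reasoning Cost" is the rate for output tokens produced in non-thinking mode, that input and output tokens share the context window, and that the listed rates apply at every context size (the model is billed in tiers, so longer contexts may cost more). The function name `estimate_cost_usd` is hypothetical.

```python
# Figures copied from the specifications above.
CONTEXT_WINDOW_TOKENS = 1_000_000
MAX_OUTPUT_TOKENS = 32_768
CACHE_MISS_USD_PER_M_TOKENS = 0.05     # ASSUMPTION: rate for uncached input tokens
NON_REASONING_USD_PER_M_TOKENS = 0.40  # ASSUMPTION: rate for non-thinking output tokens
WEB_SEARCH_USD_PER_K_CALLS = 15.00
CODE_EXECUTION_USD_PER_K_CALLS = 0.19


def estimate_cost_usd(input_tokens, output_tokens, web_searches=0, code_executions=0):
    """Estimate the cost of one non-thinking request in US dollars.

    Ignores tiered pricing and cached-input discounts, so treat the
    result as a lower-bound sketch rather than a bill.
    """
    if min(input_tokens, output_tokens, web_searches, code_executions) < 0:
        raise ValueError("counts must be non-negative")
    if output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError("output exceeds the 32,768-token maximum")
    # ASSUMPTION: input and output tokens both count against the window.
    if input_tokens + output_tokens > CONTEXT_WINDOW_TOKENS:
        raise ValueError("request exceeds the 1,000,000-token context window")

    token_cost = (
        input_tokens / 1_000_000 * CACHE_MISS_USD_PER_M_TOKENS
        + output_tokens / 1_000_000 * NON_REASONING_USD_PER_M_TOKENS
    )
    tool_cost = (
        web_searches / 1_000 * WEB_SEARCH_USD_PER_K_CALLS
        + code_executions / 1_000 * CODE_EXECUTION_USD_PER_K_CALLS
    )
    return token_cost + tool_cost


# Example: 200K input tokens, 2K output tokens, 3 web searches, 10 code runs.
print(f"${estimate_cost_usd(200_000, 2_000, 3, 10):.4f}")
```

Under these assumptions, tool calls can dominate the total: in the example, the three web searches cost more than all of the tokens combined.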

⚠️ Legacy

Made legacy on

Reason

Untested

Recommended Replacement

Qwen3 Max