Qwen3 4B

Alibaba

This Qwen3 hybrid reasoning model can switch between thinking and non-thinking modes within a single conversation, and reaches state-of-the-art reasoning performance among models of its size. It also shows significant improvements in human preference alignment, creative writing, role-playing, multi-turn dialogue, and instruction following, which together make for a better user experience.

Capabilities

Thinking

Technical Specifications

Context Window

131,072 tokens

Max Output

8,192 tokens
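
The two limits above interact when sizing a request. The following is a minimal sketch of how one might compute the largest output budget for a given prompt. It assumes that prompt and output tokens share the 131,072-token context window, which is common but not stated on this page; the function name max_output_budget is hypothetical.

```python
# Limits copied from the Technical Specifications above.
CONTEXT_WINDOW = 131_072  # total tokens
MAX_OUTPUT = 8_192        # maximum output tokens

def max_output_budget(prompt_tokens: int) -> int:
    """Return the largest output length that still fits.

    Assumption (not stated in the listing): prompt and output
    tokens are counted against the same context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # Never exceed the output cap, never go below zero.
    return max(0, min(MAX_OUTPUT, remaining))
```

For short prompts the output cap is the binding limit; only prompts longer than 122,880 tokens (131,072 minus 8,192) would reduce the budget under this assumption.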

Pricing

Token Costs (per 1M tokens)

Cache Miss Input

$0.11

Non-Reasoning Output

$0.42

Reasoning Output

$1.26
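
As a worked example of the rates above, here is a minimal sketch of a per-request cost estimate. It assumes that reasoning tokens and non-reasoning output tokens are metered separately at their listed rates, and that all input is billed at the cache-miss rate, since no cache-hit price is listed here. The function name estimate_cost and its parameters are hypothetical.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the pricing listed above.
PRICE_PER_MILLION = {
    "input_cache_miss": Decimal("0.11"),
    "output_non_reasoning": Decimal("0.42"),
    "output_reasoning": Decimal("1.26"),
}
ONE_MILLION = Decimal(1_000_000)

def estimate_cost(input_tokens: int, output_tokens: int,
                  reasoning_tokens: int = 0) -> Decimal:
    """Estimate the cost of one request in USD.

    Assumptions (not confirmed by the listing): every input token
    is a cache miss, and reasoning tokens are billed separately
    from the non-reasoning output tokens.
    """
    counts = {
        "input_cache_miss": input_tokens,
        "output_non_reasoning": output_tokens,
        "output_reasoning": reasoning_tokens,
    }
    # Decimal avoids binary floating-point rounding in currency math.
    return sum(
        (Decimal(n) * PRICE_PER_MILLION[kind] / ONE_MILLION
         for kind, n in counts.items()),
        Decimal(0),
    )

# 2,000 input, 500 answer, and 1,500 reasoning tokens.
print(estimate_cost(2_000, 500, 1_500))
```

Because reasoning output is priced at three times the non-reasoning rate, requests made in thinking mode can cost noticeably more than their visible answer length suggests.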

Legacy

Made legacy on

Reason

4B model; too small for reliable chat

Recommended Replacement

Qwen3.6 Plus