The Qwen3 hybrid reasoning model supports seamless switching between thinking and non-thinking modes within a conversation. It achieves state-of-the-art reasoning performance among models of its scale and significantly outperforms Qwen2.5-7B in general capabilities.
Local or budget deployment with reasoning
Efficient model for moderate tasks
Small-scale reasoning or coding
131,072 tokens
8,192 tokens
$0.18 per 1M tokens
$0.70 per 1M tokens
$2.10 per 1M tokens
$15 per 1K calls
$0.19 per 1K calls