This qwen3 hybrid reasoning model enables seamless switching between thinking and non-thinking modes during conversations, achieving SOTA reasoning performance at its scale. It shows significant improvements in human preference alignment, creative writing, role-playing, multi-turn dialogue, and instruction following, delivering a greatly enhanced user experience.
Try NowCompact model with surprisingly strong reasoning
Local deployment with thinking capability
Small model for creative writing or dialogue
131,072 tokens
8,192 tokens
$0.11 per 1M tokens
$0.42 per 1M tokens
$1.26 per 1M tokens
$15 per 1K calls
$0.19 per 1K calls