DeepHermes 3 (Mistral 24B Preview) is an instruction-tuned language model by Nous Research based on Mistral-Small-24B, designed for chat, function calling, and advanced multi-turn reasoning. It introduces a dual-mode system that toggles between intuitive chat responses and structured “deep reasoning” mode using special system prompts. Fine-tuned via distillation from R1, it supports structured output (JSON mode) and function call syntax for agent-based applications. DeepHermes 3 supports a reasoning toggle via system prompt, allowing users to switch between fast, intuitive responses and deliberate, multi-step reasoning. When activated with a specific system instruction, the model enters a deep thinking mode, generating extended chains of thought wrapped in `<think></think>` tags before delivering a final answer.
Try NowToggleable reasoning with tool use
Structured json output on a budget
Agent-based function calling
32,768 tokens
32,768 tokens
$0.02 per 1M tokens
$0.10 per 1M tokens
$0.01 per 1M tokens
$15 per 1K calls
$0.19 per 1K calls