DeepHermes 3 Mistral 24B

nousresearch

DeepHermes 3 (Mistral 24B Preview) is an instruction-tuned language model by Nous Research based on Mistral-Small-24B, designed for chat, function calling, and advanced multi-turn reasoning. It introduces a dual-mode system, toggled by a special system prompt, that lets users switch between fast, intuitive chat responses and deliberate, multi-step “deep reasoning”. When the reasoning system instruction is present, the model generates an extended chain of thought wrapped in `<think></think>` tags before delivering its final answer. Fine-tuned via distillation from R1, it also supports structured output (JSON mode) and function call syntax for agent-based applications.
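Because deep-thinking output places the chain of thought inside `<think></think>` tags ahead of the final answer, client code will usually want to separate the two. The following is a minimal sketch, not an official client: the helper name `split_reasoning` is hypothetical, and it assumes at most one `<think>` block appears at the start of the response.

```python
import re

# Matches the first <think>...</think> block, including newlines inside it.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(output: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a raw model response.

    Hypothetical helper: if no <think> block is present (intuitive chat
    mode), the reasoning part is empty and the whole text is the answer.
    """
    match = THINK_RE.search(output)
    if match is None:
        return "", output.strip()
    reasoning = match.group(1).strip()
    # Everything after the closing tag is treated as the final answer.
    answer = output[match.end():].strip()
    return reasoning, answer


reasoning, answer = split_reasoning(
    "<think>2 + 2 equals 4.</think>\nThe answer is 4."
)
print(reasoning)
print(answer)
```

Keeping the reasoning separate makes it straightforward to log it for debugging while showing only the final answer to end users.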

Capabilities

Tool Use

Extended Thinking

Example Use Cases

Toggleable reasoning with tool use

Structured JSON output on a budget

Agent-based function calling

Technical Specifications

Context Window

32,768 tokens

Max Output

32,768 tokens

Cache Miss Cost

$0.02 per 1M tokens

Non-Reasoning Cost

$0.10 per 1M tokens

Cache Read Cost

$0.01 per 1M tokens

Web Search Cost

$15 per 1K calls

Code Execution Cost

$0.19 per 1K calls
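As a rough illustration of how the per-token prices above combine, the sketch below estimates the cost of a single request. It rests on assumptions the listing does not confirm: that "Cache Miss Cost" applies to uncached input tokens, "Cache Read Cost" to cached input tokens, and "Non-Reasoning Cost" to output tokens. The function name `estimate_cost_usd` is hypothetical, and web search and code execution charges are left out.

```python
# Prices in USD per 1M tokens, copied from the specifications above.
# ASSUMPTION: the mapping of each listed price to a token type is a guess.
CACHE_MISS_PER_M = 0.02   # assumed: uncached input tokens
CACHE_READ_PER_M = 0.01   # assumed: cached input tokens
OUTPUT_PER_M = 0.10       # assumed: output tokens ("Non-Reasoning Cost")


def estimate_cost_usd(uncached_in: int, cached_in: int, out: int) -> float:
    """Estimate the token cost of one request in USD (hypothetical helper)."""
    return (
        uncached_in * CACHE_MISS_PER_M
        + cached_in * CACHE_READ_PER_M
        + out * OUTPUT_PER_M
    ) / 1_000_000


# Example: 20,000 uncached input tokens, 10,000 cached, 4,000 output tokens.
cost = estimate_cost_usd(20_000, 10_000, 4_000)
print(f"${cost:.6f}")
```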

⚠️ Legacy

Made legacy on

Reason

Untested

Recommended Replacement

Qwen3.5 Plus