DeepHermes 3 Mistral 24B

Nous Research

DeepHermes 3 (Mistral 24B Preview) is an instruction-tuned language model from Nous Research, based on Mistral-Small-24B and designed for chat, function calling, and advanced multi-turn reasoning. It was fine-tuned via distillation from R1 and supports structured output (JSON mode) and function-call syntax for agent-based applications. A reasoning toggle in the system prompt lets users switch between fast, intuitive responses and deliberate, multi-step reasoning: when activated with a specific system instruction, the model enters a deep thinking mode and generates an extended chain of thought wrapped in `<think></think>` tags before delivering its final answer.
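Because the reasoning arrives inline in `<think></think>` tags, client code usually needs to separate it from the final answer. The following is a minimal sketch of one way to do that, assuming the raw completion is available as a plain string. The helper name `split_reasoning` and the sample text are hypothetical and are not part of any Nous Research API.

```python
import re

# Matches the first <think>...</think> block, including newlines inside it.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from a raw completion string.

    If no <think> block is present (intuitive mode), reasoning is empty
    and the whole text is treated as the answer.
    """
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the think block is treated as the final answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


# Hypothetical completion text, used only for illustration.
sample = "<think>2 + 2 is 4.</think>The answer is 4."
reasoning, answer = split_reasoning(sample)
print("reasoning:", reasoning)
print("answer:", answer)
```

The sketch handles only a single, well-formed think block. A response that is truncated before the closing tag would be returned whole as the answer, so production code may want to treat that case explicitly.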

Capabilities

Thinking

Tool Use

Technical Specifications

Context Window

32,768 tokens

Max Output

32,768 tokens

Pricing

Token Costs (per 1M tokens)

Cache Miss Input

$0.02

Non-Reasoning Output

$0.10

Cache Read Input

$0.01

Tool Costs (per 1K calls)

Web Search

$15

Code Execution

$0.19
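To show how the per-token and per-call prices combine, here is a rough cost estimator built from the figures listed above. It is a sketch under stated assumptions: the function name `estimate_cost` and its parameters are hypothetical, and all output tokens are billed at the listed non-reasoning rate because no separate reasoning-output rate appears here.

```python
# Prices copied from the listing above (USD).
PRICE_PER_M_TOKENS = {
    "cache_miss_input": 0.02,
    "cache_read_input": 0.01,
    "output": 0.10,  # listed as "Non-Reasoning Output"
}
PRICE_PER_K_CALLS = {
    "web_search": 15.00,
    "code_execution": 0.19,
}


def estimate_cost(
    cache_miss_input: int = 0,
    cache_read_input: int = 0,
    output: int = 0,
    web_search: int = 0,
    code_execution: int = 0,
) -> float:
    """Estimate total cost in USD from token counts and tool-call counts."""
    token_cost = (
        cache_miss_input * PRICE_PER_M_TOKENS["cache_miss_input"]
        + cache_read_input * PRICE_PER_M_TOKENS["cache_read_input"]
        + output * PRICE_PER_M_TOKENS["output"]
    ) / 1_000_000  # token prices are per 1M tokens
    tool_cost = (
        web_search * PRICE_PER_K_CALLS["web_search"]
        + code_execution * PRICE_PER_K_CALLS["code_execution"]
    ) / 1_000  # tool prices are per 1K calls
    return token_cost + tool_cost


# Example workload: 1M uncached input tokens, 500K cached input tokens,
# 200K output tokens, 100 web searches, 1,000 code executions.
total = estimate_cost(
    cache_miss_input=1_000_000,
    cache_read_input=500_000,
    output=200_000,
    web_search=100,
    code_execution=1_000,
)
print(f"Estimated cost: ${total:.4f}")
```

In this example workload the tool calls dominate the total: the 100 web searches alone cost $1.50, against $0.045 for all tokens combined.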

Retired

Made legacy on

Reason

24B Mistral fine-tune; base model superseded

Recommended Replacement

Qwen3.6 Plus

Retired on