Trinity Mini

arcee

Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model built for efficient long-context inference, function calling, and multi-step agent workflows. Its 128K context window (131,072 tokens) supports coherent multi-turn reasoning and reliable tool use at a low per-token cost. It is suited to production deployments where speed and cost efficiency matter most.

Capabilities

Tool Use

Example Use Cases

Fast inference on a budget

Function calling and agent workflows

Long-context processing with minimal compute

Technical Specifications

Context Window: 131,072 tokens

Max Output: 131,072 tokens

Cache Miss Cost: $0.045 per 1M tokens

Non-Reasoning Cost: $0.15 per 1M tokens

Web Search Cost: $15 per 1K calls

Code Execution Cost: $0.19 per 1K calls
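
The listed prices can be combined into a rough per-request cost estimate. The Python sketch below is illustrative only: the function name `estimate_cost_usd` and its parameters are hypothetical, and it assumes that the Cache Miss Cost applies to uncached input tokens and the Non-Reasoning Cost applies to output tokens, which this page does not state explicitly.

```python
# Prices copied from the Technical Specifications on this page.
CACHE_MISS_USD_PER_1M_TOKENS = 0.045     # ASSUMPTION: rate for uncached input tokens
NON_REASONING_USD_PER_1M_TOKENS = 0.15   # ASSUMPTION: rate for output tokens
WEB_SEARCH_USD_PER_1K_CALLS = 15.0
CODE_EXECUTION_USD_PER_1K_CALLS = 0.19


def estimate_cost_usd(input_tokens, output_tokens, web_searches=0, code_executions=0):
    """Return an approximate request cost in USD (hypothetical helper)."""
    counts = (input_tokens, output_tokens, web_searches, code_executions)
    if any(n < 0 for n in counts):
        raise ValueError("token and call counts must be non-negative")

    # Token prices are quoted per million tokens.
    token_cost = (
        input_tokens / 1_000_000 * CACHE_MISS_USD_PER_1M_TOKENS
        + output_tokens / 1_000_000 * NON_REASONING_USD_PER_1M_TOKENS
    )
    # Tool prices are quoted per thousand calls.
    tool_cost = (
        web_searches / 1_000 * WEB_SEARCH_USD_PER_1K_CALLS
        + code_executions / 1_000 * CODE_EXECUTION_USD_PER_1K_CALLS
    )
    return token_cost + tool_cost


if __name__ == "__main__":
    # 100,000 input tokens, 2,000 output tokens, and one web search.
    cost = estimate_cost_usd(100_000, 2_000, web_searches=1)
    print(f"{cost:.4f}")  # prints 0.0198
```

In this example the single web search ($0.015) costs more than all of the tokens combined ($0.0048), so under these assumptions tool calls can dominate the bill for search-heavy agent workflows.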

⚠️ Legacy

Made legacy on

Reason: Untested

Recommended Replacement: Qwen3 Max