ERNIE 4.5 21B A3B

baidu

A text-based Mixture-of-Experts (MoE) model with 21B total parameters and 3B activated per token. It builds on the ERNIE 4.5 family's heterogeneous MoE design, including modality-isolated routing and specialized routing and balancing losses, and supports a 131K-token context length. Efficient inference comes from multi-expert parallel collaboration and quantization, while post-training with SFT, DPO, and UPO tunes the model for a broad range of text understanding and generation tasks.
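
The architecture reads like a standard top-k MoE layer: a router scores each token against a pool of experts, only a few experts run per token (which is why only about 3B of the 21B parameters are active), and a balancing loss keeps expert utilization even. The PyTorch sketch below is a generic illustration of that pattern; the layer sizes, expert count, top-k value, and loss weighting are illustrative assumptions, not ERNIE 4.5's actual configuration.

```python
# Minimal sketch of top-k MoE routing with an auxiliary load-balancing loss.
# All dimensions and hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=1024, n_experts=8, top_k=2):
        super().__init__()
        self.n_experts, self.top_k = n_experts, top_k
        self.router = nn.Linear(d_model, n_experts)  # token -> expert logits
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                                      # x: [tokens, d_model]
        probs = F.softmax(self.router(x), dim=-1)              # routing probabilities
        weights, idx = probs.topk(self.top_k, dim=-1)          # pick top-k experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize selected weights
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = (idx == e)                                  # tokens routed to expert e
            if mask.any():
                rows = mask.any(dim=-1)
                w = (weights * mask.float()).sum(dim=-1, keepdim=True)[rows]
                out[rows] += w * expert(x[rows])
        # Auxiliary balancing loss: encourage uniform expert utilization.
        load = F.one_hot(idx, self.n_experts).float().mean(dim=(0, 1))  # fraction of routing slots per expert
        importance = probs.mean(dim=0)                                  # mean router probability per expert
        aux_loss = self.n_experts * (load * importance).sum()
        return out, aux_loss

tokens = torch.randn(16, 512)
y, aux = TopKMoE()(tokens)
print(y.shape, aux.item())
```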

Capabilities

Tool Use

Example Use Cases

Budget Chinese-English text tasks

Efficient MoE with tool use (see the request sketch after this list)

Lightweight general text generation
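
Since the model advertises tool use, a typical integration goes through an OpenAI-compatible chat-completions endpoint with a `tools` array. The sketch below is a hypothetical client call: the base URL, model identifier string, and the `get_weather` function are assumptions for illustration, not values taken from this page.

```python
# Hypothetical tool-use request through an OpenAI-compatible endpoint.
# Base URL, API key handling, and the model ID string are assumptions.
from openai import OpenAI

client = OpenAI(base_url="https://example-provider.com/v1", api_key="YOUR_API_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative tool, not something this page defines
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="ernie-4.5-21b-a3b",  # assumed identifier; check the provider's model list
    messages=[{"role": "user", "content": "What's the weather in Shanghai?"}],
    tools=tools,
)

# If the model decides to call the tool, the call arrives as structured JSON arguments.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```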

Technical Specifications

Context Window

120,000 tokens

Max Output

8,000 tokens

Cache Miss Cost

$0.07 per 1M tokens

Non-Reasoning Cost

$0.28 per 1M tokens

Web Search Cost

$15 per 1K calls

Code Execution Cost

$0.19 per 1K calls
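
The per-token and per-call prices above can be combined into a rough cost estimate for a planned workload. The helper below is a hypothetical calculator using the figures from this table; it assumes the cache-miss rate applies to input tokens and the non-reasoning rate to output tokens, which may differ from how the provider actually bills.

```python
# Hypothetical cost estimator using the prices listed above.
# Assumption: cache-miss rate covers input tokens, non-reasoning rate covers
# output tokens; adjust if the provider bills differently.
CACHE_MISS_PER_M = 0.07       # $ per 1M input tokens (cache miss)
NON_REASONING_PER_M = 0.28    # $ per 1M output tokens (non-reasoning)
WEB_SEARCH_PER_K = 15.0       # $ per 1K web search calls
CODE_EXEC_PER_K = 0.19        # $ per 1K code execution calls

def estimate_cost(input_tokens, output_tokens, web_searches=0, code_executions=0):
    """Return the estimated dollar cost of a workload."""
    return (
        input_tokens / 1_000_000 * CACHE_MISS_PER_M
        + output_tokens / 1_000_000 * NON_REASONING_PER_M
        + web_searches / 1_000 * WEB_SEARCH_PER_K
        + code_executions / 1_000 * CODE_EXEC_PER_K
    )

# Example: 1,000 requests, each with ~4K input tokens and ~500 output tokens.
print(f"${estimate_cost(4_000 * 1_000, 500 * 1_000):.2f}")  # -> $0.42
```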

⚠️ Legacy

Made legacy on

Reason

Untested

Recommended Replacement

Qwen3.5 Plus