GLM 4.5 Air

Zai

A streamlined, agent-focused model built on a Mixture-of-Experts (MoE) architecture. Of its 106B total parameters, only 12B are active per token, which keeps inference fast and cost-effective. It is purpose-built for agentic applications, with an emphasis on tool use and autonomous workflows, and its thinking capability makes the reasoning behind its decisions visible. A 128K-token context window and up to 96K output tokens leave room for long, multi-step tasks. It suits production agent systems that need reliability and efficiency at low cost.

Capabilities

Thinking

Tool Use

Technical Specifications

Context Window: 128,000 tokens

Max Output: 96,000 tokens
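
The two limits above interact when sizing a request. The following is a minimal sketch, assuming that prompt and output tokens share the 128,000-token context window (the page does not state this, so treat it as an assumption). The function name `max_output_budget` is hypothetical and not part of any API.

```python
# Limits copied from the Technical Specifications above.
CONTEXT_WINDOW = 128_000  # tokens
MAX_OUTPUT = 96_000       # tokens


def max_output_budget(prompt_tokens: int) -> int:
    """Return the largest output size that still fits for a given prompt.

    ASSUMPTION: prompt and output tokens count against the same
    context window. If the provider counts them separately, the
    answer is simply MAX_OUTPUT.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # Never exceed the output cap, and never go below zero.
    return max(0, min(MAX_OUTPUT, remaining))


if __name__ == "__main__":
    for prompt in (10_000, 100_000, 130_000):
        print(prompt, max_output_budget(prompt))
```

Under that assumption, the output cap is the binding limit for prompts of up to 32,000 tokens. For longer prompts, the remaining context is the limit.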

Pricing

Token Costs (per 1M tokens)

Cache Miss Input: $0.20

Non-Reasoning Output: $1.10

Tool Costs (per 1K calls)

Web Search: $15

Code Execution: $0.19
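
To make the rates above concrete, here is a rough cost estimator. It is a sketch under stated assumptions: it uses only the four rates listed on this page, so cached input and reasoning output (whose rates are not listed) are not modeled. The function name `estimate_cost` and its parameters are hypothetical.

```python
from decimal import Decimal

# Rates copied from the pricing list above, in USD.
INPUT_PER_M = Decimal("0.20")     # cache-miss input, per 1M tokens
OUTPUT_PER_M = Decimal("1.10")    # non-reasoning output, per 1M tokens
WEB_SEARCH_PER_K = Decimal("15")  # web search, per 1K calls
CODE_EXEC_PER_K = Decimal("0.19") # code execution, per 1K calls


def estimate_cost(input_tokens: int, output_tokens: int,
                  web_searches: int = 0, code_runs: int = 0) -> Decimal:
    """Estimate the USD cost of one workload from the listed rates.

    ASSUMPTION: every input token is a cache miss and every output
    token is billed at the non-reasoning rate.
    """
    token_cost = (INPUT_PER_M * input_tokens
                  + OUTPUT_PER_M * output_tokens) / Decimal(1_000_000)
    tool_cost = (WEB_SEARCH_PER_K * web_searches
                 + CODE_EXEC_PER_K * code_runs) / Decimal(1_000)
    return token_cost + tool_cost


if __name__ == "__main__":
    # 50K input tokens, 4K output tokens, 2 web searches, 1 code run.
    print(estimate_cost(50_000, 4_000, web_searches=2, code_runs=1))
```

In this example the two web searches ($0.03) cost more than all the tokens combined ($0.0144), which shows that tool calls can dominate the bill in agent workloads.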

Legacy

Made legacy on

Reason: Outdated model

Recommended Replacement: GLM 5.1