GLM 4.5 AirX

Zai

The high-speed variant of GLM-4.5-Air, delivering ultra-fast response times while maintaining strong performance. With 106B total parameters and 12B active per forward pass, it pairs the efficiency of the Air architecture with optimized inference that exceeds 100 tokens per second. Suited to low-latency production deployments that need both speed and agent capabilities.

Capabilities

Thinking

Tool Use

Example Use Cases

Fast GLM agent model

Low-latency agentic workflows

Lightweight performance with ultra-fast response

Technical Specifications

Context Window

128,000 tokens

Max Output

96,000 tokens
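
As a rough illustration of how the two limits above interact, the sketch below computes how many output tokens a request could still ask for given a prompt size. It assumes that prompt and output tokens share the 128,000-token context window, which this page does not state explicitly; the function name `max_output_budget` is hypothetical and not part of any SDK.

```python
# Limits copied from the specifications above.
CONTEXT_WINDOW = 128_000  # tokens
MAX_OUTPUT = 96_000       # tokens


def max_output_budget(prompt_tokens: int) -> int:
    """Return the largest output length a request could ask for.

    Assumption (not confirmed by this page): the prompt and the
    generated output together must fit inside the context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # The budget is capped by the output limit and can never be negative.
    return max(0, min(MAX_OUTPUT, remaining))
```

Under that assumption, a 10,000-token prompt leaves the full 96,000-token output limit available, while a 100,000-token prompt leaves room for only 28,000 output tokens.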

Pricing

Token Costs (per 1M tokens)

Cache Miss Input

$1.10

Non-Reasoning Output

$4.50

Cache Read Input

$0.22

Tool Costs (per 1K calls)

Web Search

$15

Code Execution

$0.19
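
To make the rates above concrete, here is a minimal cost-estimation sketch. It uses only the prices listed on this page; the function name `estimate_cost` and its parameters are hypothetical, and it bills every output token at the non-reasoning rate because no separate reasoning-output price is listed here.

```python
# Token prices copied from this page, in USD per 1M tokens.
CACHE_MISS_INPUT_PER_M = 1.10
CACHE_READ_INPUT_PER_M = 0.22
NON_REASONING_OUTPUT_PER_M = 4.50

# Tool prices copied from this page, in USD per 1K calls.
WEB_SEARCH_PER_K = 15.00
CODE_EXECUTION_PER_K = 0.19


def estimate_cost(
    uncached_input_tokens: int,
    cached_input_tokens: int,
    output_tokens: int,
    web_searches: int = 0,
    code_executions: int = 0,
) -> float:
    """Estimate the USD cost of a workload from the listed rates.

    Assumption: all output tokens are billed at the non-reasoning rate.
    """
    token_cost = (
        uncached_input_tokens * CACHE_MISS_INPUT_PER_M
        + cached_input_tokens * CACHE_READ_INPUT_PER_M
        + output_tokens * NON_REASONING_OUTPUT_PER_M
    ) / 1_000_000
    tool_cost = (
        web_searches * WEB_SEARCH_PER_K
        + code_executions * CODE_EXECUTION_PER_K
    ) / 1_000
    return token_cost + tool_cost
```

For example, `estimate_cost(200_000, 800_000, 100_000, web_searches=10)` comes to approximately $0.996: $0.22 for uncached input, $0.176 for cached input, $0.45 for output, and $0.15 for the ten web searches.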

Legacy

Made legacy on

Reason

Superseded by GLM 5

Recommended Replacement

GLM 5.1