GLM 4.5V

zai

A visual reasoning model built on a Mixture-of-Experts (MoE) architecture with 106B total parameters, of which 12B are active per token. Reported as state-of-the-art among open-source vision-language models (VLMs) of its scale across image, video, document understanding, and GUI tasks. Includes a thinking mode toggle for trading response speed against reasoning depth. Strong at generating webpage code from screenshots, object detection, document parsing, and long video analysis.

Capabilities

Tool Use

Extended Thinking

Image Input

PDF Input

Example Use Cases

Visual reasoning with an open-source VLM

Image and video understanding with thinking

Document analysis and GUI tasks

Technical Specifications

Context Window

64,000 tokens

Max Output

16,000 tokens

Cache Miss Cost

$0.60 per 1M tokens

Non-Reasoning Cost

$1.80 per 1M tokens

Cache Read Cost

$0.11 per 1M tokens

Web Search Cost

$15 per 1K calls

Code Execution Cost

$0.19 per 1K calls
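
The rates and limits above can be combined into a rough per-request estimate. The following is a minimal sketch, not an official billing formula. It assumes that "Cache Miss Cost" applies to input tokens not served from cache, "Cache Read Cost" to input tokens served from cache, and "Non-Reasoning Cost" to output tokens, and that output tokens count against the 64,000-token context window. None of these mappings is stated on this page. The names `Usage`, `estimate_cost`, and `fits_limits` are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass

# Rates and limits copied from the specification list above (USD).
CACHE_MISS_PER_M = 0.60      # per 1M tokens; assumed: uncached input tokens
CACHE_READ_PER_M = 0.11      # per 1M tokens; assumed: cached input tokens
NON_REASONING_PER_M = 1.80   # per 1M tokens; assumed: output tokens
WEB_SEARCH_PER_K = 15.00     # per 1K calls
CODE_EXEC_PER_K = 0.19       # per 1K calls
CONTEXT_WINDOW = 64_000      # tokens
MAX_OUTPUT = 16_000          # tokens


@dataclass
class Usage:
    """Hypothetical usage record for a single request."""
    uncached_input_tokens: int = 0
    cached_input_tokens: int = 0
    output_tokens: int = 0
    web_search_calls: int = 0
    code_exec_calls: int = 0


def estimate_cost(u: Usage) -> float:
    """Return an estimated cost in USD under the assumptions stated above."""
    token_cost = (
        u.uncached_input_tokens * CACHE_MISS_PER_M
        + u.cached_input_tokens * CACHE_READ_PER_M
        + u.output_tokens * NON_REASONING_PER_M
    ) / 1_000_000
    call_cost = (
        u.web_search_calls * WEB_SEARCH_PER_K
        + u.code_exec_calls * CODE_EXEC_PER_K
    ) / 1_000
    return token_cost + call_cost


def fits_limits(input_tokens: int, requested_output_tokens: int) -> bool:
    """Check a request against the listed limits.

    Assumes output tokens share the context window with the input.
    """
    if requested_output_tokens > MAX_OUTPUT:
        return False
    return input_tokens + requested_output_tokens <= CONTEXT_WINDOW


if __name__ == "__main__":
    usage = Usage(
        uncached_input_tokens=20_000,
        cached_input_tokens=30_000,
        output_tokens=2_000,
        web_search_calls=1,
    )
    print(f"Estimated cost: ${estimate_cost(usage):.4f}")
    print("Fits limits:", fits_limits(50_000, 2_000))
```

At these rates a single web search call ($0.015) costs about as much as 25,000 uncached input tokens, so tool calls can dominate the cost of short requests.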

⚠️ Legacy

Made legacy on

Reason

Superseded by GLM 5

Recommended Replacement

GLM 5