GLM 4.6V Flash

zai

A free multimodal model from the GLM-4.6V series with native function calling support. It handles image, video, and document understanding and supports tool invocation for building multimodal agents. Its 128K-token context window suits visual understanding workflows, and tokens are billed at $0; web search and code execution calls are priced separately, as listed under Technical Specifications.

Capabilities

Tool Use

Image Input

PDF Input

Example Use Cases

Applications that need a free vision model

Budget multimodal tasks with tool use

Zero-cost image understanding

Technical Specifications

Context Window

128,000 tokens

Max Output

24,000 tokens

Cache Miss Cost

$0 per 1M tokens

Non-Reasoning Cost

$0 per 1M tokens

Web Search Cost

$15 per 1K calls

Code Execution Cost

$0.19 per 1K calls
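
The listed prices can be turned into a quick estimate. The sketch below is illustrative only: the function names `estimate_cost` and `fits_limits` are hypothetical, the prices and limits are copied from the specifications above, and the limit check assumes that prompt and output tokens share the 128,000-token context window, which this page does not state.

```python
# Prices and limits copied from the specifications above (USD).
TOKEN_COST_PER_1M = 0.0        # cache-miss and non-reasoning tokens
WEB_SEARCH_COST_PER_1K = 15.0  # per 1,000 web search calls
CODE_EXEC_COST_PER_1K = 0.19   # per 1,000 code execution calls

CONTEXT_WINDOW = 128_000       # tokens
MAX_OUTPUT = 24_000            # tokens


def estimate_cost(tokens: int, web_searches: int = 0, code_executions: int = 0) -> float:
    """Return the estimated charge in USD for a given amount of usage."""
    token_cost = tokens * TOKEN_COST_PER_1M / 1_000_000
    search_cost = web_searches * WEB_SEARCH_COST_PER_1K / 1_000
    exec_cost = code_executions * CODE_EXEC_COST_PER_1K / 1_000
    return token_cost + search_cost + exec_cost


def fits_limits(prompt_tokens: int, requested_output_tokens: int) -> bool:
    """Check a request against the listed limits.

    Assumption: prompt and output tokens together must fit in the
    context window. The page does not confirm how the two limits interact.
    """
    if requested_output_tokens > MAX_OUTPUT:
        return False
    return prompt_tokens + requested_output_tokens <= CONTEXT_WINDOW


if __name__ == "__main__":
    # 5M tokens, 200 web searches, 1,000 code executions
    print(f"${estimate_cost(5_000_000, web_searches=200, code_executions=1_000):.2f}")
    print(fits_limits(100_000, 24_000))
```

In this sketch token volume never affects the total, so any charge comes from tool calls alone; at the listed rates a single web search costs far more than a single code execution.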

⚠️ Legacy

Made legacy on

Reason

Superseded by GLM 5

Recommended Replacement

GLM 5