GLM 5V Turbo

Zai

GLM-5V-Turbo is Zai's multimodal coding foundation model with vision understanding across images, video, and files. It is optimized for agentic scenarios, including frontend code generation from design mockups, autonomous GUI exploration, and visual debugging. It has a 200K-token context window and a 128K-token maximum output, and it supports thinking mode and tool invocation. Its strengths include document comprehension across PDFs, video object tracking, and webpage recreation from screenshots.

Capabilities

Thinking

Tool Use

Image Input

PDF Input

Example Use Cases

Frontend code generation from design mockups

Visual debugging and GUI automation

Document and video understanding with agentic workflows

Technical Specifications

Context Window

200,000 tokens

Max Output

128,000 tokens
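
The two limits above can be combined into a simple pre-flight check. The sketch below is illustrative only: `output_budget` is a hypothetical helper, not part of any Zai SDK, and it assumes that prompt and output tokens share the 200,000-token context window, which this listing does not state.

```python
CONTEXT_WINDOW_TOKENS = 200_000  # from the specifications above
MAX_OUTPUT_TOKENS = 128_000      # from the specifications above


def output_budget(prompt_tokens: int, requested_output_tokens: int) -> int:
    """Return the largest output-token request that fits both listed limits.

    Assumption (not confirmed by the listing): prompt and output tokens
    are counted against the same context window.
    """
    if prompt_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    remaining = CONTEXT_WINDOW_TOKENS - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt alone fills or exceeds the context window")
    # The request is capped by the max-output limit and by the space left.
    return min(requested_output_tokens, MAX_OUTPUT_TOKENS, remaining)
```

Under that assumption, a 150,000-token prompt leaves room for at most 50,000 output tokens, while a 10,000-token prompt is capped by the 128,000-token output limit instead.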

Pricing

Token Costs (per 1M tokens)

Cache Miss Input

$1.20

Non-Reasoning Output

$4.00

Cache Read Input

$0.24

Tool Costs (per 1K calls)

Web Search

$15

Code Execution

$0.19
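
As a rough illustration of how these prices add up, the sketch below estimates the cost of a workload from the figures listed above. `estimate_cost` and the dictionary keys are hypothetical names chosen for this example. The listing gives only a non-reasoning output price, so reasoning output is left out rather than guessed.

```python
from decimal import Decimal
from typing import Dict, Optional

# USD prices copied from the listing above.
TOKEN_PRICE_PER_MILLION = {
    "cache_miss_input": Decimal("1.20"),
    "cache_read_input": Decimal("0.24"),
    "non_reasoning_output": Decimal("4"),
}
TOOL_PRICE_PER_THOUSAND_CALLS = {
    "web_search": Decimal("15"),
    "code_execution": Decimal("0.19"),
}


def estimate_cost(
    tokens: Dict[str, int],
    tool_calls: Optional[Dict[str, int]] = None,
) -> Decimal:
    """Estimate the USD cost of a workload from the listed prices.

    Keys must match the price tables above; an unknown key raises KeyError.
    """
    total = Decimal(0)
    for kind, count in tokens.items():
        # Token prices are quoted per 1,000,000 tokens.
        total += Decimal(count) * TOKEN_PRICE_PER_MILLION[kind] / Decimal(1_000_000)
    for tool, calls in (tool_calls or {}).items():
        # Tool prices are quoted per 1,000 calls.
        total += Decimal(calls) * TOOL_PRICE_PER_THOUSAND_CALLS[tool] / Decimal(1_000)
    return total


example = estimate_cost(
    {"cache_miss_input": 100_000, "cache_read_input": 50_000, "non_reasoning_output": 10_000},
    {"web_search": 20, "code_execution": 100},
)
print(f"Estimated cost: ${example:.3f}")
```

For this example workload the token charges come to $0.172 and the tool charges to $0.319, for a total of $0.491. `Decimal` is used instead of floats so that small per-token amounts add up without rounding drift.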