GLM-5V-Turbo is Zai's multimodal coding foundation model with strong vision understanding; it accepts images, video, and files as input. It is optimized for agentic scenarios, including frontend code generation from design mockups, autonomous GUI exploration, and visual debugging. The model has a 200K-token context window and a 128K-token maximum output, and it supports thinking mode and tool invocation. It excels at document comprehension across PDFs, video object tracking, and webpage recreation from screenshots.
Frontend code generation from design mockups
Visual debugging and GUI automation
Document and video understanding with agentic workflows
Context window: 200,000 tokens
Max output: 128,000 tokens
$1.20
$4
$0.24
$15
$0.19
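
The context-window and max-output figures above can be turned into a simple budget check before sending a request. The following is a minimal sketch, not part of any official SDK: the function name `output_budget` is hypothetical, and it assumes that prompt tokens and generated tokens share the same 200,000-token window, which the listing does not state explicitly.

```python
# Limits taken from the listing above.
CONTEXT_WINDOW = 200_000  # total tokens the model can attend to
MAX_OUTPUT = 128_000      # upper bound on generated tokens


def output_budget(prompt_tokens: int, requested: int = MAX_OUTPUT) -> int:
    """Return the largest output-token limit that fits the request.

    Assumption (not confirmed by the listing): prompt and output tokens
    count against the same context window.
    """
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    if remaining <= 0:
        return 0  # the prompt alone fills (or exceeds) the window
    return min(requested, MAX_OUTPUT, remaining)
```

With a 100,000-token prompt, for example, the sketch caps output at the 100,000 tokens left in the window rather than the full 128,000.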