GLM-5V-Turbo is Zai's multimodal coding foundation model with powerful vision understanding capabilities, supporting image, video, and file inputs. It is optimized for agentic scenarios, including frontend code generation from design mockups, autonomous GUI exploration, and visual debugging. It features a 200K context window and a 128K maximum output, with support for thinking mode and tool invocation. It excels at document comprehension across PDFs, video object tracking, and webpage recreation from screenshots.
Context window: 200,000 tokens
Max output: 128,000 tokens
$1.20
$4
$0.24
$15
$0.19
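
As a rough illustration of how the 200,000-token context window and 128,000-token output cap might interact, the sketch below estimates how many output tokens remain for a given prompt size. It assumes prompt and output tokens share the same context window, which the listing does not confirm; `output_budget` is a hypothetical helper, not part of any Zai SDK.

```python
# Limits taken from the listing above.
CONTEXT_WINDOW = 200_000  # total tokens
MAX_OUTPUT = 128_000      # maximum tokens in a single response


def output_budget(prompt_tokens: int, requested_output: int = MAX_OUTPUT) -> int:
    """Return the largest output length that fits alongside the prompt.

    ASSUMPTION: prompt and output tokens count against the same
    context window. Check the provider's documentation before relying on it.
    """
    if prompt_tokens < 0 or requested_output < 0:
        raise ValueError("token counts must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # Cap by the request, the per-response limit, and what is left in the window.
    return max(0, min(requested_output, MAX_OUTPUT, remaining))


if __name__ == "__main__":
    for prompt in (50_000, 100_000, 250_000):
        print(prompt, output_budget(prompt))
```

Under that assumption, a prompt that is small enough leaves the full output cap available, while a longer prompt shrinks the remaining output budget and an oversized one leaves none.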