A lightweight, high-speed multimodal model from the GLM-4.6V series with native function calling and thinking mode support. Delivers fast visual understanding at a fraction of the cost of the flagship GLM-4.6V while maintaining strong capabilities across image, video, and document tasks. Ideal for production multimodal agents requiring low latency and affordable pricing.
Need a fast, affordable vision model
Lightweight multimodal agent tasks
High-speed image understanding with tools
128,000 tokens
24,000 tokens
$0.04 per 1M tokens
$0.40 per 1M tokens
$0.004 per 1M tokens
$15 per 1K calls
$0.19 per 1K calls
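
The per-token figures above can be turned into a rough per-request cost estimate. The listing does not label its values, so the following is only a sketch under stated assumptions: it treats $0.04 per 1M tokens as the input rate, $0.40 per 1M tokens as the output rate, 128,000 tokens as the input limit, and 24,000 tokens as the output limit. The function name `estimate_cost` and all constant names are hypothetical and not part of any official SDK.

```python
# Assumed mapping of the unlabeled listing values (not confirmed by the source):
INPUT_USD_PER_M = 0.04     # assumed: input price per 1M tokens
OUTPUT_USD_PER_M = 0.40    # assumed: output price per 1M tokens
INPUT_LIMIT = 128_000      # assumed: maximum input (context) tokens
OUTPUT_LIMIT = 24_000      # assumed: maximum output tokens


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return an estimated USD cost for one request under the assumed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens > INPUT_LIMIT:
        raise ValueError("input exceeds the assumed 128,000-token limit")
    if output_tokens > OUTPUT_LIMIT:
        raise ValueError("output exceeds the assumed 24,000-token limit")
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a 100,000-token prompt with a 10,000-token reply.
    print(f"${estimate_cost(100_000, 10_000):.4f}")
```

If the actual labels differ from these assumptions, only the constants need to change. The per-call prices in the listing are left out because the source does not say what kind of call they apply to.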