A compact 8B multimodal model excelling at a variety of critical benchmarks for language, text, and image capabilities. Focused on low latency and best-in-class performance with image understanding across multiple languages.
Try Now16,000 tokens
4,000 tokens
$0.50
$1.50
8B vision model; too small for production