A 12B-parameter model that understands images in addition to text.
128,000 tokens
4,000 tokens
$0.15 per 1M tokens
8 files
Poor tool-calling capabilities
$0 per month