Aya Vision 8B

Cohere

A compact 8B-parameter multimodal model that performs strongly on benchmarks covering language, text, and image capabilities. It is designed for low latency and best-in-class image understanding across multiple languages.

Capabilities

Image Input

Technical Specifications

Context Window

16,000 tokens

Max Output

4,000 tokens
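
The two limits above can be turned into a simple prompt budget. The sketch below is illustrative only: it assumes the prompt and the requested output share the 16,000-token context window, which this page does not state, and the function name `max_prompt_tokens` is hypothetical rather than part of any Cohere SDK.

```python
# Limits taken from the Technical Specifications listed above.
CONTEXT_WINDOW_TOKENS = 16_000
MAX_OUTPUT_TOKENS = 4_000


def max_prompt_tokens(requested_output_tokens: int) -> int:
    """Return how many prompt tokens remain for a given output request.

    Assumption (not confirmed by this page): prompt and output tokens
    are counted against the same context window.
    """
    if not 0 < requested_output_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError(
            f"requested output must be between 1 and {MAX_OUTPUT_TOKENS} tokens"
        )
    return CONTEXT_WINDOW_TOKENS - requested_output_tokens


if __name__ == "__main__":
    # Reserving the full output allowance leaves the rest for the prompt.
    print(max_prompt_tokens(4_000))
```

How an image is converted into tokens is not specified here, so any image cost has to be included in the prompt token count by the caller.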

Pricing

Token Costs (per 1M tokens)

Cache Miss Input

$0.50

Non-Reasoning Output

$1.50
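
As a rough guide to what these rates mean per request, the following sketch applies the per-1M-token prices to a given token count. It assumes every input token is billed at the cache-miss rate and every output token at the non-reasoning rate; the helper name `estimate_cost_usd` is hypothetical, and token counts are supplied by the caller.

```python
# Rates from the pricing listed above, in USD per 1,000,000 tokens.
INPUT_USD_PER_MILLION = 0.50   # cache-miss input
OUTPUT_USD_PER_MILLION = 1.50  # non-reasoning output


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD.

    Assumption: all input tokens are cache misses, so no cached-input
    discount is applied.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 12,000 input tokens and 1,000 output tokens.
    print(f"${estimate_cost_usd(12_000, 1_000):.4f}")  # prints $0.0075
```

Because output is priced at three times the input rate, long responses dominate the bill faster than long prompts do.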

Legacy

Made legacy on

Reason

8B vision model; too small for production

Recommended Replacement

Command A Vision