An instruction-following conversational model that performs language tasks with higher quality, greater reliability, and a longer context than previous models. Best suited for complex workflows such as code generation, retrieval-augmented generation (RAG), tool use, and agents. It has a 128K-token context window and a 4K-token maximum output.
Budget Cohere RAG deployment
Cost-effective tool use workflow
Code generation with Cohere
Context window: 128,000 tokens
Max output: 4,000 tokens
$0.15 per 1M tokens
$0.60 per 1M tokens
$15 per 1K calls
$0.19 per 1K calls
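
As a rough illustration of how the per-token prices translate into a per-request cost, here is a minimal Python sketch. It assumes the $0.15 figure is the input-token rate and the $0.60 figure is the output-token rate; the listing does not label them, so treat that mapping as an assumption. The function name `estimate_cost` is hypothetical, and the two per-call prices are left out because the listing does not say what they cover.

```python
# Figures taken from the listing above.
# ASSUMPTION: $0.15 is the input rate and $0.60 is the output rate.
INPUT_USD_PER_1M_TOKENS = 0.15
OUTPUT_USD_PER_1M_TOKENS = 0.60
MAX_CONTEXT_TOKENS = 128_000
MAX_OUTPUT_TOKENS = 4_000


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the token cost in USD of a single request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Simplification: only the input is checked against the context window.
    if input_tokens > MAX_CONTEXT_TOKENS:
        raise ValueError("input exceeds the 128,000-token context window")
    if output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError("output exceeds the 4,000-token maximum")
    return (
        input_tokens * INPUT_USD_PER_1M_TOKENS
        + output_tokens * OUTPUT_USD_PER_1M_TOKENS
    ) / 1_000_000


if __name__ == "__main__":
    # A large RAG-style prompt with a full-length answer.
    cost = estimate_cost(input_tokens=100_000, output_tokens=4_000)
    print(f"${cost:.4f}")  # prints "$0.0174"
```

Under these assumptions, even a request that nearly fills the context window costs under two cents in token charges, with the input side accounting for most of it.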