A compact 8B multilingual model designed to rival monolingual performance through innovations in instruction tuning with data arbitrage, preference training, and model merging. Serves 23 languages with low latency. Ideal for high-throughput multilingual workloads where cost and speed matter.
Fast multilingual generation
Budget-friendly cross-lingual tasks
High-throughput multilingual processing
8,000 tokens
4,000 tokens
$0.50 per 1M tokens
$1.50 per 1M tokens
$15 per 1K calls
$0.19 per 1K calls