Mercury is a diffusion-based large language model from Inception Labs, designed for ultra-fast inference with sub-second latency. It supports a 128K context window, native tool calling, and structured outputs. Mercury excels at general-purpose reasoning, chat, and agent workflows where speed is paramount.
128,000 tokens
16,384 tokens
$0.25
$1
$15
$0.19