DeepSeek V4 is an efficiency-optimized model with 284B total parameters (13B activated) and a 1M-token context window. Built on a new hybrid attention architecture (Compressed Sparse Attention + Heavily Compressed Attention) that sharply reduces the cost of long contexts, it runs at a fraction of V3.2's inference cost while matching or exceeding its quality. The model switches between fast non-thinking responses and explicit chain-of-thought reasoning with configurable effort (up to "max" for the hardest problems); tool calls are supported in both modes. It is well suited to high-throughput chat, coding assistance, and agent workflows over large documents or codebases.
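As a rough illustration of the mode switching described above, here is a minimal sketch of how a chat-completions-style request might toggle thinking mode and reasoning effort. The field names (`reasoning`, `effort`) and the model slug `deepseek-v4` are assumptions for illustration, not a confirmed API.

```python
import json

def build_request(prompt: str, thinking: bool, effort: str = "medium") -> dict:
    """Build a hypothetical chat-completions payload.

    `thinking=False` requests a fast non-thinking response; `thinking=True`
    enables chain-of-thought with a configurable effort level ("low" .. "max").
    All field names here are illustrative assumptions.
    """
    payload = {
        "model": "deepseek-v4",  # assumed slug
        "messages": [{"role": "user", "content": prompt}],
    }
    if thinking:
        payload["reasoning"] = {"effort": effort}
    return payload

# Hard problem: explicit reasoning at maximum effort.
req = build_request("Prove that sqrt(2) is irrational.", thinking=True, effort="max")
print(json.dumps(req, indent=2))
```

In a real deployment the same endpoint would serve both modes, so routing between quick replies and deep reasoning is a per-request choice rather than a model swap.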