Ling 2.6 Flash

inclusionAI

Ling-2.6-flash is an instant (instruct) model from inclusionAI with 104B total parameters and 7.4B active parameters, designed for real-world agents that require fast responses, strong execution, and high token efficiency.

Capabilities

Tool Use

Technical Specifications

Context Window: 262,144 tokens

Max Output: 32,768 tokens

Pricing

Token Costs (per 1M tokens)

Cache Miss Input: $0

Non-Reasoning Output: $0

Tool Costs (per 1K calls)

Web Search: $15

Code Execution: $0.19
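
As a rough illustration of how the listed rates combine, the sketch below estimates the cost of a workload from token and tool-call counts. The function name `estimate_cost` and its parameters are hypothetical, not part of any provider API. It assumes charges are pro-rated linearly per token and per call, and that all input is billed at the cache-miss rate, since no cache-hit rate is listed here. Actual billing may round or bundle differently.

```python
from decimal import Decimal

# Rates copied from the pricing listed above (USD).
TOKEN_RATES_PER_1M = {
    "input": Decimal("0"),    # cache miss input
    "output": Decimal("0"),   # non-reasoning output
}
TOOL_RATES_PER_1K = {
    "web_search": Decimal("15"),
    "code_execution": Decimal("0.19"),
}


def estimate_cost(input_tokens=0, output_tokens=0,
                  web_searches=0, code_executions=0):
    """Return an estimated cost in USD as a Decimal.

    Assumption: linear pro-rating of the per-1M-token and
    per-1K-call rates; this is a hypothetical helper.
    """
    token_cost = (
        TOKEN_RATES_PER_1M["input"] * input_tokens
        + TOKEN_RATES_PER_1M["output"] * output_tokens
    ) / 1_000_000
    tool_cost = (
        TOOL_RATES_PER_1K["web_search"] * web_searches
        + TOOL_RATES_PER_1K["code_execution"] * code_executions
    ) / 1_000
    return token_cost + tool_cost


if __name__ == "__main__":
    # Example: an agent run with 40 web searches and 100 code executions.
    total = estimate_cost(input_tokens=200_000, output_tokens=30_000,
                          web_searches=40, code_executions=100)
    print(f"Estimated cost: ${total}")
```

With token rates at $0, tool calls account for the entire estimate under these assumptions, and web search dominates at the listed rates.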

Legacy

Made legacy on April 24, 2026

Reason: Untested free tier; smaller variant of the 1T flagship

Recommended Replacement