Gemma 4 26B A4B

Google

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Of its 25.2B total parameters, only 3.8B are active per token during inference, delivering near-31B quality at a fraction of the compute cost. It accepts multimodal input, including text, images, and video (up to 60 seconds at 1 fps). It features a 256K-token context window, native function calling, a configurable thinking/reasoning mode, and structured output support. It is released under the Apache 2.0 license.
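The figures in this description can be checked with a little arithmetic. The following is a minimal sketch that uses only the numbers quoted on this page. The variable names are illustrative, and reading "256K" as 256 × 1024 is an assumption, though it matches the 262,144-token value listed under Technical Specifications.

```python
# Figures quoted in the model description (parameters in billions).
TOTAL_PARAMS_B = 25.2
ACTIVE_PARAMS_B = 3.8

# Share of the model's parameters that are active for each token.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

# Video input: up to 60 seconds sampled at 1 frame per second.
max_video_frames = 60 * 1

# Assumption: "256K" means 256 * 1024 tokens.
context_tokens = 256 * 1024

print(f"Active parameters per token: {active_fraction:.1%}")
print(f"Maximum video frames: {max_video_frames}")
print(f"Context window: {context_tokens:,} tokens")
```

By these figures, roughly 15% of the parameters are active for any given token, which is where the reduced compute cost per token comes from.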

Capabilities

Thinking

Tool Use

Image Input

Technical Specifications

Context Window: 262,144 tokens

Max Output: 262,144 tokens

Pricing

Token Costs (per 1M tokens)

Cache Miss Input: $0.13

Non-Reasoning Output: $0.40

Tool Costs (per 1K calls)

Web Search: $15

Code Execution: $0.19
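To illustrate how the listed prices combine, here is a rough cost estimator. It is a sketch, not an official billing formula: `estimate_cost` is a hypothetical helper, and it assumes every input token is billed at the cache-miss rate and every output token at the non-reasoning rate, since no other rates are listed here.

```python
# Prices copied from the listing above (USD).
INPUT_PER_M = 0.13        # cache-miss input, per 1M tokens
OUTPUT_PER_M = 0.40       # non-reasoning output, per 1M tokens
WEB_SEARCH_PER_K = 15.00  # web search, per 1K calls
CODE_EXEC_PER_K = 0.19    # code execution, per 1K calls


def estimate_cost(input_tokens, output_tokens, web_searches=0, code_executions=0):
    """Return an estimated cost in USD (hypothetical helper, not an official API).

    Assumes all input is cache-miss and all output is non-reasoning.
    """
    token_cost = (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000
    tool_cost = (web_searches * WEB_SEARCH_PER_K + code_executions * CODE_EXEC_PER_K) / 1_000
    return token_cost + tool_cost


if __name__ == "__main__":
    # Example: 200K input tokens, 4K output tokens, and 2 web searches.
    print(f"${estimate_cost(200_000, 4_000, web_searches=2):.4f}")  # prints $0.0576
```

In this example the two web searches ($0.03) cost slightly more than all of the tokens ($0.0276), so tool calls can dominate the bill for search-heavy workloads.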