Gemma 3n 2B

Google

Gemma 3n E2B IT is a multimodal, instruction-tuned model from Google DeepMind, designed to run efficiently at an effective parameter size of 2B while drawing on a larger 6B architecture. Built on the MatFormer architecture, it supports nested submodels and modular composition via the Mix-and-Match framework. Gemma 3n models are optimized for low-resource deployment and deliver strong multilingual and reasoning performance across common benchmarks; the architecture supports a 32K context length, though this listing serves an 8,192-token window (see Technical Specifications below). This variant is trained on a diverse corpus spanning code, math, web, and multimodal data.
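The MatFormer nesting described above can be illustrated with a toy sketch. This is not Gemma's actual implementation; the dimensions and weights below are hypothetical, and the point is only the structural idea: a smaller submodel reuses a prefix slice of the larger model's feed-forward weights, so one set of weights serves multiple model sizes.

```python
import numpy as np

rng = np.random.default_rng(0)
D_MODEL, FFN_LARGE, FFN_SMALL = 8, 32, 16  # hypothetical dimensions

# One shared pair of feed-forward weight matrices for the "parent" model.
W_in = rng.standard_normal((D_MODEL, FFN_LARGE))
W_out = rng.standard_normal((FFN_LARGE, D_MODEL))

def ffn(x, ffn_dim):
    """Run the feed-forward block using only the first `ffn_dim` neurons."""
    h = np.maximum(x @ W_in[:, :ffn_dim], 0.0)  # ReLU over the prefix slice
    return h @ W_out[:ffn_dim, :]

x = rng.standard_normal(D_MODEL)
y_small = ffn(x, FFN_SMALL)  # nested submodel (E2B-like, fewer active params)
y_large = ffn(x, FFN_LARGE)  # full parent model, same weights
# Mix-and-Match: each layer can independently pick a width from the same
# shared weights, composing submodels of intermediate sizes.
```

Both calls produce an output of the same shape, so submodels of different widths are drop-in interchangeable at inference time.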

Try Now

Example Use Cases

Free model for simple tasks

On-device or mobile deployment

Ultra-low-resource tasks

Technical Specifications

Context Window

8,192 tokens

Max Output

2,048 tokens

Cache Miss Cost

$0 per 1M tokens

Non-Reasoning Cost

$0 per 1M tokens

Web Search Cost

$15 per 1K calls

Code Execution Cost

$0.19 per 1K calls
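A minimal sketch of budgeting a request against this listing's limits and per-call tool prices. All constants come from the spec table above; the function names are illustrative, not part of any API.

```python
CONTEXT_WINDOW = 8_192     # tokens, per the table above
MAX_OUTPUT = 2_048         # tokens
WEB_SEARCH_PER_1K = 15.00  # $ per 1K calls
CODE_EXEC_PER_1K = 0.19    # $ per 1K calls; token usage itself costs $0

def fits(prompt_tokens: int, output_tokens: int = MAX_OUTPUT) -> bool:
    """True if the prompt plus requested output fits the context window."""
    return (output_tokens <= MAX_OUTPUT
            and prompt_tokens + output_tokens <= CONTEXT_WINDOW)

def tool_cost(web_searches: int, code_execs: int) -> float:
    """Estimated spend on tool calls; token costs for this model are $0."""
    return ((web_searches / 1_000) * WEB_SEARCH_PER_1K
            + (code_execs / 1_000) * CODE_EXEC_PER_1K)

print(fits(6_500))                        # 6,500 + 2,048 > 8,192 → False
print(f"${tool_cost(2_000, 5_000):.2f}")  # 2×$15 + 5×$0.19 → $30.95
```

Because token costs are $0, total spend for this model is driven entirely by web-search and code-execution call volume.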

⚠️ Legacy

Made legacy on

Reason

Untested

Recommended Replacement

Qwen3.5 Plus