Grok 2 Vision

xAI

xAI's legacy vision model with 32K context supporting text and image inputs with function calling and structured outputs. Superseded by Grok 4.

Try Now

Capabilities

Tool Use

Image Input

Technical Specifications

Context Window

32,768 tokens

Max Output

32,768 tokens

Pricing

Token Costs (per 1M tokens)

Cache Miss Input

$2

Non-Reasoning Output

$10

Cache Read Input

$0

Tool Costs (per 1K calls)

Web Search

$15

Code Execution

$0.19

Image Generation

$70

Retired

Made legacy on

Reason

Outdated model

Recommended Replacement

Grok 4.20 Reasoning

Retired on