GPT OSS 20B

openai

GPT OSS 20B is an open-weight, 21B-parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture that activates 3.6B parameters per forward pass, which lowers inference latency and makes the model deployable on consumer or single-GPU hardware.
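To put the MoE figures in perspective, the sketch below computes the share of parameters that is active per forward pass. It uses only the two figures quoted in the description; the function name `active_fraction` is illustrative and not part of any API.

```python
# Figures taken from the model description (in billions of parameters).
TOTAL_PARAMS_B = 21.0
ACTIVE_PARAMS_B = 3.6


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the fraction of parameters used in a single forward pass."""
    if total_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_b / total_b


if __name__ == "__main__":
    fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    print(f"Active per forward pass: {fraction:.1%}")  # prints 17.1%
```

Roughly one sixth of the weights take part in each forward pass, which is where the latency advantage over a dense model of the same size comes from. All 21B parameters still have to be stored, so memory requirements follow the total count, not the active count.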

Capabilities

Extended Thinking

Tool Use

Example Use Cases

Lower-latency inference tasks

Consumer hardware deployment

Cost-efficient general-purpose tasks

Technical Specifications

Context Window: 131,072 tokens

Max Output: 131,072 tokens
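Because the context window and the maximum output are the same size, a long prompt may leave less room for output than the listed maximum. The sketch below estimates the remaining output budget. It assumes that prompt and output tokens share the 131,072-token context window, which is a common convention but is not stated on this page; `max_output_budget` is a hypothetical helper.

```python
# Limits taken from the technical specifications.
CONTEXT_WINDOW = 131_072
MAX_OUTPUT = 131_072


def max_output_budget(prompt_tokens: int) -> int:
    """Estimate how many output tokens remain for a given prompt size.

    Assumption: prompt and output tokens share one context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # Clamp to [0, MAX_OUTPUT]: never negative, never above the output cap.
    return max(0, min(MAX_OUTPUT, remaining))


if __name__ == "__main__":
    print(max_output_budget(100_000))  # prints 31072
```

Under that assumption, a 100,000-token prompt leaves 31,072 tokens for the response, and a prompt larger than the window leaves none.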

Pricing

Token Costs (per 1M tokens)

Cache Miss Input: $0.03

Non-Reasoning Output: $0.14

Tool Costs (per 1K calls)

Web Search: $15

Code Execution: $0.19
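The listed rates can be combined into a rough per-request cost estimate, sketched below. Only the four prices shown on this page are used. The sketch assumes that every input token is billed at the cache-miss rate and every output token at the non-reasoning rate, since no cached-input or reasoning-output prices are listed here. `estimate_cost` and its parameter names are illustrative, not part of any billing API.

```python
# Prices copied from the pricing section (USD).
INPUT_PER_1M = 0.03       # cache-miss input, per 1M tokens
OUTPUT_PER_1M = 0.14      # non-reasoning output, per 1M tokens
WEB_SEARCH_PER_1K = 15.0  # per 1K web search calls
CODE_EXEC_PER_1K = 0.19   # per 1K code execution calls


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    web_searches: int = 0,
    code_executions: int = 0,
) -> float:
    """Estimate the USD cost of a workload from the listed rates.

    Assumption: all input is cache-miss and all output is non-reasoning.
    """
    token_cost = (
        input_tokens / 1_000_000 * INPUT_PER_1M
        + output_tokens / 1_000_000 * OUTPUT_PER_1M
    )
    tool_cost = (
        web_searches / 1_000 * WEB_SEARCH_PER_1K
        + code_executions / 1_000 * CODE_EXEC_PER_1K
    )
    return token_cost + tool_cost


if __name__ == "__main__":
    # 2M input tokens, 500K output tokens, 100 web searches.
    print(f"${estimate_cost(2_000_000, 500_000, web_searches=100):.2f}")
```

In this example the tokens cost $0.13 (2 × $0.03 + 0.5 × $0.14) while the 100 web searches cost $1.50, so tool calls can dominate the bill even at modest volumes.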

⚠️ Legacy

Made legacy on

Reason: Untested

Recommended Replacement: Qwen3.5 Plus