GPT OSS 120B

openai

GPT OSS 120B is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1B parameters per forward pass and is optimized to run on a single H100 GPU with native MXFP4 quantization. The model supports configurable reasoning depth, full chain-of-thought access, and native tool use, including function calling, browsing, and structured output generation.

Try Now

Capabilities

Extended Thinking

Tool Use

Example Use Cases

High-reasoning production tasks

Agentic workflows with tool use

General-purpose reasoning with configurable depth

Technical Specifications

Context Window

131,072 tokens

Max Output

131,072 tokens

Pricing

Token Costs (per 1M tokens)

Cache Miss Input

$0.039

Non-Reasoning Output

$0.19

Tool Costs (per 1K calls)

Web Search

$15

Code Execution

$0.19

⚠️ Legacy

Made legacy on

Reason

Untested

Recommended Replacement

Qwen3.5 Plus