OpenAI Responses

OpenAI models optimized for structured response formats and agentic coding tasks.

🇺🇸 US • 12 models available

Available Models: 12
Avg Input Price/M: $3.65
Cheapest Model: $0.05/M (openai-responses/gpt-5-nano)
Most Expensive: $20.00/M (openai-responses/o3-pro)
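The per-million-token prices listed on this page translate directly into a per-request cost estimate. A minimal sketch (the example prices are taken from the tables below; the function itself is illustrative, not part of any API):

```python
# Rough per-request cost estimate from per-million-token prices.
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the USD cost of one request."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Example: a 10K-token prompt with a 1K-token reply on gpt-5
# ($1.25/M input, $10.00/M output).
print(request_cost(10_000, 1_000, 1.25, 10.00))  # → 0.0225
```

The same arithmetic shows why the price spread matters: the identical request on o3-pro ($20.00/M in, $80.00/M out) costs $0.28, more than ten times as much.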

Features Overview

Vision Support: 10
Advanced Reasoning: 9
Caching Support: 10
Computer Use: 0

Privacy & Data Policy

Data Retention: Yes (30 days)
Location: 🇺🇸 US

All OpenAI Responses Models

o3-pro

Vision • Reasoning
Context Window: 200K tokens
Max Output: 100K tokens
Input: $20.00/M tokens
Output: $80.00/M tokens

The o-series models are trained with reinforcement learning to perform complex reasoning: they think before they answer, producing a long internal chain of thought before responding to the user. o3-pro is a version of o3 that allocates more compute to think longer, designed to provide reliable answers to hard problems across domains.

gpt-5-pro

Vision • Reasoning
Context Window: 400K tokens
Max Output: 272K tokens
Input: $15.00/M tokens
Output: $120.00/M tokens

GPT-5 Pro is OpenAI's extended-reasoning tier of GPT-5, built to push reliability on hard problems, long tool chains, and agentic workflows. It keeps GPT-5's multimodal skills and very large context (the API page lists up to 400K tokens) while allocating more compute to think longer and plan better, improving code generation, math, and complex writing beyond standard GPT-5 and GPT-5 Thinking. OpenAI positions Pro as the version that "uses extended reasoning for even more comprehensive and accurate answers," targeting high-stakes tasks and enterprise use.

gpt-4.1-mini

Vision • Caching
Context Window: 1.0M tokens
Max Output: 33K tokens
Input: $0.40/M tokens
Output: $1.60/M tokens

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider’s polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints.

gpt-4.1-nano

Vision • Caching
Context Window: 1.0M tokens
Max Output: 33K tokens
Input: $0.10/M tokens
Output: $0.40/M tokens

For tasks that demand low latency, GPT-4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding, even higher than GPT-4o mini. It's ideal for tasks like classification or autocompletion.

gpt-5.1-codex

Vision • Caching • Reasoning
Context Window: 400K tokens
Max Output: 128K tokens
Input: $1.25/M tokens
Output: $10.00/M tokens

GPT-5.1-Codex is a version of GPT-5.1 optimized for agentic coding tasks in Codex or similar environments.

gpt-5-nano

Vision • Caching • Reasoning
Context Window: 400K tokens
Max Output: 128K tokens
Input: $0.05/M tokens
Output: $0.40/M tokens

GPT-5 nano is OpenAI's fastest, cheapest version of GPT-5. It's great for summarization and classification tasks.

o3-mini

Caching • Reasoning
Context Window: 200K tokens
Max Output: 100K tokens
Input: $1.10/M tokens
Output: $4.40/M tokens

o3-mini is OpenAI's small reasoning model, providing high intelligence at the same cost and latency targets as o1-mini. o3-mini also supports key developer features, like Structured Outputs, function calling, and the Batch API. Like other models in the o-series, it is designed to excel at science, math, and coding tasks.
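As a sketch of what the Structured Outputs feature mentioned above looks like in practice, the snippet below builds a request payload following the `json_schema` response format documented for OpenAI's Chat Completions API. The schema itself is an invented example, and nothing is sent over the network:

```python
import json

# Hypothetical Structured Outputs payload for o3-mini. The
# "response_format" shape follows OpenAI's documented json_schema format;
# the event-extraction schema is made up for illustration.
payload = {
    "model": "openai-responses/o3-mini",
    "messages": [
        {"role": "user", "content": "Extract the event name and date."}
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "event",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "date": {"type": "string"},
                },
                "required": ["name", "date"],
                "additionalProperties": False,
            },
        },
    },
}

body = json.dumps(payload)  # serialized request body, ready to POST
print(payload["response_format"]["type"])  # → json_schema
```

With `strict: True`, the model's reply is constrained to match the schema, which is what makes these models suitable for pipelines that parse the output mechanically.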

gpt-5-mini

Vision • Caching • Reasoning
Context Window: 400K tokens
Max Output: 128K tokens
Input: $0.25/M tokens
Output: $2.00/M tokens

GPT-5 mini is a faster, more cost-efficient version of GPT-5. It's great for well-defined tasks and precise prompts.

gpt-5

Vision • Caching • Reasoning
Context Window: 400K tokens
Max Output: 128K tokens
Input: $1.25/M tokens
Output: $10.00/M tokens

GPT-5 is OpenAI's flagship model for coding, reasoning, and agentic tasks across domains.

gpt-4.1

Vision • Caching
Context Window: 1.0M tokens
Max Output: 33K tokens
Input: $2.00/M tokens
Output: $8.00/M tokens

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

gpt-5-codex

Vision • Caching • Reasoning
Context Window: 400K tokens
Max Output: 128K tokens
Input: $1.25/M tokens
Output: $10.00/M tokens

GPT-5-Codex is a version of GPT-5 optimized for agentic coding tasks in Codex or similar environments.

o4-mini

Caching • Reasoning
Context Window: 200K tokens
Max Output: 100K tokens
Input: $1.10/M tokens
Output: $4.40/M tokens

o4-mini is the successor to o3-mini, a small reasoning model providing high intelligence at similar cost and latency targets. It supports key developer features, like Structured Outputs, function calling, and the Batch API. Like other models in the o-series, it is designed to excel at science, math, and coding tasks.

Ready to use OpenAI Responses models?

Access all OpenAI Responses models through Requesty's unified API with intelligent routing, caching, and cost optimization.
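As a minimal sketch of what a request through a unified, OpenAI-compatible endpoint could look like (the base URL is a placeholder, not the real Requesty endpoint, and the `provider/model` routing convention is assumed from the model ids on this page; the request is only constructed, never sent):

```python
import json
import urllib.request

# Placeholder base URL for illustration only; consult the provider's
# documentation for the real endpoint.
BASE_URL = "https://router.example.com/v1"

# Standard OpenAI-compatible chat payload addressed to a
# provider-prefixed model id, as listed on this page.
payload = {
    "model": "openai-responses/gpt-5-mini",
    "messages": [{"role": "user", "content": "Summarize this in one line."}],
}

req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer <YOUR_API_KEY>",  # placeholder key
        "Content-Type": "application/json",
    },
    method="POST",
)

# req is ready for urllib.request.urlopen(req); we stop here so the
# sketch stays runnable without network access or an API key.
print(req.full_url)  # → https://router.example.com/v1/chat/completions
```

Because the payload shape is the standard chat-completions format, switching between the models above is a one-line change to the `model` field.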
