OpenAI Responses

OpenAI models optimized for structured response formats and agentic coding tasks.

🇺🇸 US • 12 models available

Available Models: 12
Avg Input Price/M: $3.65
Cheapest Model: $0.05/M (openai-responses/gpt-5-nano)
Most Expensive: $20.00/M (openai-responses/o3-pro)
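The per-million-token prices listed on this page translate directly into a per-request cost estimate. A minimal sketch (the example prices are taken from the tables below; the function itself is illustrative, not part of any API):

```python
# Rough per-request cost estimate from per-million-token prices.
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the USD cost of one request."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Example: a 10K-token prompt with a 1K-token reply on gpt-5
# ($1.25/M input, $10.00/M output).
print(request_cost(10_000, 1_000, 1.25, 10.00))  # → 0.0225
```

The same arithmetic shows why the price spread matters: the identical request on o3-pro ($20.00/M in, $80.00/M out) costs $0.28, more than ten times as much.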

Features Overview

Vision Support: 10
Advanced Reasoning: 9
Caching Support: 10
Computer Use: 0

Privacy & Data Policy

Data Retention: Yes (30 days)
Location: 🇺🇸 US

All OpenAI Responses Models

o3-pro

Vision • Reasoning
Context Window: 200K tokens
Max Output: 100K tokens
Input: $20.00/M tokens
Output: $80.00/M tokens

The o-series models are trained with reinforcement learning to perform complex reasoning: they think before they answer, producing a long internal chain of thought before responding to the user. o3-pro is a version of o3 that allocates more compute to think longer, designed to provide reliable answers to hard problems across domains.

gpt-5-pro

Vision • Reasoning
Context Window: 400K tokens
Max Output: 272K tokens
Input: $15.00/M tokens
Output: $120.00/M tokens

GPT-5 Pro is OpenAI's extended-reasoning tier of GPT-5, built to push reliability on hard problems, long tool chains, and agentic workflows. It keeps GPT-5's multimodal skills and very large context (the API page lists up to 400K tokens) while allocating more compute to think longer and plan better, improving code generation, math, and complex writing beyond standard GPT-5 and GPT-5 Thinking. OpenAI positions Pro as the version that "uses extended reasoning for even more comprehensive and accurate answers," targeting high-stakes tasks and enterprise use.

gpt-4.1-mini

Vision • Caching
Context Window: 1.0M tokens
Max Output: 33K tokens
Input: $0.40/M tokens
Output: $1.60/M tokens

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider’s polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints.

gpt-4.1-nano

Vision • Caching
Context Window: 1.0M tokens
Max Output: 33K tokens
Input: $0.10/M tokens
Output: $0.40/M tokens

For tasks that demand low latency, GPT-4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding, even higher than GPT-4o mini. It's ideal for tasks like classification or autocompletion.

gpt-5.1-codex

Vision • Caching • Reasoning
Context Window: 400K tokens
Max Output: 128K tokens
Input: $1.25/M tokens
Output: $10.00/M tokens

GPT-5.1-Codex is a version of GPT-5.1 optimized for agentic coding tasks in Codex or similar environments.

gpt-5-nano

Vision • Caching • Reasoning
Context Window: 400K tokens
Max Output: 128K tokens
Input: $0.05/M tokens
Output: $0.40/M tokens

GPT-5 nano is OpenAI's fastest, cheapest version of GPT-5. It's great for summarization and classification tasks.

o3-mini

Caching • Reasoning
Context Window: 200K tokens
Max Output: 100K tokens
Input: $1.10/M tokens
Output: $4.40/M tokens

o3-mini is OpenAI's small reasoning model, providing high intelligence at the same cost and latency targets as o1-mini. o3-mini also supports key developer features, like Structured Outputs, function calling, and the Batch API. Like other models in the o-series, it is designed to excel at science, math, and coding tasks.
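As a sketch of what the Structured Outputs feature mentioned above looks like in practice, the snippet below builds a request payload following the `json_schema` response format documented for OpenAI's Chat Completions API. The schema itself is an invented example, and nothing is sent over the network:

```python
import json

# Hypothetical Structured Outputs payload for o3-mini. The
# "response_format" shape follows OpenAI's documented json_schema format;
# the event-extraction schema is made up for illustration.
payload = {
    "model": "openai-responses/o3-mini",
    "messages": [
        {"role": "user", "content": "Extract the event name and date."}
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "event",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "date": {"type": "string"},
                },
                "required": ["name", "date"],
                "additionalProperties": False,
            },
        },
    },
}

body = json.dumps(payload)  # serialized request body, ready to POST
print(payload["response_format"]["type"])  # → json_schema
```

With `strict: True`, the model's reply is constrained to match the schema, which is what makes these models suitable for pipelines that parse the output mechanically.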

gpt-5-mini

Vision • Caching • Reasoning
Context Window: 400K tokens
Max Output: 128K tokens
Input: $0.25/M tokens
Output: $2.00/M tokens

GPT-5 mini is a faster, more cost-efficient version of GPT-5. It's great for well-defined tasks and precise prompts.

gpt-5

Vision • Caching • Reasoning
Context Window: 400K tokens
Max Output: 128K tokens
Input: $1.25/M tokens
Output: $10.00/M tokens

GPT-5 is OpenAI's flagship model for coding, reasoning, and agentic tasks across domains.

gpt-4.1

Vision • Caching
Context Window: 1.0M tokens
Max Output: 33K tokens
Input: $2.00/M tokens
Output: $8.00/M tokens

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

gpt-5-codex

Vision • Caching • Reasoning
Context Window: 400K tokens
Max Output: 128K tokens
Input: $1.25/M tokens
Output: $10.00/M tokens

GPT-5-Codex is a version of GPT-5 optimized for agentic coding tasks in Codex or similar environments.

o4-mini

Caching • Reasoning
Context Window: 200K tokens
Max Output: 100K tokens
Input: $1.10/M tokens
Output: $4.40/M tokens

o4-mini is the successor to o3-mini, a small reasoning model providing high intelligence at similar cost and latency targets. It supports key developer features, like Structured Outputs, function calling, and the Batch API. Like other models in the o-series, it is designed to excel at science, math, and coding tasks.

Ready to use OpenAI Responses models?

Access all OpenAI Responses models through Requesty's unified API with intelligent routing, caching, and cost optimization.
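As a minimal sketch of what a request through a unified, OpenAI-compatible endpoint could look like (the base URL is a placeholder, not the real Requesty endpoint, and the `provider/model` routing convention is assumed from the model ids on this page; the request is only constructed, never sent):

```python
import json
import urllib.request

# Placeholder base URL for illustration only; consult the provider's
# documentation for the real endpoint.
BASE_URL = "https://router.example.com/v1"

# Standard OpenAI-compatible chat payload addressed to a
# provider-prefixed model id, as listed on this page.
payload = {
    "model": "openai-responses/gpt-5-mini",
    "messages": [{"role": "user", "content": "Summarize this in one line."}],
}

req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer <YOUR_API_KEY>",  # placeholder key
        "Content-Type": "application/json",
    },
    method="POST",
)

# req is ready for urllib.request.urlopen(req); we stop here so the
# sketch stays runnable without network access or an API key.
print(req.full_url)  # → https://router.example.com/v1/chat/completions
```

Because the payload shape is the standard chat-completions format, switching between the models above is a one-line change to the `model` field.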
