Microsoft Azure AI AI Models - Pricing & Features

Microsoft Azure AI

gpt-5 (uksouth)

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$1.25/M tokens

Output

$10.00/M tokens

View Details →

Microsoft Azure AI

gpt-4.1-mini (westus3)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$0.4/M tokens

Output

$1.60/M tokens

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider’s polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints.

View Details →

Microsoft Azure AI

gpt-4.1-mini (francecentral)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$0.4/M tokens

Output

$1.60/M tokens

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider’s polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints.

View Details →

Microsoft Azure AI

gpt-4.1 (eastus2)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$2.00/M tokens

Output

$8.00/M tokens

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

View Details →

Microsoft Azure AI

gpt-4.1 (uksouth)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$2.00/M tokens

Output

$8.00/M tokens

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

View Details →

Microsoft Azure AI

gpt-5-mini

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$0.25/M tokens

Output

$2.00/M tokens

View Details →

Microsoft Azure AI

openai-responses/gpt-4.1-nano

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$0.1/M tokens

Output

$0.4/M tokens

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding – even higher than GPT‑4o mini. It’s ideal for tasks like classification or autocompletion.

View Details →

Microsoft Azure AI

gpt-4.1 (swedencentral)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$2.00/M tokens

Output

$8.00/M tokens

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

View Details →

Microsoft Azure AI

gpt-4.1-mini (eastus2)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$0.4/M tokens

Output

$1.60/M tokens

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider’s polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints.

View Details →

Microsoft Azure AI

o4-mini (swedencentral)

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$1.10/M tokens

Output

$4.40/M tokens

o3-mini is OpenAI's most recent small reasoning model, providing high intelligence at the same cost and latency targets of o1-mini. o3-mini also supports key developer features, like Structured Outputs, function calling, Batch API, and more. Like other models in the o-series, it is designed to excel at science, math, and coding tasks.

View Details →

Microsoft Azure AI

gpt-4.1

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$2.00/M tokens

Output

$8.00/M tokens

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

View Details →

Microsoft Azure AI

gpt-5 (swedencentral)

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$1.25/M tokens

Output

$10.00/M tokens

View Details →

Microsoft Azure AI

o4-mini (eastus2)

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$1.10/M tokens

Output

$4.40/M tokens

o3-mini is OpenAI's most recent small reasoning model, providing high intelligence at the same cost and latency targets of o1-mini. o3-mini also supports key developer features, like Structured Outputs, function calling, Batch API, and more. Like other models in the o-series, it is designed to excel at science, math, and coding tasks.

View Details →

Microsoft Azure AI

gpt-5-mini (eastus2)

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$0.25/M tokens

Output

$2.00/M tokens

View Details →

Microsoft Azure AI

openai-responses/gpt-4.1-mini

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$0.4/M tokens

Output

$1.60/M tokens

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider’s polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints.

View Details →

Microsoft Azure AI

gpt-5.1 (francecentral)

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$1.25/M tokens

Output

$10.00/M tokens

View Details →

Microsoft Azure AI

gpt-5

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$1.25/M tokens

Output

$10.00/M tokens

View Details →

Microsoft Azure AI

openai-responses/gpt-5.2-codex (eastus2)

Vision

Caching

Reasoning

Context Window

400K tokens

Max Output

128K tokens

Input

$1.75/M tokens

Output

$14.00/M tokens

OpenAI's most intelligent coding model optimized for long-horizon, agentic coding tasks.

View Details →

Microsoft Azure AI

gpt-5.2 (eastus2)

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$1.75/M tokens

Output

$14.00/M tokens

View Details →

Microsoft Azure AI

openai-responses/gpt-4.1 (francecentral)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$2.00/M tokens

Output

$8.00/M tokens

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

View Details →

Microsoft Azure AI

gpt-4.1-nano

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$0.1/M tokens

Output

$0.4/M tokens

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding – even higher than GPT‑4o mini. It’s ideal for tasks like classification or autocompletion.

View Details →

Microsoft Azure AI

gpt-5-mini (swedencentral)

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$0.25/M tokens

Output

$2.00/M tokens

View Details →

Microsoft Azure AI

gpt-5 (francecentral)

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$1.25/M tokens

Output

$10.00/M tokens

View Details →

Microsoft Azure AI

openai-responses/gpt-5.2-codex

Vision

Caching

Reasoning

Context Window

400K tokens

Max Output

128K tokens

Input

$1.75/M tokens

Output

$14.00/M tokens

OpenAI's most intelligent coding model optimized for long-horizon, agentic coding tasks.

View Details →

Microsoft Azure AI

o4-mini (westus3)

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$1.10/M tokens

Output

$4.40/M tokens

o3-mini is OpenAI's most recent small reasoning model, providing high intelligence at the same cost and latency targets of o1-mini. o3-mini also supports key developer features, like Structured Outputs, function calling, Batch API, and more. Like other models in the o-series, it is designed to excel at science, math, and coding tasks.

View Details →

Microsoft Azure AI

openai-responses/gpt-4.1 (swedencentral)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$2.00/M tokens

Output

$8.00/M tokens

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

View Details →

Microsoft Azure AI

gpt-5-nano (swedencentral)

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$0.05/M tokens

Output

$0.4/M tokens

View Details →

Microsoft Azure AI

openai-responses/gpt-4.1-mini (francecentral)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$0.4/M tokens

Output

$1.60/M tokens

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider’s polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints.

View Details →

Microsoft Azure AI

o4-mini

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$1.10/M tokens

Output

$4.40/M tokens

o3-mini is OpenAI's most recent small reasoning model, providing high intelligence at the same cost and latency targets of o1-mini. o3-mini also supports key developer features, like Structured Outputs, function calling, Batch API, and more. Like other models in the o-series, it is designed to excel at science, math, and coding tasks.

View Details →

Microsoft Azure AI

gpt-5.2

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$1.75/M tokens

Output

$14.00/M tokens

View Details →

Microsoft Azure AI

openai-responses/gpt-4.1

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$2.00/M tokens

Output

$8.00/M tokens

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

View Details →

Microsoft Azure AI

openai-responses/gpt-4.1-nano (eastus2)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$0.1/M tokens

Output

$0.4/M tokens

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding – even higher than GPT‑4o mini. It’s ideal for tasks like classification or autocompletion.

View Details →

Microsoft Azure AI

openai-responses/gpt-4.1-nano (westus3)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$0.1/M tokens

Output

$0.4/M tokens

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding – even higher than GPT‑4o mini. It’s ideal for tasks like classification or autocompletion.

View Details →

Microsoft Azure AI

openai-responses/gpt-4.1-mini (eastus2)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$0.4/M tokens

Output

$1.60/M tokens

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider’s polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints.

View Details →

Microsoft Azure AI

openai-responses/gpt-4.1-mini (swedencentral)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$0.4/M tokens

Output

$1.60/M tokens

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider’s polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints.

View Details →

Microsoft Azure AI

o4-mini (francecentral)

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$1.10/M tokens

Output

$4.40/M tokens

o3-mini is OpenAI's most recent small reasoning model, providing high intelligence at the same cost and latency targets of o1-mini. o3-mini also supports key developer features, like Structured Outputs, function calling, Batch API, and more. Like other models in the o-series, it is designed to excel at science, math, and coding tasks.

View Details →

Microsoft Azure AI

gpt-5.1 (eastus2)

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$1.25/M tokens

Output

$10.00/M tokens

View Details →

Microsoft Azure AI

openai-responses/gpt-4.1 (eastus2)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$2.00/M tokens

Output

$8.00/M tokens

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

View Details →

Microsoft Azure AI

openai-responses/gpt-4.1-nano (swedencentral)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$0.1/M tokens

Output

$0.4/M tokens

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding – even higher than GPT‑4o mini. It’s ideal for tasks like classification or autocompletion.

View Details →

Microsoft Azure AI

gpt-5-mini (uksouth)

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$0.25/M tokens

Output

$2.00/M tokens

View Details →

Microsoft Azure AI

openai-responses/gpt-4.1-mini (westus3)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$0.4/M tokens

Output

$1.60/M tokens

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider’s polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints.

View Details →

Microsoft Azure AI

gpt-5-mini (francecentral)

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$0.25/M tokens

Output

$2.00/M tokens

View Details →

Microsoft Azure AI

openai-responses/gpt-4.1-nano (francecentral)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$0.1/M tokens

Output

$0.4/M tokens

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding – even higher than GPT‑4o mini. It’s ideal for tasks like classification or autocompletion.

View Details →

Microsoft Azure AI

gpt-4.1-nano (westus3)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$0.1/M tokens

Output

$0.4/M tokens

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding – even higher than GPT‑4o mini. It’s ideal for tasks like classification or autocompletion.

View Details →

Microsoft Azure AI

gpt-4.1-nano (swedencentral)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$0.1/M tokens

Output

$0.4/M tokens

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding – even higher than GPT‑4o mini. It’s ideal for tasks like classification or autocompletion.

View Details →

Microsoft Azure AI

gpt-5-nano (eastus2)

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$0.05/M tokens

Output

$0.4/M tokens

View Details →

Microsoft Azure AI

gpt-5-nano (francecentral)

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$0.05/M tokens

Output

$0.4/M tokens

View Details →

Microsoft Azure AI

gpt-4.1-nano (eastus2)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$0.1/M tokens

Output

$0.4/M tokens

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding – even higher than GPT‑4o mini. It’s ideal for tasks like classification or autocompletion.

View Details →

Microsoft Azure AI

gpt-4.1-nano (francecentral)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$0.1/M tokens

Output

$0.4/M tokens

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding – even higher than GPT‑4o mini. It’s ideal for tasks like classification or autocompletion.

View Details →

Microsoft Azure AI

gpt-4.1 (francecentral)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$2.00/M tokens

Output

$8.00/M tokens

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

View Details →

Microsoft Azure AI

gpt-5 (eastus2)

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$1.25/M tokens

Output

$10.00/M tokens

View Details →

Microsoft Azure AI

gpt-5-nano

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$0.05/M tokens

Output

$0.4/M tokens

View Details →

Microsoft Azure AI

gpt-4.1-mini

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$0.4/M tokens

Output

$1.60/M tokens

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider’s polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints.

View Details →

Microsoft Azure AI

gpt-4.1-mini (swedencentral)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$0.4/M tokens

Output

$1.60/M tokens

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider’s polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints.

View Details →

Microsoft Azure AI

o4-mini (uksouth)

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$1.10/M tokens

Output

$4.40/M tokens

o3-mini is OpenAI's most recent small reasoning model, providing high intelligence at the same cost and latency targets of o1-mini. o3-mini also supports key developer features, like Structured Outputs, function calling, Batch API, and more. Like other models in the o-series, it is designed to excel at science, math, and coding tasks.

View Details →

Microsoft Azure AI

gpt-5.1

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$1.25/M tokens

Output

$10.00/M tokens

View Details →

Microsoft Azure AI

gpt-4.1-mini (uksouth)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$0.4/M tokens

Output

$1.60/M tokens

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider’s polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints.

View Details →

Microsoft Azure AI

gpt-4.1-nano (uksouth)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$0.1/M tokens

Output

$0.4/M tokens

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding – even higher than GPT‑4o mini. It’s ideal for tasks like classification or autocompletion.

View Details →

Microsoft Azure AI

gpt-4.1 (westus3)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$2.00/M tokens

Output

$8.00/M tokens

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

View Details →

Microsoft Azure AI

gpt-5.1 (swedencentral)

Caching

Reasoning

Context Window

200K tokens

Max Output

100K tokens

Input

$1.25/M tokens

Output

$10.00/M tokens

View Details →

Microsoft Azure AI

openai-responses/gpt-4.1 (westus3)

Vision

Caching

Context Window

1.0M tokens

Max Output

33K tokens

Input

$2.00/M tokens

Output

$8.00/M tokens

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

View Details →

Microsoft Azure AI

Features Overview

Privacy & Data Policy

Data Retention

Location

Privacy Policy

All Microsoft Azure AI Models

gpt-5 (uksouth)

gpt-4.1-mini (westus3)

gpt-4.1-mini (francecentral)

gpt-4.1 (eastus2)

gpt-4.1 (uksouth)

gpt-5-mini

openai-responses/gpt-4.1-nano

gpt-4.1 (swedencentral)

gpt-4.1-mini (eastus2)

o4-mini (swedencentral)

gpt-4.1

gpt-5 (swedencentral)

o4-mini (eastus2)

gpt-5-mini (eastus2)

openai-responses/gpt-4.1-mini

gpt-5.1 (francecentral)

gpt-5

openai-responses/gpt-5.2-codex (eastus2)

gpt-5.2 (eastus2)

openai-responses/gpt-4.1 (francecentral)

gpt-4.1-nano

gpt-5-mini (swedencentral)

gpt-5 (francecentral)

openai-responses/gpt-5.2-codex

o4-mini (westus3)

openai-responses/gpt-4.1 (swedencentral)

gpt-5-nano (swedencentral)

openai-responses/gpt-4.1-mini (francecentral)

o4-mini

gpt-5.2

openai-responses/gpt-4.1

openai-responses/gpt-4.1-nano (eastus2)

openai-responses/gpt-4.1-nano (westus3)

openai-responses/gpt-4.1-mini (eastus2)

openai-responses/gpt-4.1-mini (swedencentral)

o4-mini (francecentral)

gpt-5.1 (eastus2)

openai-responses/gpt-4.1 (eastus2)

openai-responses/gpt-4.1-nano (swedencentral)

gpt-5-mini (uksouth)

openai-responses/gpt-4.1-mini (westus3)

gpt-5-mini (francecentral)

openai-responses/gpt-4.1-nano (francecentral)

gpt-4.1-nano (westus3)

gpt-4.1-nano (swedencentral)

gpt-5-nano (eastus2)

gpt-5-nano (francecentral)

gpt-4.1-nano (eastus2)

gpt-4.1-nano (francecentral)

gpt-4.1 (francecentral)

gpt-5 (eastus2)

gpt-5-nano

gpt-4.1-mini

gpt-4.1-mini (swedencentral)

o4-mini (uksouth)

gpt-5.1

gpt-4.1-mini (uksouth)

gpt-4.1-nano (uksouth)

gpt-4.1 (westus3)

gpt-5.1 (swedencentral)

openai-responses/gpt-4.1 (westus3)

Ready to use Microsoft Azure AI models?