Microsoft Azure AI

Microsoft's enterprise AI services on Azure cloud platform.

📍 🇺🇸 US / 🇪🇺 EU models available

28
Available Models
$0.95
Avg Input Price/M
$0.10
Cheapest Model
azure/gpt-4.1-nano@westus3
$2.00
Most Expensive
azure/gpt-4.1@uksouth
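All prices in this listing are quoted in USD per one million tokens, billed separately for input and output. A minimal sketch of that arithmetic (the function is illustrative, not a Requesty or Azure API):

```python
# Sketch of how per-million-token pricing translates into request cost.
# Prices are taken from the listing below (USD per 1M tokens).

def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the USD cost of a single request."""
    return (input_tokens * input_price_per_m +
            output_tokens * output_price_per_m) / 1_000_000

# gpt-4.1 at $2.00 input / $8.00 output per 1M tokens,
# for a 10K-token prompt and a 2K-token completion:
cost = request_cost(10_000, 2_000, 2.00, 8.00)
print(f"${cost:.4f}")  # → $0.0360
```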

Features Overview

18
Vision Support
10
Advanced Reasoning
28
Caching Support
0
Computer Use

Privacy & Data Policy

Data Retention

No data retention

Location

🇺🇸 US / 🇪🇺 EU

All Microsoft Azure AI Models

Microsoft Azure AI

o4-mini (westus3)

Caching
Reasoning
Context Window
200K tokens
Max Output
100K tokens
Input
$1.10/M tokens
Output
$4.40/M tokens

o4-mini is OpenAI's latest small reasoning model, delivering high intelligence at cost and latency targets similar to o3-mini. It supports key developer features such as Structured Outputs, function calling, and the Batch API. Like other models in the o-series, it is designed to excel at science, math, and coding tasks.
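OpenAI-style reasoning models typically bill hidden reasoning tokens at the output rate, so the effective cost per request can be well above what the visible answer alone suggests. A rough estimate under that assumption, using the o4-mini rates listed above:

```python
# Rough cost estimate for a reasoning model such as o4-mini, under the
# common billing rule that hidden reasoning tokens are charged at the
# output rate. Prices from the listing: $1.10 in / $4.40 out per 1M tokens.

def reasoning_request_cost(input_tokens: int, visible_output_tokens: int,
                           reasoning_tokens: int,
                           in_price: float = 1.10,
                           out_price: float = 4.40) -> float:
    """Return estimated USD cost, counting reasoning tokens as output."""
    billable_output = visible_output_tokens + reasoning_tokens
    return (input_tokens * in_price + billable_output * out_price) / 1_000_000

# 5K-token prompt, 1K-token visible answer, 8K reasoning tokens:
print(round(reasoning_request_cost(5_000, 1_000, 8_000), 4))  # → 0.0451
```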

Microsoft Azure AI

o4-mini (francecentral)

Caching
Reasoning
Context Window
200K tokens
Max Output
100K tokens
Input
$1.10/M tokens
Output
$4.40/M tokens

o4-mini is OpenAI's latest small reasoning model, delivering high intelligence at cost and latency targets similar to o3-mini. It supports key developer features such as Structured Outputs, function calling, and the Batch API. Like other models in the o-series, it is designed to excel at science, math, and coding tasks.

Microsoft Azure AI

gpt-4.1 (uksouth)

Vision
Caching
Context Window
1.0M tokens
Max Output
33K tokens
Input
$2.00/M tokens
Output
$8.00/M tokens

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.
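A 1 million token window fits very large documents, but it is worth sanity-checking inputs before sending them. A quick fit check, assuming the rough English heuristic of ~4 characters per token (use a real tokenizer for anything billing-sensitive):

```python
# Quick check of whether a document plausibly fits gpt-4.1's 1M-token
# context window. The ~4-characters-per-token ratio is a rough English
# heuristic, not an exact tokenizer.

CONTEXT_WINDOW = 1_000_000  # tokens, per the listing

def fits_in_context(text: str, reserved_output_tokens: int = 33_000) -> bool:
    """Estimate whether `text` plus reserved output fits the window."""
    estimated_tokens = len(text) // 4
    return estimated_tokens + reserved_output_tokens <= CONTEXT_WINDOW

print(fits_in_context("x" * 2_000_000))  # ~500K estimated tokens → True
```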

Microsoft Azure AI

gpt-4.1-nano (westus3)

Vision
Caching
Context Window
1.0M tokens
Max Output
33K tokens
Input
$0.10/M tokens
Output
$0.40/M tokens

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding – even higher than GPT‑4o mini. It’s ideal for tasks like classification or autocompletion.

Microsoft Azure AI

gpt-4.1-mini (westus3)

Vision
Caching
Context Window
1.0M tokens
Max Output
33K tokens
Input
$0.40/M tokens
Output
$1.60/M tokens

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider’s polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints.

Microsoft Azure AI

gpt-4.1 (francecentral)

Vision
Caching
Context Window
1.0M tokens
Max Output
33K tokens
Input
$2.00/M tokens
Output
$8.00/M tokens

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

Microsoft Azure AI

gpt-4.1-nano (uksouth)

Vision
Caching
Context Window
1.0M tokens
Max Output
33K tokens
Input
$0.10/M tokens
Output
$0.40/M tokens

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding – even higher than GPT‑4o mini. It’s ideal for tasks like classification or autocompletion.

Microsoft Azure AI

gpt-4.1 (eastus2)

Vision
Caching
Context Window
1.0M tokens
Max Output
33K tokens
Input
$2.00/M tokens
Output
$8.00/M tokens

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

Microsoft Azure AI

gpt-5 (eastus2)

Caching
Reasoning
Context Window
200K tokens
Max Output
100K tokens
Input
$1.25/M tokens
Output
$10.00/M tokens

Microsoft Azure AI

gpt-4.1-mini

Vision
Caching
Context Window
1.0M tokens
Max Output
33K tokens
Input
$0.40/M tokens
Output
$1.60/M tokens

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider’s polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints.

Microsoft Azure AI

o4-mini (eastus2)

Caching
Reasoning
Context Window
200K tokens
Max Output
100K tokens
Input
$1.10/M tokens
Output
$4.40/M tokens

o4-mini is OpenAI's latest small reasoning model, delivering high intelligence at cost and latency targets similar to o3-mini. It supports key developer features such as Structured Outputs, function calling, and the Batch API. Like other models in the o-series, it is designed to excel at science, math, and coding tasks.

Microsoft Azure AI

gpt-4.1

Vision
Caching
Context Window
1.0M tokens
Max Output
33K tokens
Input
$2.00/M tokens
Output
$8.00/M tokens

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

Microsoft Azure AI

gpt-4.1-nano (eastus2)

Vision
Caching
Context Window
1.0M tokens
Max Output
33K tokens
Input
$0.10/M tokens
Output
$0.40/M tokens

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding – even higher than GPT‑4o mini. It’s ideal for tasks like classification or autocompletion.

Microsoft Azure AI

gpt-5

Caching
Reasoning
Context Window
200K tokens
Max Output
100K tokens
Input
$1.25/M tokens
Output
$10.00/M tokens

Microsoft Azure AI

gpt-5 (uksouth)

Caching
Reasoning
Context Window
200K tokens
Max Output
100K tokens
Input
$1.25/M tokens
Output
$10.00/M tokens

Microsoft Azure AI

gpt-4.1-mini (eastus2)

Vision
Caching
Context Window
1.0M tokens
Max Output
33K tokens
Input
$0.40/M tokens
Output
$1.60/M tokens

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider’s polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints.

Microsoft Azure AI

gpt-4.1-mini (uksouth)

Vision
Caching
Context Window
1.0M tokens
Max Output
33K tokens
Input
$0.40/M tokens
Output
$1.60/M tokens

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider’s polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints.

Microsoft Azure AI

gpt-4.1 (westus3)

Vision
Caching
Context Window
1.0M tokens
Max Output
33K tokens
Input
$2.00/M tokens
Output
$8.00/M tokens

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

Microsoft Azure AI

o4-mini (swedencentral)

Caching
Reasoning
Context Window
200K tokens
Max Output
100K tokens
Input
$1.10/M tokens
Output
$4.40/M tokens

o4-mini is OpenAI's latest small reasoning model, delivering high intelligence at cost and latency targets similar to o3-mini. It supports key developer features such as Structured Outputs, function calling, and the Batch API. Like other models in the o-series, it is designed to excel at science, math, and coding tasks.

Microsoft Azure AI

o4-mini (uksouth)

Caching
Reasoning
Context Window
200K tokens
Max Output
100K tokens
Input
$1.10/M tokens
Output
$4.40/M tokens

o4-mini is OpenAI's latest small reasoning model, delivering high intelligence at cost and latency targets similar to o3-mini. It supports key developer features such as Structured Outputs, function calling, and the Batch API. Like other models in the o-series, it is designed to excel at science, math, and coding tasks.

Microsoft Azure AI

gpt-4.1 (swedencentral)

Vision
Caching
Context Window
1.0M tokens
Max Output
33K tokens
Input
$2.00/M tokens
Output
$8.00/M tokens

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

Microsoft Azure AI

gpt-4.1-nano

Vision
Caching
Context Window
1.0M tokens
Max Output
33K tokens
Input
$0.10/M tokens
Output
$0.40/M tokens

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding – even higher than GPT‑4o mini. It’s ideal for tasks like classification or autocompletion.

Microsoft Azure AI

gpt-5 (swedencentral)

Caching
Reasoning
Context Window
200K tokens
Max Output
100K tokens
Input
$1.25/M tokens
Output
$10.00/M tokens

Microsoft Azure AI

o4-mini

Caching
Reasoning
Context Window
200K tokens
Max Output
100K tokens
Input
$1.10/M tokens
Output
$4.40/M tokens

o4-mini is OpenAI's latest small reasoning model, delivering high intelligence at cost and latency targets similar to o3-mini. It supports key developer features such as Structured Outputs, function calling, and the Batch API. Like other models in the o-series, it is designed to excel at science, math, and coding tasks.

Ready to use Microsoft Azure AI models?

Access all Microsoft Azure AI models through Requesty's unified API with intelligent routing, caching, and cost optimization.
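Model identifiers in this listing follow a provider/model@region pattern (e.g. azure/gpt-4.1-nano@westus3). A small parser for that pattern, inferred from the IDs shown here (other providers' IDs may differ):

```python
# Parse catalog model identifiers of the form provider/model@region,
# e.g. "azure/gpt-4.1-nano@westus3". The format is inferred from the
# IDs shown in this listing.

def parse_model_id(model_id: str) -> dict:
    """Split a provider/model@region identifier into its parts."""
    provider, _, rest = model_id.partition("/")
    model, _, region = rest.partition("@")
    return {"provider": provider, "model": model, "region": region or None}

print(parse_model_id("azure/gpt-4.1@uksouth"))
# → {'provider': 'azure', 'model': 'gpt-4.1', 'region': 'uksouth'}
```

The region suffix is optional: an ID like azure/gpt-5 yields `region=None`.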

Microsoft Azure AI Models - Pricing & Features | Requesty