Which LLM provider is most reliable in production?

In April 2026 the top seven providers on the Requesty gateway (OpenAI, Anthropic, Vertex (Gemini), Bedrock, DeepSeek, Novita, xAI) sat at a 95-99% success rate, with xAI highest at 99.3%. Mistral trailed at 86%, Vertex (Claude) at 84%, Azure at 78%, and Moonshot at 6%, a clear reliability outlier. Streaming adoption is bimodal: Azure 68%, Anthropic 57%, everyone else under 30%. Provider success rate translates directly into user-visible failures unless an application has a managed fallback chain. The 95-99% top tier is comfortably reliable; with Azure failing roughly 1 in 5 calls and Vertex (Claude) roughly 1 in 6, applications on those providers need either a routing policy or active provider switching at the application layer to avoid sustained user pain.

Operational metrics per provider, April 2026

Name: Operational metrics per provider, April 2026
Creator: Requesty
License: https://creativecommons.org/licenses/by/4.0/
Keywords: Reliability and ops, LLM, gateway, provider, metrics, Which LLM provider is most reliable in production?, What is the success rate of OpenAI vs Anthropic vs Vertex?, Why do some LLM providers fail more often than others?, How widely is streaming adopted across LLM providers?

How reliable is each LLM provider in production? In April 2026 the top seven providers on the Requesty gateway (OpenAI, Anthropic, Vertex (Gemini), Bedrock, DeepSeek, Novita, xAI) sat at a 95-99% success rate. Mistral trailed at 86%, Vertex (Claude) at 84%, Azure at 78%, and Moonshot at 6%, a clear reliability outlier. Streaming adoption is bimodal too: Azure 68%, Anthropic 57%, everyone else under 30%.

Why it matters

Provider success rate translates directly into user-visible failures unless an application has a managed fallback chain. The 95-99% top tier is comfortably reliable; with Azure failing roughly 1 in 5 calls and Vertex (Claude) roughly 1 in 6, applications on those providers need either a routing policy or active provider switching at the application layer to avoid sustained user pain.
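A fallback chain of the kind mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not Requesty's implementation: `call_with_fallback`, `chain_failure_rate`, and the provider callables are hypothetical names, and the failure-rate estimate assumes provider failures are independent, which real outages may violate.

```python
from typing import Callable, List, Sequence, Tuple

class AllProvidersFailed(Exception):
    """Raised when every provider in the chain has failed."""

def call_with_fallback(
    chain: Sequence[Tuple[str, Callable[[str], str]]], prompt: str
) -> Tuple[str, str]:
    """Try each (name, call) pair in order; return the first success."""
    errors: List[str] = []
    for name, call in chain:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real client would catch narrower, retryable errors
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))

def chain_failure_rate(success_rates: Sequence[float]) -> float:
    """Probability that every provider fails, ASSUMING independent failures."""
    failure = 1.0
    for rate in success_rates:
        failure *= 1.0 - rate
    return failure

# Azure (78% success) backed by OpenAI (98% success), rates from the April 2026 data.
combined = chain_failure_rate([0.78, 0.98])
print(f"combined failure rate: {combined:.2%}")
```

Under the independence assumption, backing Azure with OpenAI cuts the user-visible failure rate from 22% to roughly 0.44% (0.22 × 0.02). Correlated failures, such as a shared upstream model outage, would make the real figure higher.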

Period

Apr 2026

Updated

May 9, 2026

ID

ops-metrics-april-2026

§ 01

Key findings

01 Success is bimodal: top tier at 95 to 99%, Mistral 86%, Vertex (Claude) 84%, Azure 78%, Moonshot 6%.
02 Streaming adoption is bimodal: Azure 68% and Anthropic 57%. Vertex (Claude) at 28%. Everyone else <10%.
03 Within the Claude model family, cache hit rate ranges from 77.5% on Anthropic direct to 23.5% on Vertex (Claude), a roughly 3x spread.

§ 02

Data

Provider	Success rate (%)	Streaming (%)	Cache hit (%)
xAI	99.30%	1.30%	35.70%
DeepSeek	98.30%	2.80%	48.30%
OpenAI	98.00%	7.20%	36.40%
Novita	97.20%	2.30%	31.90%
Anthropic	96.00%	56.90%	77.50%
Vertex (Gemini)	95.90%	3.70%	9.60%
Bedrock	95.60%	9.70%	56.90%
Mistral	86.30%	8.00%	4.10%
Vertex (Claude)	84.40%	27.60%	23.50%
Azure	78.00%	68.30%	41.00%
Moonshot	6.20%	4.80%	88.20%
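The success-rate column can be turned into "1 in N calls fails" odds with simple arithmetic. The sketch below copies the success rates from the table; the helper names `one_in_n` and `top_tier` and the 95% tier threshold are illustrative choices, not part of the dataset.

```python
# Success rates (percent) copied from the table above.
SUCCESS_RATE = {
    "xAI": 99.3, "DeepSeek": 98.3, "OpenAI": 98.0, "Novita": 97.2,
    "Anthropic": 96.0, "Vertex (Gemini)": 95.9, "Bedrock": 95.6,
    "Mistral": 86.3, "Vertex (Claude)": 84.4, "Azure": 78.0, "Moonshot": 6.2,
}

def one_in_n(success_pct: float) -> float:
    """Return N such that roughly 1 in N calls fails."""
    return 100.0 / (100.0 - success_pct)

def top_tier(rates: dict, threshold: float = 95.0) -> list:
    """Providers at or above the threshold (95% is an assumed cut-off)."""
    return sorted(name for name, pct in rates.items() if pct >= threshold)

for name, pct in SUCCESS_RATE.items():
    print(f"{name}: fails about 1 in {one_in_n(pct):.1f} calls")

print(len(top_tier(SUCCESS_RATE)), "providers in the 95-99% tier")
```

With these figures Azure fails about 1 in 4.5 calls and Vertex (Claude) about 1 in 6.4, and seven providers clear the 95% threshold.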

§ 03

Cited in

What the gateway saw in April 2026 (/blog/provider-trends-april-2026-agentic-share-latency)