Requesty

Prompt-cache hit rate per provider, April 2026

Cache hit rate is cached_tokens divided by input_tokens. A higher rate means cheaper and faster requests.
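As a minimal sketch of the metric, the snippet below computes a token-weighted hit rate from per-request usage records. The `Usage` record and the `cache_hit_rate` function are hypothetical names for illustration. The sketch also assumes that `cached_tokens` is a subset of `input_tokens` and that the rate is aggregated over tokens rather than averaged per request. The source does not state either point.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Hypothetical per-request usage record (field names follow the metric)."""
    provider: str
    input_tokens: int   # all prompt tokens, assumed to include cached ones
    cached_tokens: int  # prompt tokens served from the provider's cache


def cache_hit_rate(records):
    """Token-weighted hit rate: sum(cached_tokens) / sum(input_tokens)."""
    total_input = sum(r.input_tokens for r in records)
    if total_input == 0:
        return None  # no input tokens, so the rate is undefined
    total_cached = sum(r.cached_tokens for r in records)
    return total_cached / total_input


# Illustrative numbers only, not taken from the report.
records = [
    Usage("example", input_tokens=10_000, cached_tokens=8_000),
    Usage("example", input_tokens=2_000, cached_tokens=0),
]
rate = cache_hit_rate(records)
print(f"{rate:.1%}")  # prints 66.7%
```

Weighting by tokens rather than by request keeps a handful of very long prompts from being drowned out by many short ones. A per-request average would be a different, equally defensible choice.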

Anthropic-direct (77%) is the cache-hit leader. Vertex (Claude) at 24% is the surprise: same model family, ~3× lower cache hit rate, almost certainly a configuration gap. Moonshot's 88% is a measurement artefact: cached_tokens is still recorded on partial streams that the gateway later marks as failed, and the route's success rate is only 6%.

Which AI providers have the highest prompt-cache hit rate? In April 2026 Anthropic-direct led the Requesty gateway at 77% (cached_tokens / input_tokens), Bedrock Claude was healthy at 57%, and Vertex (Claude) trailed at 24%. Same Claude model family, 3× lower hit rate. Vertex (Gemini) sat at 10% and Mistral at 4%, the floor among major routes.

Why it matters

Prompt caching directly cuts the per-request cost of long, repeated context. The difference between a 77% hit rate and a 24% hit rate on the same model family is roughly a 3× reduction in input tokens billed at full price (about 23% of input tokens at a 77% hit rate versus about 76% at 24%). The Vertex-Claude gap looks like a configuration issue rather than a platform limitation, which suggests Claude users on Vertex are leaving substantial savings on the table that could likely be recovered without a code change.
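The roughly 3× figure can be checked with a short calculation on the hit rates from the data table (77.5% and 23.5%). The sketch below is illustrative. `full_price_share` and `relative_input_cost` are hypothetical helpers, and the `cached_price_factor` of 0.1 is an assumed discount, not a figure from this report. Actual cached-token pricing varies by provider.

```python
def full_price_share(hit_rate):
    """Fraction of input tokens that miss the cache and bill at full price."""
    return 1.0 - hit_rate


def relative_input_cost(hit_rate, cached_price_factor):
    """Input cost relative to a no-cache baseline of 1.0.

    cached_price_factor is the price of a cached token as a fraction of the
    full input price (an assumption; check the provider's price list).
    """
    return full_price_share(hit_rate) + hit_rate * cached_price_factor


anthropic_direct = 0.775  # from the data table
vertex_claude = 0.235     # from the data table

ratio = full_price_share(vertex_claude) / full_price_share(anthropic_direct)
print(f"{ratio:.1f}x more input tokens at full price")  # prints 3.4x ...

# Assumed discount: cached tokens cost 10% of the full input price.
for name, rate in [("Anthropic-direct", anthropic_direct),
                   ("Vertex (Claude)", vertex_claude)]:
    cost = relative_input_cost(rate, cached_price_factor=0.1)
    print(f"{name}: {cost:.2f} of the uncached input cost")
```

The ratio of full-price shares does not depend on the discount. The relative-cost lines do, so treat them as a template for plugging in real prices.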

Period: Apr 2026
Updated: May 9, 2026
ID: cache-hit-april-2026
§ 01

Key findings

  • 01 Anthropic-direct: 77% cache hit, the leader by a wide margin.
  • 02 Bedrock Claude: 57%. DeepSeek: 48%. OpenAI: 36%. All healthy.
  • 03 Vertex (Claude): 24%. Same model as Anthropic-direct (77%) and Bedrock (57%), 3× lower hit rate than Anthropic-direct. Likely a configuration gap.
  • 04 Vertex (Gemini): 10%. Near the floor among major routes.
  • 05 Mistral: 4%. The floor among major routes; prompt caching is not a meaningful lever on that route today.
  • 06 Moonshot reports 88%, but that is a measurement artefact on a route with a 6% success rate; do not quote it.
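One way to guard against an artefact of the Moonshot kind is to compute the hit rate over successful requests only and report the success rate alongside it. The sketch below assumes a hypothetical `RequestLog` record with a `succeeded` flag. It is not the gateway's actual schema or method, and the numbers are made up to show the effect.

```python
from dataclasses import dataclass


@dataclass
class RequestLog:
    """Hypothetical gateway log row; field names are assumptions."""
    input_tokens: int
    cached_tokens: int
    succeeded: bool  # False for partial streams later marked as failed


def hit_rate(logs, successful_only=True):
    """Token-weighted cache hit rate, optionally ignoring failed requests."""
    rows = [r for r in logs if r.succeeded or not successful_only]
    total_input = sum(r.input_tokens for r in rows)
    if total_input == 0:
        return None
    return sum(r.cached_tokens for r in rows) / total_input


def success_rate(logs):
    """Share of requests that completed successfully."""
    return sum(r.succeeded for r in logs) / len(logs) if logs else None


# Made-up data: one success with little caching, nine failed partial streams
# that still recorded cached tokens.
logs = [RequestLog(1_000, 100, True)] + [RequestLog(1_000, 900, False)] * 9

print(f"all requests:    {hit_rate(logs, successful_only=False):.0%}")  # 82%
print(f"successful only: {hit_rate(logs):.0%}")                         # 10%
print(f"success rate:    {success_rate(logs):.0%}")                     # 10%
```

Reporting the success rate next to the hit rate makes it obvious when a high figure rests on very few completed requests.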
§ 02

Data

Provider           Cache hit rate (percent)
Anthropic          77.50%
Bedrock            56.90%
DeepSeek           48.30%
Azure              41.00%
OpenAI             36.40%
xAI                35.70%
Novita             31.90%
Vertex (Claude)    23.50%
Vertex (Gemini)     9.60%
Mistral             4.10%