Requesty just became the most affordable place to run Google’s latest Gemini models. Thanks to the optimisations we rolled out for our new enforced-caching layer, we’re slashing prices by a full 50 % on both Gemini 2.5 Pro and Gemini 2.5 Flash—effective immediately.
What’s New?
1. Half-Price Gemini Models
Model | Input (old → new ) | Output (old → new ) |
Gemini 2.5 Pro
| $1.25 → $0.62 per M tokens | $10.00 → $5.00 per M tokens |
Gemini 2.5 Flash
| $0.15 → $0.075 per M tokens | $3.50 → $1.75 per M tokens |
That’s the best public pricing you’ll find—beating OpenRouter by a mile.
2. Smarter, Cheaper Caching
Our one-hour prompt cache now covers Gemini calls too. Repeat inputs in agent loops hit the cache at just 25 % of the already-discounted input rate, so complex chains run faster and cheaper.
3. Plug-and-Play Tooling (No Code Changes)
Point your favourite environment at the Requesty endpoint, pick coding/gemini-2.5-pro
or coding/gemini-2.5-flash
, and you’re off. Works everywhere you do:
Cline – terminal-first power with “extended reasoning” toggle
OpenWebUI
Kilo Code
Roo Code – inline multi-file refactors on the cheap
Aider & Continue – persistent sessions, now at half the cost
…and any custom stack that speaks HTTP.
Why Choose Requesty for Gemini 2.5?
Requesty Advantage | What it Means for You |
Enforced Prompt Caching | Up to 75 % lower spend on iterative prompts |
Unified Billing | One invoice for Google, Anthropic, OpenAI & more |
Parallel Tool Execution | Web search, Python sandbox & file ops in a single call |
Rock-Solid Routing | Automatic retries & multi-zone failover keep your app live |
Transparent Analytics | Real-time dashboards for tokens, cache hits & latency |
Get Started in 3 Minutes
Sign in / Create your Requesty account (new users still get $6 free credits).
Enable Gemini 2.5: Navigate to Models → Google and flip on
gemini-2.5-pro
orgemini-2.5-flash
.Toggle Enforced Caching in Project Settings to unlock the biggest savings.
Update Your Tool Config:
jsoncCopyEdit// cline / roo / etc. { "endpoint": "https://router.requesty.ai/v1", "model": "coding/gemini-2.5-pro" // or gemini-2.5-flash }
That’s it—ship your next feature, run larger evals, or spin up agentic workflows for half the usual cost.
Limited-Time Pricing, Unlimited Possibilities
Whether you’re building a lightning-fast chat assistant or a multi-hour autonomous coder, Gemini 2.5 on Requesty now offers frontier performance at bargain-bin prices. Dive in, break things (cheaply), and tell us what else you’d like to see.