We’re thrilled to announce that Claude 4—both Opus 4 and Sonnet 4—is now live on Requesty, your go-to LLM gateway and routing platform. Experience Anthropic’s most powerful models seamlessly integrated into your favorite developer tools, complete with Requesty’s reliable prompt-caching layer to accelerate responses and reduce costs.
What’s New?
1. Hybrid Models for Every Use Case
Claude Opus 4
World’s best coding model: 72.5% on SWE-bench, 43.2% on Terminal-bench.
Sustained multi-hour workflows: Tackle thousands of steps without losing context.
Frontier agent performance: Ideal for complex, autonomous pipelines.
Claude Sonnet 4
Industry-leading reasoning: 72.7% on SWE-bench with extended thinking.
Steerable and efficient: Great balance of speed, precision, and cost.
Free-tier availability: Perfect for prototyping and lighter tasks.
2. Enforced Prompt Caching on Requesty
Cache Validity: Up to one hour per prompt, so repeat calls within that window reuse the cached prompt instead of reprocessing it.
Cost Savings: Cached inputs incur just 25% of the normal input rate—ideal for repeat calls in agent loops.
Cache Control: Developers can tag, invalidate, or override caches via API flags.
3. Full Tool Support, Optimized for Parallel Calls
Extended Thinking with Tools: Web Search, Code Execution, and Image Transformations are all available through Requesty’s standard tool interface.
Parallel Tool Execution: Fire off web searches, invoke your Python sandbox, and call local file tools simultaneously for lightning-fast, multi-faceted reasoning.
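To make the idea of parallel tool calls concrete, here is a minimal sketch using Python's asyncio. It is not Requesty's tool interface: the three tool functions are hypothetical stand-ins that only sleep to simulate latency, and the point is simply that independent calls can overlap instead of running back to back.

```python
import asyncio
import time


# Hypothetical stand-ins for real tools; each sleeps to simulate I/O latency.
async def web_search(query: str) -> str:
    await asyncio.sleep(0.2)
    return f"search results for {query!r}"


async def run_python(code: str) -> str:
    await asyncio.sleep(0.2)
    return f"executed {len(code)} chars of code"


async def read_file(path: str) -> str:
    await asyncio.sleep(0.2)
    return f"contents of {path}"


async def run_tools_in_parallel() -> list[str]:
    # gather() runs the three coroutines concurrently and returns their
    # results in the same order the awaitables were passed in.
    return await asyncio.gather(
        web_search("claude 4 benchmarks"),
        run_python("print(1 + 1)"),
        read_file("notes.txt"),
    )


if __name__ == "__main__":
    start = time.perf_counter()
    results = asyncio.run(run_tools_in_parallel())
    elapsed = time.perf_counter() - start
    for line in results:
        print(line)
    # Total time should be close to the slowest tool, not the sum of all three.
    print(f"elapsed: {elapsed:.2f}s")
```

Because the calls overlap, wall-clock time tracks the slowest tool rather than the sum of all three, which is where the speedup in multi-tool agent steps comes from.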
4. Deep IDE Integrations
Plug Claude 4 into the coding tools you already love:
| Tool | Highlights |
| --- | --- |
| Roo Code | Inline multi-file refactoring, advanced debugging, and CI feedback loops. |
| Cline | Terminal-first experience with toggleable “extended reasoning” mode. |
| Aider | Code suggestions in your local editor, now harnessing Claude 4’s precision. |
| Continue | Session persistence across your workflows, leveraging enforced caching. |
Just point your IDE at Requesty’s API endpoint, choose claude-opus-4 or claude-sonnet-4, and you’re off to the races.
Why Choose Requesty for Claude 4?
Unified Billing & Transparent Pricing
Input Tokens: $15 / $3 per million (Opus 4 / Sonnet 4)
Cached Input: 25% of input rate
Output Tokens: $75 / $15 per million
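As a rough worked example of how these rates combine, the sketch below estimates the cost of a single request from the per-million prices and the 25% cached-input rate listed above. The function name and the split into "fresh" versus "cached" input tokens are illustrative assumptions; actual invoices may include charges not modeled here.

```python
# Per-million-token rates in USD, taken from the pricing list above.
RATES = {
    "claude-opus-4": {"input": 15.00, "output": 75.00},
    "claude-sonnet-4": {"input": 3.00, "output": 15.00},
}

# Cached input is billed at 25% of the normal input rate.
CACHED_INPUT_FACTOR = 0.25


def estimate_cost(model, fresh_input_tokens, cached_input_tokens, output_tokens):
    """Return an estimated request cost in USD (illustrative helper)."""
    rates = RATES[model]
    per_token_in = rates["input"] / 1_000_000
    per_token_out = rates["output"] / 1_000_000
    return (
        fresh_input_tokens * per_token_in
        + cached_input_tokens * per_token_in * CACHED_INPUT_FACTOR
        + output_tokens * per_token_out
    )


if __name__ == "__main__":
    # One agent-loop step on Sonnet 4: a 50,000-token cached prefix,
    # 2,000 new input tokens, and 1,000 output tokens.
    with_cache = estimate_cost("claude-sonnet-4", 2_000, 50_000, 1_000)
    without_cache = estimate_cost("claude-sonnet-4", 52_000, 0, 1_000)
    print(f"with cache:    ${with_cache:.4f}")
    print(f"without cache: ${without_cache:.4f}")
```

For that example step, the estimate comes to about $0.0585 with the cached prefix versus about $0.171 without it, so the savings grow with the share of the prompt that repeats between calls.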
Custom Tool Combinations
Chain web searches, code runs, and file edits in a single request.
Fine-tune call order or parallelism via our JSON workflow spec.
Superior Throughput & Reliability
Optimized request routing with retry logic and fallbacks to secondary zones.
Detailed analytics dashboard for token usage, cache hit rates, and latency.
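The retry-and-fallback pattern mentioned above can be sketched in a few lines. This is a generic illustration, not Requesty's routing implementation: the zone names, the `send` callable, and the `flaky_send` demo are all hypothetical, and the backoff policy is an assumption.

```python
import time
from typing import Callable, Sequence


class AllZonesFailed(Exception):
    """Raised when every zone has exhausted its retries."""


def call_with_fallback(
    send: Callable[[str], str],
    zones: Sequence[str],
    retries_per_zone: int = 2,
    base_delay: float = 0.1,
) -> str:
    """Try each zone in order, retrying with exponential backoff on failure."""
    last_error = None
    for zone in zones:
        for attempt in range(retries_per_zone):
            try:
                return send(zone)
            except ConnectionError as exc:
                last_error = exc
                # Wait base_delay, then 2x, 4x, ... after each failed attempt.
                time.sleep(base_delay * (2 ** attempt))
    raise AllZonesFailed(f"all zones failed: {list(zones)}") from last_error


# Hypothetical transport used only for demonstration: the primary zone is
# always down, and any other zone succeeds.
calls = []


def flaky_send(zone: str) -> str:
    calls.append(zone)
    if zone == "primary":
        raise ConnectionError("primary zone unavailable")
    return f"ok from {zone}"
```

Calling `call_with_fallback(flaky_send, ["primary", "secondary"])` retries the primary zone twice, then succeeds on the secondary one; if every zone fails, the caller gets a single `AllZonesFailed` error chained to the last underlying failure.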
Getting Started
Sign In or Create a Requesty Account
Add Claude 4 to Your Plan: Head to the Models page and enable claude-opus-4 or claude-sonnet-4.
Configure Prompt Caching: Toggle “Enforced Caching” in your project settings to start saving instantly.
Integrate with Your Tools:
Roo Code: In advanced settings, set your Requesty endpoint and pick Claude 4.
Cline: Update your requesty.config.json to set "model": "claude-opus-4" (or "claude-sonnet-4" for Sonnet).
Aider & Continue: Select Requesty under “Providers” and choose your model.
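If you would rather script the Cline config change than edit it by hand, a small helper like the one below could do it. This is a sketch under assumptions: it treats requesty.config.json as a flat JSON object with a top-level "model" key, as the step above suggests, and it leaves any other keys in the file untouched. The `set_model` function name is hypothetical.

```python
import json
from pathlib import Path


def set_model(config_path, model):
    """Set the top-level "model" key in a JSON config file (illustrative helper)."""
    path = Path(config_path)
    # Start from the existing config if present so other settings are preserved.
    config = json.loads(path.read_text()) if path.exists() else {}
    config["model"] = model
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config


if __name__ == "__main__":
    updated = set_model("requesty.config.json", "claude-opus-4")
    print(updated["model"])
```

Swap in "claude-sonnet-4" as the second argument to switch models.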
Try It Today
Empower your applications, bots, and agentic workflows with the next generation of Claude models, now turbo-charged by Requesty’s prompt caching and robust routing. We can’t wait to see what you build—whether it’s a multi-hour code refactor, a sophisticated research assistant, or the next great AI-powered product.
Get Started → Visit Requesty Dashboard
Stay tuned for more features, benchmarks, and deep dives coming soon. As always, your feedback fuels our improvements—drop us a line on Discord or in our support portal.
Happy building!