Claude 4 Now Available on Requesty

We’re thrilled to announce that Claude 4—both Opus 4 and Sonnet 4—is now live on Requesty, your go-to LLM gateway and routing platform. Experience Anthropic’s most powerful models seamlessly integrated into your favorite developer tools, complete with Requesty’s reliable prompt-caching layer to accelerate responses and reduce costs.


What’s New?

1. Hybrid Models for Every Use Case

  • Claude Opus 4

    • World’s best coding model: 72.5 % on SWE-bench, 43.2 % on Terminal-bench.

    • Sustained multi-hour workflows: Tackle thousands of steps without losing context.

    • Frontier agent performance: Ideal for complex, autonomous pipelines.

  • Claude Sonnet 4

    • Industry-leading reasoning: 72.7 % on SWE-bench with extended thinking.

    • Steerable and efficient: Great balance of speed, precision, and cost.

    • Free-tier availability: Perfect for prototyping and lighter tasks.

2. Enforced Prompt Caching on Requesty

  • Cache Validity: Up to one hour per prompt, ensuring instant cold-start performance.

  • Cost Savings: Cached inputs incur just 25% of the normal input rate—ideal for repeat calls in agent loops.

  • Cache Control: Developers can tag, invalidate, or override caches via API flags.

3. Full Tool Support, Optimized for Parallel Calls

  • Extended Thinking with Web Search, Code Execution, and Image Transformations all available through Requesty’s standard tool interface.

  • Parallel Tool Execution: Fire off web searches, invoke your Python sandbox, and call local file tools simultaneously for lightning-fast, multi-faceted reasoning.

4. Deep IDE Integrations

Plug Claude 4 into the coding tools you already love:

Tool

Highlights

Roo Code

Inline multi-file refactoring, advanced debugging, and CI feedback loops.

Cline

Terminal-first experience with toggleable “extended reasoning” mode.

Aider

Code suggestions in your local editor, now harnessing Claude 4’s precision.

Continue

Session persistence across your workflows, leveraging enforced caching.

Just point your IDE at Requesty’s API endpoint, choose claude-opus-4 or claude-sonnet-4, and you’re off to the races.


Why Choose Requesty for Claude 4?

  1. Unified Billing & Transparent Pricing

    • Input Tokens: $15 / $3 per million (Opus 4 / Sonnet 4)

    • Cached Input: 25% of input rate

    • Output Tokens: $75 / $15 per million

  2. Custom Tool Combinations

    • Chain web searches, code runs, and file edits in a single request.

    • Fine-tune call order or parallelism via our JSON workflow spec.

  3. Superior Throughput & Reliability

    • Optimized request routing with retry logic and fallbacks to secondary zones.

    • Detailed analytics dashboard for token usage, cache hit rates, and latency.


Getting Started

  1. Sign In or Create a Requesty Account

  2. Add Claude 4 to Your Plan: Head to the Models page and enable claude-opus-4 or claude-sonnet-4.

  3. Configure Prompt Caching: Toggle “Enforced Caching” in your project settings to start saving instantly.

  4. Integrate with Your Tools:

    • Roo Code: In advanced settings, set your Requesty endpoint and pick Claude 4.

    • Cline: Update your requesty.config.json to "model": "claude-opus-4" (or Sonnet).

    • Aider & Continue: Select Requesty under “Providers” and choose your model.


Try It Today

Empower your applications, bots, and agentic workflows with the next generation of Claude models, now turbo-charged by Requesty’s prompt caching and robust routing. We can’t wait to see what you build—whether it’s a multi-hour code refactor, a sophisticated research assistant, or the next great AI-powered product.

Get Started → Visit Requesty Dashboard

Stay tuned for more features, benchmarks, and deep dives coming soon. As always, your feedback fuels our improvements—drop us a line on Discord or in our support portal.

Happy building!