https://www.youtube.com/watch?v=STInh7qe2gA&t=2s
Are you tired of juggling multiple LLM providers and struggling with slow or unreliable AI completions? The Requesty platform solves these challenges by letting you route your calls to 150+ models, all from one place. In this post, we'll walk through how to integrate our new Requesty VS Code extension into your workflow. We'll also cover how to switch between providers quickly in Cline and Roo Code, highlight fallback strategies, and showcase other popular integrations like OpenWebUI, all while keeping your usage stats and costs in check.
Table of Contents
Why Use the Requesty VS Code Extension?
Getting Started
Setting Up Your Requesty API Key
Choosing & Switching Models Quickly
Cline Integration
Roo Code Integration
Policy-Based Fallbacks
Bonus Features: Usage Stats, Caching, and More
Other Integrations
OpenWebUI
VS Code Extension (Recap)
And More…
Wrap-Up
Why Use the Requesty VS Code Extension?
Requesty unifies access to LLMs like OpenAI, Anthropic Claude, DeepSeek, Deepinfra, Nebius, Together AI, and many more into a single router endpoint. When you install the VS Code extension, you gain:
Seamless AI coding assistance: right inside VS Code, no matter which model or provider you prefer.
On-the-fly switching: between different LLMs, for example GPT-4 for brainstorming and Claude for code completions.
Fallback strategies: if your primary model fails or returns errors, Requesty can seamlessly use another model to avoid downtime.
Usage analytics: track your API usage, token consumption, and costs in real time.
Getting Started
Install the VS Code Extension: search for "Requesty" in the VS Code Marketplace and click "Install."
Sign Up for Requesty: if you haven't already, head to app.requesty.ai/sign-up to create your account.
Obtain Your API Key: in the Requesty dashboard, go to the Router page to create or copy your API key.
Once you have these pieces in place, you'll be ready to integrate your code editor with your favorite LLMs, without ever leaving VS Code.
Setting Up Your Requesty API Key
When you open the Requesty VS Code extension (or the integrated settings panel), you'll see a prompt to provide your API key. Here's how:
Create an API Key in Requesty (Dashboard → "Create API Key"). Name it something memorable, like dev-key or cline-test.
Copy this API key and paste it into the Requesty extension's configuration in VS Code.
Optionally, you can also specify a fallback "policy" (more on this below) or advanced routing parameters.
That's it! You've now linked VS Code with the Requesty router.
Choosing & Switching Models Quickly
One of the biggest benefits of Requesty is the ability to switch models without juggling separate API endpoints. We maintain a list of 150+ models, including GPT-4, Claude, DeepSeek (with its unlimited-concurrency approach), and specialized coding LLMs.
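To make that concrete, here is a minimal sketch (standard library only) of calling the router's OpenAI-compatible chat endpoint directly. The base URL matches the one used for OpenWebUI later in this post, and the model IDs are from the examples here; the `build_request` helper and the `REQUESTY_API_KEY` environment variable are illustrative names of our own, not part of the product.

```python
# Hypothetical sketch: one endpoint, many models. Switching providers
# means changing a single string in the payload. Assumes the router's
# OpenAI-compatible API at https://router.requesty.ai/v1; the helper
# name build_request and the env var REQUESTY_API_KEY are illustrative.
import json
import os
import urllib.request

ROUTER_URL = "https://router.requesty.ai/v1/chat/completions"

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Compose a chat-completion request for the Requesty router."""
    payload = {
        "model": model,  # e.g. "openai/gpt-4o" or "anthropic/claude-3-7"
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        ROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ.get('REQUESTY_API_KEY', '')}",
            "Content-Type": "application/json",
        },
    )

if os.environ.get("REQUESTY_API_KEY"):  # only fires when a real key is set
    req = build_request("openai/gpt-4o", "Suggest a name for a CLI tool.")
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

Swapping "openai/gpt-4o" for "anthropic/claude-3-7" is the whole migration: no new SDK, account, or endpoint.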
Cline Integration
Cline is a coding assistant that can run in your editor or terminal. With Requesty, you can:
Configure Cline to Use Requesty: open Cline's settings, choose "Requesty" as your provider, then paste your Requesty API key and pick a model ID from the Requesty Model List (e.g., openai/gpt-4o or anthropic/claude-3-7).
Instant Model Switching: if you want to switch to a different LLM, for instance from Claude to GPT-4, just update the model ID in your settings. No need to reconfigure or switch accounts.
Fallback Policies: if your chosen model times out, Requesty can automatically route your request to a second model (e.g., from DeepSeek to Nebius).
Roo Code Integration
Roo Code is another popular coding agent that helps you write, debug, and refactor code. Using Requesty:
Select "Requesty" as the API provider in Roo Code's extension or config settings.
Paste Your API Key from Requesty.
Pick Your Model from the Model List (or a custom "Dedicated Model" if you have one).
Enjoy: Roo Code will now route completions via Requesty. Switch models in seconds by updating the model parameter (e.g., from together/vicuna-13b to openai/gpt-3.5-turbo).
You can create as many API keys or usage policies as you need for your workflow, which is especially handy if you maintain multiple environments, like dev and production.
Policy-Based Fallbacks
A fallback policy is a lifesaver when a provider is temporarily overloaded:
In your Requesty Dashboard, click Manage API Keys → Add a Policy.
Specify an order of preference. For example:
1st: deepseek/any-latest
2nd: anthropic/claude-3-7-sonnet-latest
3rd: openai/gpt-3.5-turbo
Copy the policy snippet and paste it into your extension config.
Now if DeepSeek is slow or returns errors, Requesty will retry automatically with Claude or GPT. You stay coding, with no manual switching required.
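Requesty applies fallback policies on the server, so your editor needs no extra logic. Purely to illustrate the behaviour, here is a client-side Python sketch of the same idea: try each model in the policy order until one answers. The names `complete_with_fallback` and `send_request` are hypothetical.

```python
# Client-side sketch of a fallback policy (Requesty does this server-side).
# Model IDs mirror the example policy above; send_request is a
# hypothetical callable that raises on timeouts or provider errors.
POLICY = [
    "deepseek/any-latest",
    "anthropic/claude-3-7-sonnet-latest",
    "openai/gpt-3.5-turbo",
]

def complete_with_fallback(prompt, send_request, policy=POLICY):
    """Return the first successful reply, walking the policy in order."""
    last_error = None
    for model in policy:
        try:
            return send_request(model, prompt)
        except Exception as exc:  # timeout, overload, HTTP 5xx, ...
            last_error = exc  # remember the failure, try the next model
    raise RuntimeError("every model in the policy failed") from last_error
```

If deepseek/any-latest times out, the request quietly moves on to Claude, then to GPT-3.5: exactly the "you stay coding" behaviour described above.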
Bonus Features: Usage Stats, Caching, and More
When you're in the Requesty dashboard, you'll see more than just a list of models:
Usage Stats: track tokens used, total cost, or requests per day. Great for staying within budget or spotting unexpected usage spikes.
Caching: enable cache optimizations so that repeated requests (e.g., the same prompt or instructions) don't burn tokens each time. You can toggle caching in the "Features" or "Settings" panel in the dashboard.
System Prompt Optimizations: Requesty can automatically optimize the system prompt before sending it to the model. This helps reduce token count, so no more 12k tokens for a simple code request!
From the transcript example:
"With one request, we initially used 12,800 tokens, then dropped to 8,800 tokens just by letting Requesty optimize the prompt size and context."
This can lead to major cost savings over time.
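Requesty's caching runs server-side and is toggled in the dashboard; as a rough client-side analogue of why it saves tokens, here is a hypothetical memoization wrapper in which an identical (model, prompt) pair never triggers a second request. `call_router` and `cached_completion` are illustrative names, not Requesty APIs.

```python
# Client-side analogue of response caching (Requesty's is server-side).
# An identical (model, prompt) pair is served from the cache and burns
# zero additional tokens.
import functools

CALLS = []  # records every request that would actually hit the router

def call_router(model: str, prompt: str) -> str:
    """Stand-in for a real Requesty request (illustrative only)."""
    CALLS.append((model, prompt))
    return f"[{model}] response to: {prompt}"

@functools.lru_cache(maxsize=256)
def cached_completion(model: str, prompt: str) -> str:
    # Repeat prompts are answered from the cache, not the network.
    return call_router(model, prompt)

cached_completion("openai/gpt-4o", "Explain list comprehensions")
cached_completion("openai/gpt-4o", "Explain list comprehensions")
assert len(CALLS) == 1  # the repeat never reached the "router"
```

The same trade-off applies server-side: repeated prompts cost nothing extra, while any change to the model or prompt is a cache miss.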
Other Integrations
While the new VS Code extension is our latest highlight, remember that you can integrate Requesty with plenty of other tools:
OpenWebUI
Go to "Admin Settings" → switch the URL to https://router.requesty.ai/v1 → paste your API key.
Instantly access 150+ LLMs through the familiar OpenWebUI interface.
VS Code Extension (Recap)
Search "Requesty" in the VS Code Marketplace.
Configure your API key.
Choose your favorite LLM and watch the completions flow in real time, with no separate installs or keys needed for each provider.
And More
Cline: as described above, just pick Requesty from the "API Provider" dropdown, paste your key, and you're set.
Roo Code: same approach. Select "Requesty," drop in your key, and choose a model ID.
Other Tools: we also have integrated pathways for:
OpenAI Python or TypeScript SDK
Anthropic
Nebius
Deepinfra
Together AI
Custom self-hosted solutions
If you're curious about an integration not listed here, join our Discord or send us a message; chances are we can support it.
Wrap-Up
The Requesty VS Code extension makes it easy to unify your AI coding tools and avoid the hassle of maintaining different API endpoints or accounts. Whether you're a fan of Cline, Roo Code, or standard VS Code plugins, our router ensures you can quickly switch models, manage usage, and never get stuck when a single provider goes down.
Ready to give it a spin?
Sign up for Requesty or log in.
Install the VS Code extension.
Add your API key and pick a model.
Write (or generate!) some code to see how smooth your new LLM workflow feels.
If you have questions or want more guidance, check out our Docs and FAQ or hop into the Requesty Discord to chat with our team and community. Enjoy error-free, multi-provider AI coding right in your favorite editor!
Questions or feedback? Drop us a line on Discord or email us at support@requesty.ai. We're here to help you get the most out of your LLMs, no matter which provider you prefer. Happy coding!