Unifying Speed, Depth, and Tool Access
We’re excited to bring you three new, game‑changing models on Requesty—the alternative to OpenRouter, LiteLLM, and other routing platforms:
GPT‑4.1: Balanced speed + extended reasoning in one LLM.
OpenAI o4‑mini: A cost‑efficient, “mini” model that still excels at math, coding, and more.
OpenAI o3: Our advanced reasoning powerhouse for multi-step tasks, image analysis, and agentic workflows.
With Requesty, these models integrate seamlessly into the coding and chat tools you already use—Roo Code, Cline, OpenWebUI, and more—helping you solve bigger problems, faster.
Highlights & Benchmarks
1. Unified AI Reasoning
GPT‑4.1 merges rapid Q&A with deeper, step‑by‑step thinking—no need to pick separate “fast” or “reflective” variants. The same goes for o3 and o4‑mini, each offering:
Tool Integration: They can search the web, run Python code, manipulate images, or orchestrate multi-step logic.
Extended vs. Instant: Configure a “thinking tokens” limit to steer the model toward quicker or more in‑depth reasoning (see the sketch below).
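As a minimal sketch of what that configuration can look like, here is an OpenAI‑compatible client pointed at Requesty. The base URL, model ID format, and the exact reasoning parameters Requesty forwards are assumptions—check the Requesty docs for the values your account uses.

```python
# Sketch: steering reasoning depth via an OpenAI-compatible client.
# Base URL, model ID, and reasoning parameters are assumptions, not confirmed values.
from openai import OpenAI

client = OpenAI(
    base_url="https://router.requesty.ai/v1",   # assumed Requesty router endpoint
    api_key="YOUR_REQUESTY_API_KEY",
)

response = client.chat.completions.create(
    model="openai/o4-mini",                      # assumed Requesty model ID
    messages=[{"role": "user", "content": "Plan a zero-downtime database migration."}],
    max_completion_tokens=2048,                  # caps output, including reasoning tokens
    extra_body={"reasoning_effort": "high"},     # OpenAI's reasoning knob, assumed to be passed through
)
print(response.choices[0].message.content)
```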
2. Coding & Developer Performance
All three models show major improvements in real coding tasks—from multi‑file refactoring to advanced debugging. They handle code diffs more reliably and integrate with dev tools (a request sketch follows the benchmarks below):
Roo Code: For inline coding suggestions, multi‑file refactoring, advanced debugging, and minimal extraneous edits.
Cline: Quick, powerful dev and terminal integration for short or extended coding tasks.
OpenWebUI: A user-friendly chat environment that seamlessly pairs your conversation with tool usage—just pick the model in settings.
Benchmarks highlight these capabilities:
SWE-Bench Verified
GPT‑4.1 / o3 surpass older GPT or o‑series models, with 60–70% pass rates in real software engineering tasks.
o4‑mini still outperforms older “mini” models at significantly lower cost.
Aider Polyglot (Code Editing)
GPT‑4.1 / o3 approach roughly 80% success rates for multi‑language editing in diff formats.
o4‑mini offers ~60–70% success at minimal expense.
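As a rough illustration of the diff‑style editing mentioned above, here is a sketch of asking for a patch instead of a full rewrite. The endpoint, model ID, and prompt wording are illustrative assumptions, not a prescribed Requesty workflow.

```python
# Sketch: requesting an edit as a unified diff (assumed endpoint and model ID).
from openai import OpenAI

client = OpenAI(base_url="https://router.requesty.ai/v1", api_key="YOUR_REQUESTY_API_KEY")

source = '''def add(a, b):
    return a + b
'''

response = client.chat.completions.create(
    model="openai/gpt-4.1",   # assumed Requesty model ID
    messages=[
        {"role": "system", "content": "Return changes as a unified diff only."},
        {"role": "user", "content": f"Add type hints to every function in this file:\n\n{source}"},
    ],
)
print(response.choices[0].message.content)   # a patch you can review and apply
```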
3. Visual & Multimodal Mastery
GPT‑4.1 and o3 interpret images natively, solving tasks that combine textual + visual reasoning. They can zoom, rotate, or transform images in the context of an agentic workflow. Evaluations like MMMU (college-level visual tasks) or MathVista (visual math) show:
o3 near state‑of‑the‑art for advanced figure reasoning.
o4‑mini remains strong in baseline visual tasks.
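To give a feel for the image input path, here is a minimal sketch using the standard OpenAI‑style image_url content part. The endpoint, model ID, and image URL are placeholders.

```python
# Sketch: sending an image alongside text (assumed endpoint, model ID, and image URL).
from openai import OpenAI

client = OpenAI(base_url="https://router.requesty.ai/v1", api_key="YOUR_REQUESTY_API_KEY")

response = client.chat.completions.create(
    model="openai/o3",   # assumed Requesty model ID
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What trend does this chart show, and what is the peak value?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/quarterly-revenue.png"}},
        ],
    }],
)
print(response.choices[0].message.content)
```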
4. Agentic Tool Use
All three models:
Learn to reason about when and how to call external tools (search, code, or image transformations); see the sketch after this list.
Switch strategies mid‑conversation—like re‑searching if new info is found.
Reliably produce the final answer in a requested format, typically under a minute.
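Here is a minimal sketch of what that looks like with the standard Chat Completions tools format. The web_search function, its schema, and the model ID are illustrative assumptions—you define and execute the tool yourself; it is not a built‑in Requesty tool.

```python
# Sketch: letting the model decide when to call a (hypothetical) web_search tool.
from openai import OpenAI

client = OpenAI(base_url="https://router.requesty.ai/v1", api_key="YOUR_REQUESTY_API_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "web_search",   # hypothetical tool you implement and run yourself
        "description": "Search the web and return the top results as text.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

response = client.chat.completions.create(
    model="openai/o3",   # assumed Requesty model ID
    messages=[{"role": "user", "content": "What changed in the latest Python release?"}],
    tools=tools,
)

message = response.choices[0].message
if message.tool_calls:                                    # the model chose to call the tool
    print(message.tool_calls[0].function.arguments)       # e.g. {"query": "..."}
else:
    print(message.content)
```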
5. Verified Safety & Reliability
Each model ships with updated refusal behavior, better prompt‑injection resistance, and advanced filters for harmful queries. We extensively tested them in line with Requesty’s safety protocols, ensuring consistent guardrails while reducing needless refusals for normal user prompts.
Pricing on Requesty
We’re keeping it simple: you pay for the tokens you use (plus an additional fee per tool call, if any). Prices per 1 million tokens are below; a quick cost calculation follows the list:
GPT‑4.1
Input: $2.00
Cached Input: $0.50
Output: $8.00
OpenAI o4‑mini
Input: $1.10
Cached Input: $0.275
Output: $4.40
OpenAI o3
Input: $10.00
Cached Input: $2.50
Output: $40.00
(Tool calls—e.g. web search, code execution—cost extra per call; see Requesty docs for details.)
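As a quick sanity check on what a single request costs, here is a small calculation using the GPT‑4.1 rates above. The token counts are made up for illustration, and tool‑call fees are excluded.

```python
# Back-of-the-envelope cost for one GPT-4.1 request at the rates listed above (USD per 1M tokens).
PRICE_PER_MILLION = {"input": 2.00, "cached_input": 0.50, "output": 8.00}

def request_cost(input_tokens, cached_input_tokens, output_tokens):
    return (
        input_tokens / 1_000_000 * PRICE_PER_MILLION["input"]
        + cached_input_tokens / 1_000_000 * PRICE_PER_MILLION["cached_input"]
        + output_tokens / 1_000_000 * PRICE_PER_MILLION["output"]
    )

# Example: 10k fresh input tokens, 5k cached input tokens, 2k output tokens
print(f"${request_cost(10_000, 5_000, 2_000):.4f}")  # -> $0.0385
```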
Integrating with Your Favorite Tools
Roo Code
In the advanced settings, pick your Requesty endpoint and choose the model.
Enjoy multi‑file editing, advanced debugging, and synergy with your local dev environment.
Cline
In the advanced settings, pick your Requesty endpoint and choose the model.
Toggle “extended reasoning” if you want GPT‑4.1 or o3 to think more thoroughly before responding.
OpenWebUI
Open your Providers panel, set Requesty’s base URL, and pick your new model.
Watch them handle free-flowing chat and tool usage on demand.
(Prefer a different interface? Our partners include Aider, Goose, Crew AI, and many more—just select the new Requesty models in their respective settings.)
Why Choose Requesty?
Alternative to OpenRouter: We offer a straightforward developer experience, unified billing, and special features like advanced caching or custom tool combos.
Tooling Ecosystem: Our platform is designed to host multiple tool endpoints for each model—script your entire agentic workflow in one place.
Transparent Pricing: We highlight input vs. cached input vs. output tokens, so you always know what you’ll pay. Tools are pay‑per‑call, letting you scale based on your usage.
Flexibility + Power: From quick Q&A with o4‑mini to in‑depth multimodal tasks with o3, or the perfect mid-ground with GPT‑4.1, you can always find the right model for the job.
Ready to Explore?
Get your entire app pipeline running on GPT‑4.1, o4‑mini, or o3 with tool integration—and watch how swiftly you can code, create, or solve complex problems.
Thank you for choosing Requesty, your ultimate alternative to other routing platforms. We can’t wait to see what you build with these new models!