Requesty
MAY '26 | SECURITY / REQUESTY FEATURES | 9 MIN READ

Give every team exactly the models they need (and nothing more)

Thibault Jaigu
CEO & Co-Founder

Requesty's model governance stack solves the problem every platform team hits once they pass five engineers using LLMs: who gets access to what, and how do you enforce it without becoming a bottleneck? The answer is three layers that compose. Approved Models sets the org wide ceiling. Access Lists carve it into team shaped slices. Expiring API keys ensure credentials rotate without manual intervention. This post walks through how to wire them up together.

Full feature docs: Approved Models and Access Lists.

The problem you hit at scale

Small teams share one API key and nobody cares which model gets called. Then reality arrives:

  1. The intern calls GPT-5 in a loop and burns $4,000 overnight on a summarization experiment that could have run on a model priced at $0.10 per million tokens.
  2. A compliance requirement lands saying customer data can only touch EU resident models, but half your keys route everywhere.
  3. Production breaks because someone switched the model field in a config file to a reasoning model that takes 40 seconds per request.
  4. An old key leaks in a public repo and nobody knows which team owned it or what it could access.

Each of these is a governance failure. The traditional answer is "just be careful" or "put it in a wiki." The actual answer is policy as configuration at the gateway layer.

Layer 1: Approved Models (the org wide whitelist)

This is the broadest control. From the Admin Panel you define which models exist for your organization. Everything not on this list is invisible to your members and inaccessible to your keys.

| What it controls | How it works |
| --- | --- |
| Model visibility in dashboard | Non approved models hidden from Model Library and Chat |
| GET /v1/models response | Only returns approved models for the calling key |
| Routing policy targets | Fallback chains and load balancers only pick from approved set |
| New model releases | Not approved by default, requires explicit admin action |
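
As a rough illustration of the GET /v1/models row, the sketch below parses a models response and checks which model IDs are visible to the calling key. It assumes the response follows the OpenAI-style shape (a `data` array of objects with an `id` field). The sample body is made up, and the HTTP call itself is left out so the snippet runs offline.

```python
import json


def visible_model_ids(response_body: str) -> set:
    """Return the model IDs listed in an OpenAI-style /v1/models body (assumed shape)."""
    payload = json.loads(response_body)
    return {entry["id"] for entry in payload.get("data", [])}


# Made-up response body for a key whose org approved only two models.
sample_body = json.dumps({
    "object": "list",
    "data": [
        {"id": "openai/gpt-4.1-mini", "object": "model"},
        {"id": "anthropic/claude-haiku-4-5", "object": "model"},
    ],
})

visible = visible_model_ids(sample_body)
print(sorted(visible))
print("openai/o3" in visible)  # a model outside the approved set is simply absent
```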

Quick start presets

If you are just getting started, Requesty ships one click presets:

| Preset | What it approves |
| --- | --- |
| EU Only | Models hosted in EU data centers exclusively |
| EU + ZDR | EU models plus those with Zero Data Retention policies |
| US Only | US hosted models |
| ZDR Only | Any model with Zero Data Retention regardless of region |

Pick a preset, then fine tune. Remove what you do not need, add specific models from other regions if your compliance posture allows it.
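
To make "pick a preset, then fine tune" concrete, here is a minimal sketch of presets as filters over catalog metadata. Everything in it is illustrative: the `region` and `zdr` fields, the `example/...` model IDs, and the reading of "EU + ZDR" as a union are assumptions, not Requesty's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogModel:
    model_id: str
    region: str  # hypothetical metadata field: "EU" or "US"
    zdr: bool    # hypothetical metadata field: Zero Data Retention


# Preset predicates, following the wording of the preset table.
PRESETS = {
    "EU Only": lambda m: m.region == "EU",
    "EU + ZDR": lambda m: m.region == "EU" or m.zdr,  # read here as a union (assumption)
    "US Only": lambda m: m.region == "US",
    "ZDR Only": lambda m: m.zdr,
}


def apply_preset(name, catalog, remove=frozenset(), add=frozenset()):
    """Start from a preset, then fine tune by removing and adding model IDs."""
    base = {m.model_id for m in catalog if PRESETS[name](m)}
    return (base - set(remove)) | set(add)


catalog = [
    CatalogModel("example/eu-small", "EU", False),
    CatalogModel("example/eu-zdr", "EU", True),
    CatalogModel("example/us-zdr", "US", True),
    CatalogModel("example/us-large", "US", False),
]

approved = apply_preset(
    "EU Only", catalog, remove={"example/eu-small"}, add={"example/us-zdr"}
)
print(sorted(approved))
```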

The design choice: deny by default

New models that appear in the Requesty catalog are not automatically approved. This is the opposite of how most teams operate (where everything is allowed until someone complains). Deny by default means your security posture never degrades silently. You review new models on your schedule and approve them deliberately.
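
Deny by default reduces to a set membership check. The sketch below is illustrative only: the function names and the `example/new-model-v1` ID are hypothetical, and it is not how Requesty implements the check internally.

```python
def pending_review(catalog: set, approved: set) -> set:
    """Catalog models that no admin has approved yet."""
    return catalog - approved


def is_usable(model_id: str, approved: set) -> bool:
    """Deny by default: only explicitly approved models pass."""
    return model_id in approved


approved = {"openai/gpt-4.1-mini", "anthropic/claude-haiku-4-5"}

# A new release lands in the catalog; the approved set does not change.
catalog = approved | {"example/new-model-v1"}

print(pending_review(catalog, approved))
print(is_usable("example/new-model-v1", approved))
```

The new model shows up in the review queue but stays unusable until an admin adds it to the approved set.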

This pattern mirrors how Cloudflare's AI Gateway handles rate limiting at the infrastructure layer, and how LiteLLM's proxy enforces model scoping per virtual key. The difference is Requesty combines all three controls (model whitelist, team scoping, and key expiration) in a single managed surface rather than requiring you to wire up separate systems.

Layer 2: Access Lists (team shaped subsets)

Approved Models answers "what can the org use?" Access Lists answer "what can this specific team or workload use?"

An access list is a named, reusable collection of model IDs. You create it once, then attach it to groups (for team wide policies) or directly to API keys (for workload specific restrictions).

How the hierarchy resolves

When a request arrives, Requesty checks these layers in order. The first non empty layer wins:

API key's own access list
        ↓ (if none)
Union of access lists from the key's groups
        ↓ (if none)
Organization Approved Models
        ↓ (if nothing configured)
Full catalog (everything allowed)

This means you can set broad permissions at the org level and progressively tighten for specific teams or production workloads without touching the org config.
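
The resolution order above can be sketched as a single function. This is a minimal model of the rule as described in this post (first non empty layer wins, group lists are unioned); the function name and signature are hypothetical.

```python
from typing import List, Optional


def resolve_allowed(
    key_list: Optional[set],
    group_lists: List[set],
    org_approved: Optional[set],
    catalog: set,
) -> set:
    """Return the models a key may call: the first non empty layer wins."""
    if key_list:
        return set(key_list)
    group_union = set().union(*group_lists)  # empty when the key has no group lists
    if group_union:
        return group_union
    if org_approved:
        return set(org_approved)
    return set(catalog)  # nothing configured: full catalog


catalog = {"openai/gpt-4.1", "openai/gpt-4.1-mini", "anthropic/claude-haiku-4-5", "openai/o3"}
org = {"openai/gpt-4.1", "openai/gpt-4.1-mini", "anthropic/claude-haiku-4-5"}
frontend = {"openai/gpt-4.1-mini"}
support = {"anthropic/claude-haiku-4-5"}

print(resolve_allowed({"openai/gpt-4.1"}, [frontend], org, catalog))  # key list wins
print(resolve_allowed(None, [frontend, support], org, catalog))       # union of group lists
print(resolve_allowed(None, [], org, catalog))                        # org Approved Models
```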

Real world example: three teams, one org

Imagine you have approved 40 models at the org level. Here is how you carve that into team policies:

| Team | Access list name | Models included | Attached to |
| --- | --- | --- | --- |
| Frontend (chat features) | Chat Production | openai/gpt-4.1-mini, anthropic/claude-haiku-4-5 | Group: Frontend Eng |
| Data Science (experimentation) | Research Wide | 30 models across all providers | Group: Data Science |
| Production Agents | Agent Strict | anthropic/claude-sonnet-4-5, openai/gpt-4.1 | Directly on 3 API keys |
| Customer Support Bot | Support EU | vertex/gemini-2.5-flash, anthropic/claude-haiku-4-5 | Group: Support Ops |

The data science team can experiment freely across 30 models. The production agent keys can only ever call two. The support bot is locked to EU resident models. All from one admin panel, no application code changes.

Creating and attaching (step by step)

Create the list:

  1. Admin Panel → Access Lists → Create Access List
  2. Name it clearly (e.g. "Production Agents Q2 2026")
  3. Search and select models by provider or name
  4. Save

Attach to a group:

  1. Admin Panel → Groups → expand target group
  2. Access List section → Manage → select from dropdown
  3. Save. Every key in that group immediately inherits the restriction.

Attach to a specific key:

  1. Admin Panel → API Keys → select the key (or select multiple for bulk)
  2. Action bar → Attach Access List → pick from dropdown
  3. Done. This overrides the group list for that key only.

What developers see

When a developer calls GET /v1/models with their key, they only see models their key is allowed to use. Tools like Claude Code, Cursor, GitHub Copilot, and Open WebUI all call this endpoint to populate their model dropdowns. The restriction is invisible in the best way: developers never see models they cannot use, so there is no confusion, no failed requests, no Slack threads asking "why did my request 403?"

If someone does manage to hardcode a model ID that is not on their list, the request fails with a "provider violates policy" error and is never forwarded to the upstream provider. Zero data leaves your gateway.
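
The enforcement point can be pictured as a check that runs before any upstream call. In the sketch below, only the error text comes from this post; the response shape, the `handle_request` function, and the fake upstream are hypothetical stand-ins.

```python
def handle_request(model_id, allowed, forward):
    """Reject disallowed models before anything is sent upstream."""
    if model_id not in allowed:
        return {"status": "rejected", "error": "provider violates policy"}
    return {"status": "ok", "response": forward(model_id)}


forwarded = []  # records every call that actually reached the (fake) provider


def fake_upstream(model_id):
    forwarded.append(model_id)
    return f"completion from {model_id}"


allowed = {"anthropic/claude-sonnet-4-5", "openai/gpt-4.1"}

blocked = handle_request("openai/o3", allowed, fake_upstream)
served = handle_request("openai/gpt-4.1", allowed, fake_upstream)

print(blocked["error"])
print(forwarded)  # the blocked request never appears here
```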

Layer 3: Expiring API keys (enforced rotation)

The third piece of the governance stack is temporal. Even with perfect model access controls, a key that lives forever is a key that eventually leaks. Requesty supports key expiration: you set a date, the key auto revokes at that time, and your security log records the event.
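
A minimal sketch of that lifecycle, assuming a periodic sweep over key records: the `ApiKey` fields, the sweep function, and the log message format are all hypothetical, and the real gateway may well check expiry at request time instead.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ApiKey:
    name: str
    expires_at: Optional[datetime]  # None means no expiration was set
    revoked: bool = False


def revoke_expired(keys, now, security_log):
    """Revoke every key whose expiration time has passed and log the event."""
    for key in keys:
        if not key.revoked and key.expires_at is not None and now >= key.expires_at:
            key.revoked = True
            security_log.append(f"{now.isoformat()} key '{key.name}' expired and was revoked")


keys = [
    ApiKey("hackathon-poc", datetime(2026, 5, 4, 9, 0, tzinfo=timezone.utc)),
    ApiKey("prod-agent", datetime(2026, 8, 1, tzinfo=timezone.utc)),
]
log = []
revoke_expired(keys, datetime(2026, 5, 4, 9, 0, tzinfo=timezone.utc), log)

print([k.name for k in keys if k.revoked])
print(log)
```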

Why expiration matters for team governance

| Scenario | Without expiration | With expiration |
| --- | --- | --- |
| Contractor engagement ends | Key lives on, forgotten | Key dies on contract end date |
| Quarterly security rotation | Someone files a ticket, maybe | Automatic, zero human action |
| Hackathon or POC | Temporary key becomes permanent | Key expires Monday morning |
| Incident response | Revoke all keys manually | Expired keys already dead |

This pattern is standard in mature credential management. AWS IAM temporary credentials, GitHub fine grained tokens with expiry, and GCP service account key rotation all enforce the same principle: credentials should have a natural death. Requesty brings that same discipline to LLM access keys.

Pairing expiration with access lists

The combination is powerful. A contractor gets a key that:

  1. Expires in 90 days (matches their contract)
  2. Has an access list limiting them to 3 models (matches their workload)
  3. Has a monthly spend limit of $500 (matches their budget)

Three constraints, one key, zero ongoing admin work. When the contract ends the key dies on its own. No offboarding ticket, no forgotten revocation.
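
The three constraints can be modeled as independent checks on one key record. This is a sketch under stated assumptions: the `ScopedKey` type, the check order, the example dates, and the "key expired" and "monthly spend limit reached" messages are hypothetical. Only "provider violates policy" is taken from this post.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class ScopedKey:
    expires_at: datetime
    allowed_models: frozenset
    monthly_spend_limit_usd: float


def rejection_reason(key, model_id, spent_this_month_usd, now) -> Optional[str]:
    """Return why a request would be refused, or None if it may proceed."""
    if now >= key.expires_at:
        return "key expired"
    if model_id not in key.allowed_models:
        return "provider violates policy"
    if spent_this_month_usd >= key.monthly_spend_limit_usd:
        return "monthly spend limit reached"
    return None


issued = datetime(2026, 5, 1, tzinfo=timezone.utc)
contractor_key = ScopedKey(
    expires_at=issued + timedelta(days=90),  # matches the contract length
    allowed_models=frozenset({
        "openai/gpt-4.1-mini",
        "anthropic/claude-haiku-4-5",
        "vertex/gemini-2.5-flash",
    }),
    monthly_spend_limit_usd=500.0,
)

day_10 = issued + timedelta(days=10)
print(rejection_reason(contractor_key, "openai/gpt-4.1-mini", 120.0, day_10))
print(rejection_reason(contractor_key, "openai/o3", 120.0, day_10))
print(rejection_reason(contractor_key, "openai/gpt-4.1-mini", 120.0, issued + timedelta(days=91)))
```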

Putting it all together: a governance playbook

Here is the sequence for a platform engineer setting this up from scratch:

Step 1: Set your org wide ceiling

Go to Approved Models. Start with a preset if compliance dictates a region, otherwise approve the models your teams have asked for. Be generous here. This is the ceiling, not the assignment.

Step 2: Create access lists per policy boundary

Think in terms of policies, not teams. A team might need different policies for different workloads:

| Policy | Models | Rationale |
| --- | --- | --- |
| Cheap and fast | gpt-4.1-mini, claude-haiku-4-5, gemini-2.5-flash | High volume, low cost workloads |
| Frontier reasoning | o3, claude-sonnet-4-5 | Complex tasks that justify the cost |
| EU compliant | vertex/gemini, any ZDR model | Customer data workloads under GDPR |
| Production locked | Exactly 2 models | Stable, tested, no surprises |

Step 3: Map policies to groups and keys

Attach the broad lists to groups. Attach the strict lists directly to production keys. The hierarchy handles the rest.
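
Since access lists are meant to be subsets of the org whitelist, a planning check like the one below can catch entries that were never approved before you attach a list. It is an offline sketch only: the function name is hypothetical, the provider-prefixed model IDs are illustrative, and `openai/o3` is deliberately left out of the approved set to show a finding.

```python
def unapproved_entries(access_lists: dict, org_approved: set) -> dict:
    """Map each access list name to the models it names that the org has not approved."""
    problems = {}
    for name, models in access_lists.items():
        extra = set(models) - org_approved
        if extra:
            problems[name] = extra
    return problems


org_approved = {
    "openai/gpt-4.1-mini",
    "anthropic/claude-haiku-4-5",
    "vertex/gemini-2.5-flash",
    "anthropic/claude-sonnet-4-5",
}

access_lists = {
    "Cheap and fast": {
        "openai/gpt-4.1-mini",
        "anthropic/claude-haiku-4-5",
        "vertex/gemini-2.5-flash",
    },
    "Frontier reasoning": {"openai/o3", "anthropic/claude-sonnet-4-5"},
}

problems = unapproved_entries(access_lists, org_approved)
print(problems)
```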

Step 4: Set key expiration

For every key that serves a temporary purpose (contractor, POC, hackathon, staging environment), set an expiration date at creation time. For production keys, set quarterly expiration and rotate on schedule.

Step 5: Review monthly

New models ship constantly. Review the Approved Models list when a new release drops. Check Usage Analytics to find models that are approved but unused (remove them to reduce surface area). Audit which keys have expired and confirm replacements are in place.
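
The monthly review comes down to two small reports. The sketch below assumes you have exported request counts per model and a list of key records; the data, field names, and export format are made up for illustration.

```python
from datetime import datetime, timezone


def approved_but_unused(approved: set, request_counts: dict) -> list:
    """Approved models with no recorded requests: candidates for removal."""
    return sorted(m for m in approved if request_counts.get(m, 0) == 0)


def expired_without_replacement(keys: dict, now: datetime) -> list:
    """Expired keys that have no replacement key recorded."""
    return sorted(
        name
        for name, info in keys.items()
        if info["expires_at"] <= now and not info["replaced_by"]
    )


approved = {"openai/gpt-4.1-mini", "anthropic/claude-haiku-4-5", "openai/gpt-4.1"}
request_counts = {"openai/gpt-4.1-mini": 18250, "anthropic/claude-haiku-4-5": 0}

now = datetime(2026, 6, 1, tzinfo=timezone.utc)
keys = {
    "prod-agent-q1": {"expires_at": datetime(2026, 4, 1, tzinfo=timezone.utc), "replaced_by": "prod-agent-q2"},
    "staging-old": {"expires_at": datetime(2026, 5, 15, tzinfo=timezone.utc), "replaced_by": None},
    "prod-agent-q2": {"expires_at": datetime(2026, 7, 1, tzinfo=timezone.utc), "replaced_by": None},
}

print(approved_but_unused(approved, request_counts))
print(expired_without_replacement(keys, now))
```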

What this replaces

Without a gateway governance layer, teams cobble together:

| Approach | Problem |
| --- | --- |
| One shared API key for everyone | No attribution, no access control, catastrophic blast radius on leak |
| Per provider key management | N providers × M teams = NM keys to track, no unified policy |
| Application code checks | Scattered, inconsistent, bypassable, not auditable |
| Honor system wiki pages | Nobody reads wikis under deadline pressure |
| Manual Slack approvals | Bottleneck on one person, no audit trail |

Requesty replaces all five with configuration. The admin panel is the single source of truth. The API enforces it. The audit log proves it.

Cross referencing: how other platforms handle this

The pattern of inserting a governance layer between teams and model providers is becoming industry standard:

LiteLLM Proxy takes the open source, self hosted approach. Virtual keys scoped to teams with model routing aliases give platform engineers full control, but require you to run and maintain the proxy yourself.

Cloudflare AI Gateway focuses on the infrastructure angle (rate limiting, caching, observability) but leaves fine grained team RBAC to other tools.

Portkey offers a commercial gateway with organization level credential sharing and model allow lists per workspace.

Requesty's differentiator is combining the model whitelist, the named access list hierarchy, key expiration, and the guardrails layer into one managed surface. You do not need to compose three tools to get governance. And because Requesty is the router itself (not a proxy in front of another proxy), the access check happens as part of the routing decision rather than as an extra network hop.

TL;DR

  1. Approved Models is your org wide whitelist. New models are denied by default. Start broad.
  2. Access Lists are named subsets you attach to groups or keys. They narrow the whitelist per team or per workload without touching org config.
  3. Resolution order: key list beats group union beats org approved beats full catalog.
  4. Expiring keys enforce rotation automatically. Set an expiry at creation, the key auto revokes on schedule.
  5. Developers see nothing they cannot use. GET /v1/models respects the resolved list, so tool dropdowns are always correct.
  6. Zero application code. All governance lives in the admin panel and applies at the gateway.
  7. Docs: Approved Models | Access Lists

Frequently asked questions

What is the difference between Approved Models and Access Lists?
Approved Models is the organization wide whitelist. It defines the broadest set of models anyone in your org can use. Access Lists are named subsets of that whitelist that you attach to specific groups or API keys to narrow access further. Think of Approved Models as the ceiling and Access Lists as the room dividers.

Can one API key belong to multiple groups with different access lists?
Yes. When a key belongs to multiple groups that each have an access list, Requesty takes the union of all group lists. The key can use any model that appears in at least one of its groups' lists. If you need to restrict further, attach a list directly to the key itself, which overrides the group union entirely.

What happens when I remove a model from an access list that a team is actively using?
The change takes effect immediately. Any subsequent request targeting that model from a key governed by that list will receive a "provider violates policy" error. The request is never forwarded upstream. Check Usage Analytics before removing to confirm the model is not in active use.

Do expiring API keys delete themselves or just stop working?
They stop working. The key is auto revoked at the expiration time, meaning requests return an authentication error, but the key record remains visible in the Admin Panel for audit. You can see expired keys in the security log and confirm the rotation happened.

How does this compare to managing access at the provider level directly?
Managing per provider means maintaining separate key sets, separate policies, and separate audit trails for every provider you use. Requesty collapses that into one governance layer regardless of whether you route to OpenAI, Anthropic, Google, or all three. One access list can contain models from five providers and still be managed as a single policy.