Alerts are live on Requesty. Configure a webhook and get notified the moment a user, team, or organisation crosses a spend threshold — no application code, no polling. Four alert types ship today: per-user percentage of budget, per-user absolute spend, per-group percentage of budget, and org balance below a floor. Delivery is via Slack or generic JSON webhook, with automatic retries. Docs: requesty.ai/features/alerts.
This is the feature teams asked for most in 2025: observability that pings you instead of requiring you to log in and look.
What it does
You configure a threshold. When it's crossed, Requesty POSTs a JSON payload to the webhook URL you specified. That's it — no dashboards to check, no cron jobs to run, no custom metrics pipeline. The gateway already knows what every key is spending; alerts just turn that into a push notification.
| Alert type | Fires when | Use case |
|---|---|---|
| User % of Budget | A user reaches X% of their monthly limit | Warn a team member at 80%, page at 100% |
| User Absolute Spend | A user crosses $X, independent of budget | Catch runaway keys even without budgets set |
| Group % of Budget | A group's combined spend crosses X% of group budget | Team-level awareness before the CFO notices |
| Org Balance Below | Organisation credits drop under $X | Top-up trigger for prepaid accounts |
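On the receiving side, any HTTPS endpoint that accepts a JSON POST will do. Here's a minimal sketch of a receiver using Python's standard library; the port, path handling, and routing logic are illustrative assumptions, not part of Requesty's contract:

```python
# Minimal receiver for Requesty alert webhooks.
# Sketch only: the port and the handling logic below are illustrative.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class AlertHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        event = json.loads(self.rfile.read(length))

        # Route on the event type, e.g. "user.budget.exceeded_percent"
        if event.get("type", "").startswith("user.budget"):
            t = event["threshold"]
            print(f'{event["user"]["email"]} at {t["current_spend"]}/{t["budget"]} '
                  f'({t["value"]}% threshold crossed)')

        # Respond quickly: each delivery attempt times out after 15 seconds.
        self.send_response(200)
        self.end_headers()

HTTPServer(("", 8080), AlertHandler).serve_forever()
```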
Setup (90 seconds)
- Admin Panel → Alerts → Add Webhook. Pick `Slack` or `JSON`. Paste the URL. Save.
- Add Alert. Pick the type, enter the threshold, confirm.
- Done. The next threshold crossing fires a webhook.
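Before pointing an alert at a production endpoint, you can replay the sample payload from the next section by hand and confirm your endpoint returns 2xx. A quick sketch with `requests`; the URL is a placeholder for your own endpoint:

```python
# Replay a sample alert against your own endpoint to verify it responds 2xx.
import requests

sample = {
    "type": "user.budget.exceeded_percent",
    "user": {"email": "alex@growth-team.com", "id": "u_28419"},
    "group": "growth",
    "threshold": {"kind": "percent", "value": 80,
                  "budget": 2000.0, "current_spend": 1612.43},
    "fired_at": "2026-03-18T14:22:11Z",
}

# Replace with your actual webhook URL.
resp = requests.post("https://example.com/requesty-alerts", json=sample, timeout=15)
resp.raise_for_status()  # anything but 2xx means a real delivery would be retried
```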
Example payload
For a generic JSON webhook:
```json
{
  "type": "user.budget.exceeded_percent",
  "user": {
    "email": "alex@growth-team.com",
    "id": "u_28419"
  },
  "group": "growth",
  "threshold": {
    "kind": "percent",
    "value": 80,
    "budget": 2000.0,
    "current_spend": 1612.43
  },
  "fired_at": "2026-03-18T14:22:11Z"
}
```

If your webhook endpoint is down or slow, Requesty retries up to 3 times with exponential backoff and a 15-second timeout per attempt. Alerts don't silently vanish if your Slack goes down.
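For intuition, the delivery behaviour described above amounts to something like the loop below. This is a sketch of the semantics, not Requesty's actual sender, and the 2-second base delay is an illustrative assumption:

```python
# Illustration of the documented delivery semantics:
# up to 3 retries, exponential backoff, 15-second timeout per attempt.
# The 2-second base delay is an assumption for illustration.
import time
import requests

def deliver(url: str, payload: dict, retries: int = 3, base_delay: float = 2.0) -> bool:
    for attempt in range(retries + 1):  # initial attempt + 3 retries
        try:
            resp = requests.post(url, json=payload, timeout=15)
            if resp.ok:
                return True
        except requests.RequestException:
            pass  # connection error or timeout: fall through to retry
        if attempt < retries:
            time.sleep(base_delay * 2 ** attempt)  # 2s, 4s, 8s
    return False  # all attempts exhausted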
Why this pairs with labels
We shipped labels on API keys last month. Labels attribute spend to a team, feature, or customer; alerts tell you the moment any of those cross a line. Together, you get a closed loop:
- Label keys by `team` / `feature` / `env` / `tier`
- Set `monthly_limit` per key
- Configure a User % of Budget alert at 80%
- Get pinged in Slack the moment any labeled key is trending over budget
The team on the receiving end of that Slack message knows — from the label — exactly which feature is overspending, before their finance partner has to ask.
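In practice that closed loop is a small amount of glue: the payload carries the group, so a receiver can route straight to the owning team. A sketch, where the channel map and the Slack webhook URLs are hypothetical:

```python
# Route an alert to the owning team's Slack channel based on its group label.
# The channel map and the Slack webhook URLs here are hypothetical.
import requests

CHANNELS = {
    "growth": "https://hooks.slack.com/services/T000/B000/growth",
    "platform": "https://hooks.slack.com/services/T000/B000/platform",
}

def route_alert(event: dict) -> None:
    # Fall back to a default owner if the group isn't mapped.
    url = CHANNELS.get(event.get("group"), CHANNELS["platform"])
    t = event["threshold"]
    text = (f'{event["user"]["email"]} hit {t["value"]}% of budget: '
            f'${t["current_spend"]:.2f} of ${t["budget"]:.2f}')
    requests.post(url, json={"text": text}, timeout=15)
```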
What's next
This first release covers spending only. Latency and error-rate alerts are on the roadmap — next wave adds:
- `policy.fallback_escalation_rate` — alert when a fallback chain is escalating more than usual (a provider is struggling)
- `latency.p95_regression` — alert when p95 latency on a policy rises by more than X% week-on-week
- `request.error_rate` — alert when the 5xx rate on a specific model climbs above a threshold
If any of those would save you a Monday morning, mention it in the Discord — priority order reflects what users are asking for.
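None of those exist yet, but the p95 check is easy to approximate from your own request logs in the meantime. A sketch, with the 20% regression threshold as an illustrative assumption rather than a Requesty default:

```python
# Week-on-week p95 latency regression check, approximated from local logs.
# The 20% threshold is an illustrative assumption, not a Requesty default.
from statistics import quantiles

def p95(samples: list[float]) -> float:
    return quantiles(samples, n=100)[94]  # 95th percentile

def p95_regressed(this_week_ms: list[float], last_week_ms: list[float],
                  max_increase: float = 0.20) -> bool:
    return p95(this_week_ms) > p95(last_week_ms) * (1 + max_increase)
```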
TL;DR
- Alerts are live for four spend-based events
- Webhooks only (Slack or generic JSON), 3 retries with backoff
- Pairs with API key labels for per-team / per-feature / per-customer attribution
- Setup takes 90 seconds, no application code required
- Docs: requesty.ai/features/alerts
Frequently asked questions
- What is Requesty Alerts?
- Requesty Alerts is a webhook-based notification system that fires when a user, group, or organisation crosses a spend threshold. Four alert types are supported: per-user percentage of budget, per-user absolute dollar spend, per-group percentage of budget, and organisation balance below a threshold. Delivery is via Slack or generic JSON webhook.
- How do I set up an alert?
- Go to Admin Panel → Alerts, configure a webhook URL (Slack incoming webhook or any HTTPS JSON endpoint), click Add Alert, pick the alert type, enter the threshold, save. No application code change needed.
- What gets sent in the webhook payload?
- A JSON object with an event type (e.g. user.budget.exceeded_percent), the user email or group name, and the threshold that was crossed. Slack webhooks receive a pre-formatted message. Full payload schema is in the Alerts docs.
- Are failed webhook deliveries retried?
- Yes — up to 3 retries with exponential backoff. Each attempt has a 15-second timeout. If your webhook endpoint is down, the alert won't silently vanish.
- Can I alert on latency or errors, not just spend?
- Not yet. The first release covers spending only — the four thresholds above. Latency and error-rate alerts are on the roadmap.