Requesty
Case Study|Enterprise AI Gateway

How Mozart AI Switches Between 77 Models and 17 Providers with Zero Code Changes

Mozart AI · AI Music Creation & Production Software

77
Models from 17 providers
99.95%+
Success rate, every month
40%
Token costs cut by caching
12x
Volume growth in under a year

Every model, every provider, one endpoint. Requesty lets us move as fast as the model landscape does.

Sundar Arvind
Sundar Arvind
CEO, Mozart AI

About Mozart AI

Mozart AI is the AI music generator built for artists. Its flagship experience, Vibe Sessions, puts an AI co-producer in the browser: describe the genre, mood, tempo, and style you want, and the agent composes, arranges, and refines the track through natural conversation. When artists want hands-on control, they switch to Studio, a full online DAW with piano roll, drum patterns, and mixing, with commercial rights included on everything they create.

Behind the conversational surface is a heavyweight agentic backend. Nearly all of Mozart's LLM traffic is non-streaming backend work, and 40% of requests demand structured JSON output: the agent plans arrangements, edits compositions, and calls tools programmatically.

The Challenge

The model landscape moves fast, and so does Mozart. The best model for the co-producer changes every few months, different tasks suit different models, and every new frontier release is a chance to make the product better. Building on provider APIs directly would have made that agility impossible:

  • Every model switch is a migration. New provider SDKs, new auth, new error handling, regression testing, redeployment. Trying a new model should take minutes, not a sprint.
  • Different tasks, different models. Composition, lightweight classification, vision analysis, and reasoning each have a different best model, often from a different provider.
  • Single-provider fragility. Roughly 4-5% of first-attempt requests fail in any given month. Without fallback, every one of those is a broken session for an artist mid-song.
  • Token economics at scale. Agentic music production is context-heavy. With billions of input tokens per month, paying full price for repeated context would crush margins.
  • Scaling blind. Volume grew 12x in under a year. The team needed per-model, per-provider cost visibility to scale without surprises.

The best model changes every couple of months. We wanted switching to be a config change, not an engineering project, without ever giving up reliability.

Sundar Arvind
Sundar Arvind
CEO, Mozart AI

Why Requesty

One API, every model

Mozart has used 77 distinct models from 17 providers, Anthropic, OpenAI, Google, xAI, and more, through one integration. When a new frontier model ships, the team is testing it in production the same week, with zero code changes.

The right model for every task

Heavyweight composition runs on frontier models, lightweight tasks route to fast, near-free models, and vision and reasoning workloads each get the best tool for the job. In April 2026 alone, Mozart used 32 distinct models across 7 providers.

Routing policies with multi-provider failover

Mozart's primary policy serves its workhorse models across three independent provider infrastructures. If a first attempt fails, Requesty retries on the next provider automatically. Reliability comes from the policy, not from any single vendor.

Automatic prompt caching

Requesty's caching layer maintains a 66% average cache hit rate on Mozart's context-heavy agentic workloads, cutting effective token costs by 40% with zero engineering effort.

Cost tracking per model and provider

As volume scaled 12x, the team compared the real economics of every model and provider side by side, keeping unit costs predictable through hypergrowth.

We point everything at one endpoint. Swapping models is a one-line change, failover is automatic, and we can put every new release head to head against our current stack in production.

Sundar Arvind
Sundar Arvind
CEO, Mozart AI

Reliability, Measured

Model flexibility means nothing if requests fail. The chart below compares first-attempt success on a single provider against eventual success after Requesty's policy fallback, across all Mozart production traffic.

Reliability: Policy Fallback vs Single Provider

Request-level eventual success, Mozart AI production traffic, Jan-May 2026

With Requesty policySingle provider (no fallback)
Policy vs single-provider success rate, Mozart AI Jan-May 202693%94%95%96%97%98%99%100%Jan 2026Feb 2026Mar 2026Apr 2026May 202699.96%99.97%99.95%99.98%100.00%95.48%95.82%95.03%96.10%95.55%

Without policy fallback, 4-5% of first-attempt requests fail monthly. With Requesty, Mozart held 99.95%+ every month, hitting 100.00% in May 2026.

In Their Words

Every model, every provider, one endpoint. Requesty lets us move as fast as the model landscape does.

Sundar Arvind
Sundar Arvind
CEO, Mozart AI