Question 1

How many NVIDIA models are available through Requesty?

Accepted Answer

Requesty routes to 5 NVIDIA models, including regional variants, and keeps pricing in sync with the upstream provider in real time.

Question 2

What is the cheapest NVIDIA model?

Accepted Answer

NVIDIA has free tiers available: look for the models marked "Free" in the pricing columns. All five NVIDIA models in the table below are currently marked "Free" for both input and output.

Question 3

Does Requesty add markup on NVIDIA pricing?

Accepted Answer

No. Requesty passes through exactly what NVIDIA charges. You pay the same per-token rates as going direct, plus you get smart routing, caching, analytics, and one unified API for 600+ models.

Question 4

Is my data used to train NVIDIA models?

Accepted Answer

NVIDIA's default terms may include data use for training. Check their privacy policy and Requesty's enterprise options for opt-out controls.

Question 5

Where are NVIDIA models hosted?

Accepted Answer

NVIDIA models are hosted in the 🇺🇸 US. Some models are also available in additional regions through AWS Bedrock, Azure, or Google Vertex AI. To see which, filter by region on the NVIDIA rows in the models explorer.

Model	Context (tokens)	Max Output (tokens)	Input/1M tokens	Output/1M tokens	Capabilities	Coding
nemotron-3.5-content-safety	131K	8K	Free	Free	👁🧠	N/A
nemotron-3-nano-30b-a3b	262K	—	Free	Free	🧠🔧	14
nemotron-3-super-120b-a12b	1.0M	66K	Free	Free	🧠🔧	38
nemotron-3-ultra-550b-a55b	1.0M	66K	Free	Free	🧠🔧	49
nemotron-3-nano-omni-30b-a3b-reasoning	131K	20K	Free	Free	👁🧠🔧	N/A
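
Since pricing is quoted per 1M tokens, the cost of a request is input_tokens / 1,000,000 × input price plus output_tokens / 1,000,000 × output price. The Python sketch below applies that formula to the rows above and filters models by context window. It is an illustration only: the names `ModelRow`, `parse_size`, `parse_price`, `estimate_cost`, and `models_with_context` are hypothetical helpers, not part of any Requesty or NVIDIA API. It also assumes that "K" means 1,000 and "M" means 1,000,000 tokens (the exact limits may differ), and that a paid price would appear as a dollar amount such as "$0.50".

```python
from dataclasses import dataclass
from typing import List, Optional

# Rows copied from the table above:
# (model, context, max output, input price per 1M, output price per 1M)
ROWS = [
    ("nemotron-3.5-content-safety", "131K", "8K", "Free", "Free"),
    ("nemotron-3-nano-30b-a3b", "262K", "—", "Free", "Free"),
    ("nemotron-3-super-120b-a12b", "1.0M", "66K", "Free", "Free"),
    ("nemotron-3-ultra-550b-a55b", "1.0M", "66K", "Free", "Free"),
    ("nemotron-3-nano-omni-30b-a3b-reasoning", "131K", "20K", "Free", "Free"),
]


@dataclass(frozen=True)
class ModelRow:
    name: str
    context: Optional[int]      # tokens, None if not listed
    max_output: Optional[int]   # tokens, None if not listed
    input_per_1m: float         # price per 1M input tokens
    output_per_1m: float        # price per 1M output tokens


def parse_size(text: str) -> Optional[int]:
    """Convert '131K' or '1.0M' to a token count; a dash means 'not listed'.

    Assumption: K = 1,000 and M = 1,000,000 (an approximation).
    """
    text = text.strip()
    if text in ("—", "-", ""):
        return None
    multipliers = {"K": 1_000, "M": 1_000_000}
    suffix = text[-1].upper()
    if suffix in multipliers:
        return round(float(text[:-1]) * multipliers[suffix])
    return int(text)


def parse_price(text: str) -> float:
    """Convert 'Free' to 0.0; assumes paid prices look like '$0.50'."""
    text = text.strip()
    if text.lower() == "free":
        return 0.0
    return float(text.lstrip("$"))


MODELS = {
    name: ModelRow(name, parse_size(ctx), parse_size(out),
                   parse_price(p_in), parse_price(p_out))
    for name, ctx, out, p_in, p_out in ROWS
}


def estimate_cost(model: ModelRow, input_tokens: int, output_tokens: int) -> float:
    """Per-token pricing: tokens / 1M times the listed rate, summed for both directions."""
    return ((input_tokens / 1_000_000) * model.input_per_1m
            + (output_tokens / 1_000_000) * model.output_per_1m)


def models_with_context(min_tokens: int) -> List[str]:
    """Names of models whose listed context window is at least min_tokens."""
    return [m.name for m in MODELS.values()
            if m.context is not None and m.context >= min_tokens]
```

Because every listed price is "Free", `estimate_cost` returns 0.0 for any of the five models; the formula only starts to matter if a priced row is added. `models_with_context(500_000)` narrows the list to the two models with a 1.0M context window.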
