Groq Inc.

Ultra-fast AI inference with specialized hardware.

📍 🇺🇸 US • 5 models available
Available Models: 5
Avg Input Price: $0.51/M tokens
Cheapest Model: groq/openai/gpt-oss-20b ($0.10)
Most Expensive: groq/moonshotai/Kimi-K2-Instruct-0905 ($1.00)

Features Overview

Vision Support: 0
Advanced Reasoning: 0
Caching Support: 1
Computer Use: 0

Privacy & Data Policy

Data Retention

No data retention

Location

🇺🇸 US

All Groq Inc. Models

qwen-qwq-32b

Context Window: 131K tokens
Max Output: 131K tokens
Input: $0.29/M tokens
Output: $0.39/M tokens

Qwen/QwQ-32B is a breakthrough 32-billion-parameter reasoning model that delivers performance comparable to state-of-the-art (SOTA) models 20x larger, such as DeepSeek-R1 (671B parameters), on complex reasoning and coding tasks. Deployed on Groq's hardware, it is billed as the world's fastest and most cost-efficient reasoning option, producing reasoning chains and results in seconds. It supports native tool use, and its 128K context window (listed above as 131K tokens) lets it process long inputs while keeping the full context available.
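As a rough illustration of how per-million-token prices translate into a per-request cost, the sketch below applies the qwen-qwq-32b rates listed above ($0.29/M input, $0.39/M output). The function name `request_cost` and the token counts are hypothetical, chosen only for the example, and the calculation assumes billing is linear in token count with no caching discounts or minimum charges.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)

def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Estimate the cost in USD of one request.

    Prices are given per million tokens. Assumes simple linear billing;
    this is an illustrative helper, not part of any provider SDK.
    """
    input_cost = Decimal(input_tokens) * Decimal(input_price_per_m) / MILLION
    output_cost = Decimal(output_tokens) * Decimal(output_price_per_m) / MILLION
    return input_cost + output_cost

# Hypothetical request: 100,000 input tokens and 10,000 output tokens
# at the qwen-qwq-32b rates from the listing.
cost = request_cost(100_000, 10_000, "0.29", "0.39")
print(f"Estimated cost: ${cost:.4f}")
```

Prices are passed as strings so that `Decimal` keeps them exact rather than inheriting binary floating-point rounding.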

Context Window: 131K tokens
Max Output: 33K tokens
Input: $0.10/M tokens
Output: $0.50/M tokens

Context Window: 131K tokens
Max Output: 33K tokens
Input: $0.15/M tokens
Output: $0.75/M tokens

Caching
Context Window: 256K tokens
Max Output: 16K tokens
Input: $1.00/M tokens
Output: $3.00/M tokens

Moonshot AI’s cutting‑edge model, moonshotai/Kimi-K2-Instruct-0905, is now live on GroqCloud.

Context Window: 131K tokens
Max Output: 16K tokens
Input: $1.00/M tokens
Output: $3.00/M tokens
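
The $0.51 average input price quoted at the top of the page is consistent with a simple unweighted mean of the five input prices listed above. That is an assumption about how the figure is derived, not something the page states. A minimal check, with the hypothetical helper name `mean_price`:

```python
from decimal import Decimal, ROUND_HALF_UP

# Input prices in USD per million tokens, in the order the models are listed.
input_prices = [
    Decimal("0.29"),
    Decimal("0.10"),
    Decimal("0.15"),
    Decimal("1.00"),
    Decimal("1.00"),
]

def mean_price(prices):
    """Unweighted mean of the prices, rounded to the nearest cent."""
    total = sum(prices, Decimal(0))
    return (total / len(prices)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

average = mean_price(input_prices)
print(f"Average input price: ${average}/M tokens")
```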

Ready to use Groq Inc. models?

Access all Groq Inc. models through Requesty's unified API with intelligent routing, caching, and cost optimization.
