Requesty

Best AI models for reasoning

GPQA Diamond is a set of graduate-level science questions written by domain experts and filtered so that PhD students with internet access still struggle. It's the most reliable signal we have for "does this model actually reason" vs "is it pattern-matching training data".

  1. 🥇
    grok-4
    xAI Corp.·$3.00 / $15.00 per 1M
    87.5%
  2. 🥈
    OpenAI Inc. logo
    gpt-5.4
    OpenAI Inc.·$2.50 / $15.00 per 1M
    86.5%
  3. 🥉
    OpenAI Inc. logo
    gpt-5.2
    OpenAI Inc.·$1.75 / $14.00 per 1M
    84.8%
  4. 4
    Google LLC (Gemini API) logo
    gemini-2.5-pro
    Google LLC (Gemini API)·$1.25 / $10.00 per 1M
    84.0%
  5. 5
    Anthropic PBC logo
    claude-opus-4-7
    Anthropic PBC·$5.00 / $25.00 per 1M
    83.4%
  6. 6
    OpenAI Inc. logo
    o3
    OpenAI Inc.·$2.00 / $8.00 per 1M
    83.3%
  7. 7
    OpenAI Inc. logo
    gpt-5.1
    OpenAI Inc.·$1.25 / $10.00 per 1M
    83.2%
  8. 8
    OpenAI Responses logo
    gpt-5.2-codex
    OpenAI Responses·$1.75 / $14.00 per 1M
    82.1%
  9. 9
    OpenAI Inc. logo
    gpt-5-chat
    OpenAI Inc.·$1.25 / $10.00 per 1M
    81.7%
  10. 10
    Anthropic PBC logo
    claude-opus-4-6
    Anthropic PBC·$5.00 / $25.00 per 1M
    81.2%
  11. 11
    Anthropic PBC logo
    claude-opus-4-5
    Anthropic PBC·$5.00 / $25.00 per 1M
    79.6%
  12. 12
    OpenAI Inc. logo
    o1
    OpenAI Inc.·$15.00 / $60.00 per 1M
    78.0%
  13. 13
    Anthropic PBC logo
    claude-sonnet-4-6
    Anthropic PBC·$3.00 / $15.00 per 1M
    76.8%
  14. 14
    grok-3
    xAI Corp.·$5.00 / $25.00 per 1M
    75.4%
  15. 15
    OpenAI Inc. logo
    o3-mini
    OpenAI Inc.·$1.10 / $4.40 per 1M
    74.8%
  16. 16
    Anthropic PBC logo
    claude-sonnet-4-5
    Anthropic PBC·$3.00 / $15.00 per 1M
    74.2%
  17. 17
    OpenAI Inc. logo
    gpt-4.1
    OpenAI Inc.·$2.00 / $8.00 per 1M
    74.1%
  18. 18
    Together AI Inc. logo
    deepseek-ai/DeepSeek-R1
    Together AI Inc.·$3.00 / $7.00 per 1M
    71.5%
  19. 19
    Anthropic PBC logo
    claude-sonnet-4
    Anthropic PBC·$3.00 / $15.00 per 1M
    70.1%
  20. 20
    Google LLC (Vertex AI) logo
    kimi-k2
    Google LLC (Vertex AI)·$0.60 / $2.50 per 1M
    70.0%
  21. 21
    Google LLC (Gemini API) logo
    gemini-2.5-flash
    Google LLC (Gemini API)·$0.30 / $2.50 per 1M
    68.3%
  22. 22
    Google LLC (Vertex AI) logo
    claude-3-7-sonnet@us-east5
    Google LLC (Vertex AI)·$3.00 / $15.00 per 1M
    65.2%
  23. 23
    OpenAI Inc. logo
    gpt-4.1-mini
    OpenAI Inc.·$0.40 / $1.60 per 1M
    64.8%
  24. 24
    MiniMax-M2
    MiniMax·$0.30 / $1.20 per 1M
    62.5%
  25. 25
    Anthropic PBC logo
    claude-haiku-4-5
    Anthropic PBC·$1.00 / $5.00 per 1M
    62.4%
  26. 26
    Together AI Inc. logo
    deepseek-ai/DeepSeek-V3
    Together AI Inc.·$1.25 / $1.25 per 1M
    59.1%
  27. 27
    OpenAI Inc. logo
    gpt-4o
    OpenAI Inc.·$2.50 / $10.00 per 1M
    53.6%
  28. 28
    Novita AI logo
    meta-llama/llama-3.3-70b-instruct
    Novita AI·$0.39 / $0.39 per 1M
    50.5%
  29. 29
    OpenAI Inc. logo
    gpt-4.1-nano
    OpenAI Inc.·$0.10 / $0.40 per 1M
    47.3%

How we rank

Scores for GPQA Diamond are sourced from official model cards, Artificial Analysis, and public leaderboards. When a model is available through multiple providers (e.g. Anthropic direct, AWS Bedrock, Google Vertex), we show one canonical entry per model family so the ranking isn't polluted by duplicates. Benchmarks measure specific skills — always validate on your own workload before committing.

One API for every model on this list

Requesty is OpenAI-compatible and routes to 400+ models. Switch between any of the models above by changing one parameter in your code.

Get started free