Requesty

Best AI models for tool use and agents

τ²-Bench measures multi-turn agentic tool use: calling functions, following policies, and completing realistic tasks over many turns. If you are building agents or tool-calling workflows, this predicts real-world reliability better than single-shot benchmarks.

  1. 🥇
    Z AI logo
    GLM-5
    Z AI·$1.00 / $3.20 per 1M
    98.2%
  2. 🥈
    Z AI logo
    GLM-5.1
    Z AI·$1.40 / $4.40 per 1M
    97.7%
  3. 🥉
    Alibaba Cloud logo
    qwen3.6-plus
    Alibaba Cloud·$0.50 / $3.00 per 1M
    97.7%
  4. 4
    grok-4.3
    xAI Corp.·$1.25 / $2.50 per 1M
    97.7%
  5. 5
    DeepSeek logo
    deepseek-v4-pro
    DeepSeek·$0.43 / $0.87 per 1M
    96.2%
  6. 6
    Moonshot AI logo
    kimi-k2.6
    Moonshot AI·$0.95 / $4.00 per 1M
    95.9%
  7. 7
    Moonshot AI logo
    kimi-k2.5
    Moonshot AI·$0.60 / $3.00 per 1M
    95.9%
  8. 8
    Z AI logo
    GLM-4.7
    Z AI·$0.60 / $2.20 per 1M
    95.9%
  9. 9
    Google LLC (Gemini API) logo
    gemini-3.1-pro-preview
    Google LLC (Gemini API)·$2.00 / $12.00 per 1M
    95.6%
  10. 10
    Novita AI logo
    qwen/qwen3.5-397b-a17b
    Novita AI·$0.60 / $3.60 per 1M
    95.6%
  11. 11
    MiniMax-M2.5
    MiniMax·$0.30 / $1.20 per 1M
    95.3%
  12. 12
    Google LLC (Vertex AI) logo
    gemini-3.5-flash
    Google LLC (Vertex AI)·$1.50 / $9.00 per 1M
    95.3%
  13. 13
    DeepSeek logo
    deepseek-v4-flash
    DeepSeek·$0.14 / $0.28 per 1M
    95.0%
  14. 14
    Novita AI logo
    xiaomimimo/mimo-v2-pro
    Novita AI·$2.00 / $6.00 per 1M
    95.0%
  15. 15
    Novita AI logo
    xiaomimimo/mimo-v2-flash
    Novita AI·$0.10 / $0.30 per 1M
    95.0%
  16. 16
    Alibaba Cloud logo
    qwen3.7-max
    Alibaba Cloud·$2.50 / $7.50 per 1M
    94.7%
  17. 17
    Anthropic PBC logo
    claude-opus-4-8
    Anthropic PBC·$5.00 / $25.00 per 1M
    94.4%
  18. 18
    Mistral AI SAS logo
    mistral-medium-3-5
    Mistral AI SAS·$1.50 / $7.50 per 1M
    94.2%
  19. 19
    DeepInfra Inc. logo
    XiaomiMiMo/MiMo-V2.5-Pro
    DeepInfra Inc.·$1.00 / $3.00 per 1M
    94.2%
  20. 20
    OpenAI Inc. logo
    gpt-5.5
    OpenAI Inc.·$5.00 / $30.00 per 1M
    93.9%
  21. 21
    DeepInfra Inc. logo
    Qwen/Qwen3.5-27B
    DeepInfra Inc.·$0.26 / $2.60 per 1M
    93.9%
  22. 22
    Google LLC (Vertex AI) logo
    kimi-k2
    Google LLC (Vertex AI)·$0.60 / $2.50 per 1M
    93.0%
  23. 23
    Novita AI logo
    inclusionai/ring-2.6-1t
    Novita AI·$0.30 / $2.50 per 1M
    92.4%
  24. 24
    Anthropic PBC logo
    claude-opus-4-6
    Anthropic PBC·$5.00 / $25.00 per 1M
    92.1%
  25. 25
    OpenAI Responses logo
    gpt-5.2-codex
    OpenAI Responses·$1.75 / $14.00 per 1M
    92.1%
  26. 26
    Google LLC (Vertex AI) logo
    deepseek-v3.2
    Google LLC (Vertex AI)·$0.56 / $1.68 per 1M
    90.6%
  27. 27
    grok-3-mini
    xAI Corp.·$0.30 / $0.50 per 1M
    90.4%
  28. 28
    Novita AI logo
    inclusionai/ling-2.6-1t
    Novita AI·$0.30 / $2.50 per 1M
    89.8%
  29. 29
    Anthropic PBC logo
    claude-opus-4-5
    Anthropic PBC·$5.00 / $25.00 per 1M
    89.5%
  30. 30
    DeepInfra Inc. logo
    Qwen/Qwen3.5-35B-A3B
    DeepInfra Inc.·$0.14 / $1.00 per 1M
    89.2%

How we rank

Scores for τ²-Bench come from Artificial Analysis, an independent AI benchmarking service. When a model is available through multiple providers (e.g. Anthropic direct, AWS Bedrock, Google Vertex), we show one canonical entry per model family so the ranking isn't polluted by duplicates. Benchmarks measure specific skills — always validate on your own workload before committing.

One API for every model on this list

Requesty is OpenAI-compatible and routes to 400+ models. Switch between any of the models above by changing one parameter in your code.

Get started free