Leaderboard
Multi-dimensional rankings based on model speed tests and provider health checks. Compare providers, endpoints, and reliability at a glance.
Throughput: average tokens generated per second (t/s). Higher is better for fast responses.
| Rank | Provider (endpoint) | Model | Avg throughput (t/s) | Best (t/s) | Worst (t/s) | Avg first token latency | Total tests |
|---|---|---|---|---|---|---|---|
| 1 | | nvidia/nemotron-3-super-120b-a12b | 141869.96 | 192828.02 | 69462.38 | 29.75s | 5 |
| 2 | | claude-sonnet-4-6 | 27433.35 | 35880.34 | 16028.38 | 3.78s | 5 |
| 3 | 天宫造物 (c2api.tgzw.shop) | gemini-3-flash | 18164.29 | 20964.26 | 15845.12 | 2.61s | 5 |
| 4 | 天宫造物 (c2api.tgzw.shop) | anthropic/claude-sonnet-4.6 | 17475.88 | 19734.29 | 16106.50 | 2.59s | 5 |
| 5 | DashScope (dashscope.aliyuncs.com) | MiniMax/MiniMax-M2.5 | 16758.57 | 44887.94 | 37.55 | 20.81s | 5 |
| 6 | 天宫造物 (c2api.tgzw.shop) | google/gemini-3-flash | 16435.04 | 18022.56 | 14408.37 | 2.62s | 5 |
| 7 | 人人 API (llm.whitedream.top) | [超低价]claude-opus-4.6 | 3038.71 | 6317.46 | 72.42 | 25.31s | 10 |
| 8 | 人人 API (llm.whitedream.top) | [无缓]claude-opus-4-6 | 2393.57 | 4816.37 | 120.42 | 15.69s | 5 |
| 9 | Supabase (ttknrllwjndwdtycqqfv.supabase.co) | llama3.1-8b | 2286.66 | 3110.22 | 1557.57 | 0.58s | 10 |
| 10 | OminiGen (odysseia.jnai2d9kgnbs6xzx5c.com) | deepseek-chat | 1483.58 | 2652.21 | 934.40 | 8.75s | 10 |
| 11 | Kriora (api.kriora.com) | kimi-k2.5 | 1231.67 | 1946.34 | 793.17 | 1.29s | 5 |
| 12 | Supabase (ttknrllwjndwdtycqqfv.supabase.co) | qwen-3-235b-a22b-instruct-2507 | 910.42 | 1437.04 | 558.03 | 0.63s | 5 |
| 13 | 小天公益站 (new-api.xt-url.com) | Translation | 378.97 | 700.15 | 152.75 | 2.56s | 5 |
| 14 | SiliconFlow (api.siliconflow.cn) | PaddlePaddle/PaddleOCR-VL-1.5 | 313.91 | 492.21 | 76.83 | 3.03s | 5 |
| 15 | YUNWU API (yunwu.ai) | gpt-5-nano | 263.38 | 756.35 | 110.70 | 14.07s | 5 |
| 16 | x.ai (api.x.ai) | grok-4.20-0309-reasoning | 225.06 | 251.82 | 189.16 | 3.13s | 5 |
| 17 | integrate.api.nvidia.com | openai/gpt-oss-120b | 212.52 | 268.03 | 104.91 | 0.46s | 5 |
| 18 | 天宫造物 (cpa.tgzw.shop) | gpt-5.1-codex-mini | 205.07 | 461.52 | 84.84 | 13.94s | 5 |
| 19 | Tencent (api.lkeap.cloud.tencent.com) | minimax-m-2-5 | 165.10 | 1337.43 | 30.80 | 10.60s | 15 |
| 20 | New API (ai.fengsili.online) | askcodi/gemini-3-flash | 157.78 | 240.29 | 83.05 | 6.18s | 15 |
| 21 | DashScope (dashscope.aliyuncs.com) | qwen3.5-flash | 150.45 | 204.23 | 115.87 | 7.34s | 10 |
| 22 | 01b332xz5x-11434.cnb.run | gemma4:e2b | 141.48 | 185.39 | 24.09 | 6.97s | 20 |
| 23 | YUNWU API (yunwu.ai) | gpt-5-nano-2025-08-07 | 139.84 | 178.87 | 86.33 | 10.26s | 5 |
| 24 | Hugging Face (router.huggingface.co) | Qwen/Qwen3.5-9B | 136.59 | 156.10 | 123.56 | 7.99s | 5 |
| 25 | Hugging Face (router.huggingface.co) | Qwen/Qwen3.5-9B | 136.59 | 156.10 | 123.56 | 7.99s | 5 |
| 26 | 哈基米API站 (api.gemai.cc) | claude-opus-4-6 | 112.88 | 140.74 | 83.67 | 2.19s | 5 |
| 27 | aiapi.yanami.vip | British-Shorthair | 111.75 | 167.50 | 96.22 | 4.52s | 5 |
| 28 | api.xiaomimimo.com | mimo-v2-flash | 110.54 | 138.72 | 72.96 | 0.83s | 5 |
| 29 | SiliconFlow (api.siliconflow.cn) | inclusionAI/Ring-flash-2.0 | 108.49 | 116.22 | 95.76 | 6.43s | 5 |
| 30 | integrate.api.nvidia.com | qwen/qwen3-next-80b-a3b-thinking | 106.61 | 124.07 | 89.48 | 12.14s | 5 |
| 31 | SiliconFlow (api.siliconflow.cn) | Qwen/Qwen2.5-7B-Instruct | 106.01 | 130.23 | 86.08 | 0.61s | 5 |
| 32 | AI Tools (platform.aitools.cfd) | zhipu/glm-4.6v-flash | 104.39 | 201.11 | 51.39 | 10.30s | 15 |
| 33 | 酒馆无限制免费API (nextnewapi.yjie.fun) | 酒馆-Flash-New | 101.00 | 142.72 | 69.31 | 2.38s | 5 |
| 34 | DashScope (dashscope.aliyuncs.com) | qwen-vl-plus-2025-05-07 | 99.19 | 101.15 | 97.68 | 0.88s | 5 |
| 35 | free_chatgpt_api (free.v36.cm) | gpt-4o-mini-2024-07-18 | 93.87 | 106.51 | 66.28 | 13.37s | 5 |
| 36 | AI Tools (platform.aitools.cfd) | zhipu/glm-4.1v-thinking-flash | 93.82 | 125.35 | 15.13 | 7.04s | 15 |
| 37 | 晴辰云 (gpt.qt.cool) | gpt-5.2-codex | 90.69 | 144.35 | 40.06 | 1.76s | 5 |
| 38 | SiliconFlow (api.siliconflow.cn) | THUDM/GLM-4-9B-0414 | 89.85 | 99.65 | 82.88 | 0.28s | 5 |
| 39 | Tencent (api.lkeap.cloud.tencent.com) | minimax-m2.5 | 89.77 | 833.76 | 25.24 | 11.58s | 15 |
| 40 | New API (api.chlink.de5.net) | gpt-5.4 | 89.49 | 217.80 | 51.67 | 5.48s | 5 |
| 41 | llm.gankinterview.com | model-router | 88.42 | 137.73 | 61.52 | 11.51s | 5 |
| 42 | AI中转站 (ai.192700.xyz) | gpt-5.4 | 88.14 | 160.51 | 57.97 | 4.56s | 5 |
| 43 | AI Tools (platform.aitools.cfd) | qwen/qwen2.5-7b | 87.73 | 155.40 | 17.55 | 0.94s | 40 |
| 44 | 哈基米API站 (api.gemai.cc) | claude-sonnet-4-6 | 85.69 | 147.23 | 37.05 | 1.83s | 10 |
| 45 | modelservice.jdcloud.com | MiniMax-M2.5 | 85.29 | 99.26 | 63.10 | 17.74s | 5 |
| 46 | SiliconFlow (api.siliconflow.cn) | Pro/MiniMaxAI/MiniMax-M2.5 | 85.17 | 111.29 | 62.78 | 10.57s | 10 |
| 47 | 算了么 API (api.suanli.cn) | Qwen/QwQ-32B | 84.50 | 107.79 | 56.52 | 9.00s | 5 |
| 48 | SiliconFlow (api.siliconflow.cn) | stepfun-ai/Step-3.5-Flash | 84.22 | 91.73 | 75.16 | 3.37s | 5 |
| 49 | realpics.cn:2234 (realpics.cn:5001) | gemma-4-26B-A4B-it-UD-IQ4_NL.gguf | 83.64 | 87.19 | 80.18 | 10.52s | 5 |
| 50 | Realpics (realpics.cn:5001) | gemma-4-26B-A4B-it-UD-IQ4_NL.gguf | 83.64 | 87.19 | 80.18 | 10.52s | 5 |
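
The page does not say how each row's figures are computed. As a rough sketch only, the average, best, and worst throughput and the average first token latency could be derived from individual test runs as shown here. The `SpeedTest` record, its field names, and the choice to measure throughput over the generation phase only (excluding the wait for the first token) are all assumptions made for illustration, not the leaderboard's actual method.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SpeedTest:
    """One hypothetical speed-test run (field names are assumptions)."""
    completion_tokens: int        # tokens generated in the response
    first_token_latency_s: float  # seconds from request to first token
    total_time_s: float           # seconds from request to last token


def throughput(test: SpeedTest) -> float:
    """Tokens per second over the generation phase.

    Assumption: the wait for the first token is excluded from the
    denominator. A leaderboard could just as well divide by total time.
    """
    generation_time = test.total_time_s - test.first_token_latency_s
    if generation_time <= 0:
        raise ValueError("total time must exceed first token latency")
    return test.completion_tokens / generation_time


def summarize(tests: list[SpeedTest]) -> dict[str, float]:
    """Aggregate runs into the figures one table row would show."""
    rates = [throughput(t) for t in tests]
    return {
        "avg_tps": round(mean(rates), 2),
        "best_tps": round(max(rates), 2),
        "worst_tps": round(min(rates), 2),
        "avg_first_token_latency_s": round(
            mean(t.first_token_latency_s for t in tests), 2
        ),
        "total_tests": len(tests),
    }


if __name__ == "__main__":
    # Made-up sample runs, not data from the leaderboard.
    runs = [
        SpeedTest(100, 0.5, 2.5),
        SpeedTest(200, 1.0, 3.0),
        SpeedTest(50, 0.5, 1.5),
    ]
    print(summarize(runs))
```

Under this definition, a wide gap between best and worst throughput for the same endpoint means the individual runs varied a lot, so the average alone says little about what a single request will see.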