Leaderboard
Multi-dimensional rankings based on model speed tests and provider health checks. Compare providers, endpoints, and reliability at a glance.
Average time to first token. Lower is better for responsiveness.
| Rank | Provider | Endpoint | Model | First Token Latency (s) | Best (s) | Worst (s) | Avg tokens per second (t/s) | Total Tests |
|---|---|---|---|---|---|---|---|---|
| 1 | - | - | Qwen/Qwen2.5-Coder-7B-Instruct | 0.25 | 0.21 | 0.46 | 95.88 | 15 |
| 2 | - | - | nvidia/llama-3.3-nemotron-super-49b-v1.5 | 0.28 | 0.16 | 0.56 | 79.68 | 5 |
| 3 | New API | newapi.df-h.com | immersive_translate | 0.31 | 0.23 | 0.80 | 127.41 | 10 |
| 4 | 智谱AI开放平台 | open.bigmodel.cn | glm-4.5-air | 0.31 | 0.26 | 0.44 | 37.80 | 5 |
| 5 | 智谱AI开放平台 | open.bigmodel.cn | glm-z1-airx | 0.34 | 0.28 | 0.45 | 168.24 | 5 |
| 6 | 心流 | apis.iflow.cn | tstars2.0 | 0.37 | 0.34 | 0.46 | 94.52 | 10 |
| 7 | 1669349096132929 | 1669349096132929.cn-shanghai.pai-eas.aliyuncs.com | Chery3-32B-0.2 | 0.42 | 0.41 | 0.43 | 48.22 | 5 |
| 8 | Chibanban | api.chibanban.de | zai-org/GLM-4.5-Air | 0.46 | 0.38 | 0.55 | 86.82 | 5 |
| 9 | AI Tools | platform.aitools.cfd | zhipu/glm-4v-flash | 0.47 | 0.36 | 0.61 | 52.35 | 5 |
| 10 | AI Tools | platform.aitools.cfd | qwen/qwen2.5-7b | 0.55 | 0.37 | 1.20 | 100.66 | 10 |
| 11 | TokenPony | api.tokenpony.cn | deepseek-v3.2-exp | 0.58 | 0.54 | 0.70 | 39.89 | 5 |
| 12 | AI Tools | platform.aitools.cfd | qwen/qwen2.5-72b | 0.64 | - | 2.20 | 32.42 | 15 |
| 13 | AI Tools | platform.aitools.cfd | qwen/qwen3-coder | 0.69 | - | 12.71 | 8.60 | 30 |
| 14 | GPT Load | allaiload.dpdns.org | models/gemini-2.5-flash-lite | 0.71 | 0.59 | 1.02 | 259.93 | 5 |
| 15 | SkyAI | api.071572.xyz | Qwen/Qwen2.5-Coder-7B-Instruct | 0.72 | 0.62 | 1.03 | 26.91 | 5 |
| 16 | AI Tools | platform.aitools.cfd | qwen/qwen2.5-vl-32b | 0.73 | - | 2.86 | 16.53 | 15 |
| 17 | 飞桨AI Studio | aistudio.baidu.com | ernie-4.5-turbo-128k | 0.75 | 0.36 | 2.00 | 26.45 | 15 |
| 18 | Chibanban | api.chibanban.de | tencent/Hunyuan-MT-7B | 0.77 | 0.61 | 1.24 | 60.47 | 5 |
| 19 | DeepSeek | api.deepseek.com | deepseek-chat | 0.78 | 0.66 | 1.02 | 25.48 | 5 |
| 20 | RinkoAI | rinkoai.com | moonshotai/kimi-k2-instruct-0905 | 0.78 | 0.64 | 0.89 | 227.97 | 5 |
| 21 | GPT Load | allaiload.dpdns.org | qwen/qwen3-next-80b-a3b-instruct | 0.79 | 0.70 | 0.94 | 154.28 | 10 |
| 22 | AI Tools | platform.aitools.cfd | google/gemini-2.0-flash-exp | 0.82 | - | 5.27 | 41.22 | 25 |
| 23 | RinkoAI | rinkoai.com | gpt-oss-120b | 0.84 | 0.70 | 1.01 | 590.03 | 5 |
| 24 | New API | 20230621.xyz | gemini-2.5-flash | 0.86 | 0.81 | 0.94 | 165.42 | 5 |
| 25 | Newapi502 | newapi502.087654.xyz | gemma-3-27b-it | 0.87 | 0.68 | 1.25 | 0.00 | 10 |
| 26 | AI Tools | platform.aitools.cfd | zhipu/glm-4-flash | 0.87 | 0.42 | 5.57 | 34.94 | 610 |
| 27 | 腾讯云 | api.hunyuan.cloud.tencent.com | hunyuan-lite | 0.88 | 0.77 | 0.99 | 123.45 | 5 |
| 28 | Gemini Balance | hmcldastbscm.ap-northeast-1.clawcloudrun.com | gemini-flash-lite-latest | 0.88 | 0.72 | 1.11 | 381.81 | 20 |
| 29 | GPT Load | allaiload.dpdns.org | DeepSeek-V3-0324 | 0.90 | 0.41 | 1.64 | 218.28 | 5 |
| 30 | 180.76.61.29:3000 | 180.76.61.29:3000 | Qwen/Qwen3-Next-80B-A3B-Instruct | 0.95 | 0.64 | 1.29 | 110.28 | 5 |
| 31 | ModelScope | api-inference.modelscope.cn | Qwen/Qwen3-Next-80B-A3B-Instruct | 0.97 | 0.64 | 2.13 | 178.62 | 5 |
| 32 | api.modelarts-maas.com | api.modelarts-maas.com | DeepSeek-V3 | 0.98 | 0.77 | 1.69 | 30.38 | 5 |
| 33 | YUNWU API | yunwu.zeabur.app | gemini-2.5-flash-lite-nothinking | 0.99 | 0.82 | 1.49 | 295.30 | 5 |
| 34 | YUNWU API | yunwu.ai | gemini-2.5-flash-lite-nothinking | 0.99 | 0.77 | 1.29 | 341.00 | 5 |
| 35 | DeepSeek | api.deepseek.com | deepseek-chat | 0.99 | 0.66 | 1.28 | 23.98 | 30 |
| 36 | DashScope | dashscope.aliyuncs.com | qwen-mt-turbo | 1.02 | 0.55 | 2.74 | 5403.35 | 5 |
| 37 | Newapi502 | newapi502.087654.xyz | gemini-2.5-pro-nothinking | 1.03 | 0.76 | 1.48 | 0.00 | 5 |
| 38 | Newapi502 | newapi502.087654.xyz | gemini-2.5-flash | 1.06 | 0.70 | 1.42 | 0.00 | 5 |
| 39 | Console | console.altr.cc | arc-5-2512 | 1.10 | 0.77 | 1.74 | 59.18 | 5 |
| 40 | New API | tbai.xin | gpt-4.1-mini | 1.18 | 0.92 | 1.63 | 69.05 | 5 |
| 41 | OpenRouter | openrouter.ai | openrouter/sherlock-dash-alpha | 1.21 | 1.11 | 1.54 | 103.39 | 10 |
| 42 | SkyAI | skyai.089apis.xyz | inclusionAI/Ling-1T | 1.27 | 0.97 | 1.91 | 18.55 | 5 |
| 43 | ark | ark.cn-beijing.volces.com | deepseek-v3-1-terminus | 1.30 | 0.68 | 1.75 | 34.82 | 5 |
| 44 | GPT Load | allaiload.dpdns.org | gpt-oss:120b | 1.35 | 1.07 | 1.74 | 141.34 | 5 |
| 45 | ChatGTP | www.chatgtp.cn | gemini-2.0-flash | 1.37 | 0.59 | 4.04 | 170.98 | 5 |
| 46 | GPT Load | allaiload.dpdns.org | qwen-3-235b-a22b-instruct-2507 | 1.38 | 1.22 | 1.74 | 377.69 | 5 |
| 47 | GPT Load | allaiload.dpdns.org | WiNGPT-Babel | 1.40 | 0.66 | 3.67 | 137.86 | 5 |
| 48 | veloera.exynos.xyz:8443 | veloera.exynos.xyz:8443 | arc-5-2512 | 1.44 | 1.29 | 1.61 | 83.06 | 5 |
| 49 | Veloera | veloera.exynos.xyz:8443 | arc-5-2512 | 1.44 | 1.29 | 1.61 | 83.06 | 5 |
| 50 | New API | fanyi.963312.xyz | qwen-3-235b-a22b-instruct-2507 | 1.47 | 0.86 | 2.47 | 449.95 | 5 |
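
For readers who want to reproduce figures of this kind from their own measurements, the following is a minimal sketch of how average, best, and worst time to first token and an average tokens-per-second value could be derived from raw test runs, then sorted the way the table is ordered. The `SpeedTest` record, the `summarize` and `rank` helpers, and the choice to measure throughput over the generation phase only (after the first token arrives) are all assumptions made for illustration; the leaderboard's actual measurement method is not described on this page.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SpeedTest:
    """One hypothetical speed-test run against a provider/model pair."""
    first_token_s: float  # seconds from request start to first streamed token
    total_s: float        # seconds from request start to last token
    output_tokens: int    # number of tokens generated in the response


def summarize(runs):
    """Aggregate runs into latency and throughput summary figures."""
    ttfts = [r.first_token_s for r in runs]
    # Assumption: throughput is computed over the generation phase only,
    # i.e. the time elapsed after the first token has arrived.
    rates = [
        r.output_tokens / (r.total_s - r.first_token_s)
        for r in runs
        if r.total_s > r.first_token_s
    ]
    return {
        "avg_ttft": round(mean(ttfts), 2),
        "best_ttft": min(ttfts),
        "worst_ttft": max(ttfts),
        "avg_tps": round(mean(rates), 2) if rates else 0.0,
        "total_tests": len(runs),
    }


def rank(entries):
    """Sort (label, summary) pairs by average time to first token, ascending."""
    return sorted(entries, key=lambda entry: entry[1]["avg_ttft"])


# Illustrative, made-up measurements (not taken from the table above).
example = summarize([
    SpeedTest(first_token_s=0.25, total_s=1.25, output_tokens=100),
    SpeedTest(first_token_s=0.75, total_s=2.75, output_tokens=50),
])
```

Sorting by the average alone, as `rank` does, hides variance between runs, which is why keeping the best and worst values alongside the average is useful when the spread is wide.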