Leaderboard
Multi-dimensional rankings based on model speed tests and provider health checks. Compare providers, endpoints, and reliability at a glance.
Average time to first token. Lower is better for responsiveness.
| Rank | Provider | Model | First Token Latency | Avg tokens per second | Total Tests |
|---|---|---|---|---|---|
| 1 | - | qwen/qwen3-coder | 0.05 s (best: -, worst: 1.88) | 2.48 t/s | 35 |
| 2 | - | moonshotai/kimi-k2 | 0.16 s (best: -, worst: 4.47) | 2.03 t/s | 75 |
| 3 | Neo API (api.chiban.de) | glm-4-flashx | 0.24 s (best: -, worst: 0.64) | 43.17 t/s | 5 |
| 4 | Mistral AI (api.mistral.ai) | ministral-3b-2410 | 0.41 s (best: 0.35, worst: 0.53) | 210.88 t/s | 5 |
| 5 | Mistral AI (api.mistral.ai) | ministral-3b-2410 | 0.41 s (best: 0.35, worst: 0.53) | 210.88 t/s | 5 |
| 6 | AI Tools (platform.aitools.cfd) | zhipu/glm-4v-flash | 0.43 s (best: 0.26, worst: 1.31) | 53.51 t/s | 15 |
| 7 | integrate.api.nvidia.com | deepseek-ai/deepseek-v3.1 | 0.45 s (best: 0.33, worst: 0.89) | 29.04 t/s | 10 |
| 8 | integrate.api.nvidia.com | qwen/qwen3-next-80b-a3b-instruct | 0.50 s (best: 0.41, worst: 0.63) | 158.38 t/s | 5 |
| 9 | SiliconFlow (api.siliconflow.cn) | THUDM/glm-4-9b-chat | 0.56 s (best: 0.49, worst: 0.71) | 71.30 t/s | 5 |
| 10 | SiliconFlow (api.siliconflow.cn) | internlm/internlm2_5-7b-chat | 0.57 s (best: 0.54, worst: 0.61) | 68.19 t/s | 10 |
| 11 | AI Tools (platform.aitools.cfd) | google/gemini-2.0-flash-exp | 0.59 s (best: -, worst: 4.75) | 51.61 t/s | 95 |
| 12 | Cerebras Sandbox (v.ag-api.eu.cc) | qwen-3-235b-a22b-instruct-2507 | 0.60 s (best: 0.29, worst: 1.39) | 703.17 t/s | 15 |
| 13 | SiliconFlow (api.siliconflow.cn) | Qwen/Qwen3-Next-80B-A3B-Instruct | 0.62 s (best: 0.54, worst: 0.68) | 68.32 t/s | 5 |
| 14 | SiliconFlow (api.siliconflow.cn) | Qwen/Qwen3-VL-8B-Instruct | 0.64 s (best: 0.53, worst: 0.68) | 142.73 t/s | 5 |
| 15 | us.jianxiaoru.sbs:5005 | Qwen3-14B | 0.66 s (best: 0.59, worst: 0.83) | 22.01 t/s | 5 |
| 16 | Cerebras Sandbox (v.ag-api.eu.cc) | qwen-3-coder-480b | 0.66 s (best: 0.33, worst: 1.54) | 648.92 t/s | 5 |
| 17 | AI Tools (platform.aitools.cfd) | deepseek/deepseek-r1-0528 | 0.67 s (best: -, worst: 10.00) | 4.30 t/s | 65 |
| 18 | DashScope (dashscope.aliyuncs.com) | qwen-plus | 0.70 s (best: 0.54, worst: 1.02) | 47.17 t/s | 5 |
| 19 | RinkoAI (rinkoai.com) | Qwen/Qwen3-235B | 0.71 s (best: 0.58, worst: 0.96) | 583.68 t/s | 5 |
| 20 | ORBIAI (api.orbiai.cloud) | gemini-2.0-flash | 0.74 s (best: 0.69, worst: 0.80) | 153.26 t/s | 5 |
| 21 | OpenRouter (openrouter.ai) | qwen/qwen3-vl-32b-instruct | 0.76 s (best: 0.48, worst: 1.53) | 44.94 t/s | 5 |
| 22 | AI Tools (platform.aitools.cfd) | qwen/qwen2.5-7b | 0.80 s (best: 0.62, worst: 1.53) | 90.79 t/s | 25 |
| 23 | AI Tools (platform.aitools.cfd) | zhipu/glm-4-flash | 0.82 s (best: 0.36, worst: 4.89) | 35.81 t/s | 530 |
| 24 | New API (fanyi.963312.xyz) | llama-4-scout-17b-16e-instruct | 0.84 s (best: 0.80, worst: 0.89) | 334.77 t/s | 5 |
| 25 | SiliconFlow (api.siliconflow.cn) | Qwen/Qwen2.5-VL-72B-Instruct | 0.86 s (best: 0.74, worst: 1.11) | 26.42 t/s | 5 |
| 26 | 共绩算力 (d10071955-ollama-webui-qwen3v2-5477-ykrvqzya-11434.550c.cloud) | qwen3:32b | 0.88 s (best: 0.53, worst: 2.24) | 37.26 t/s | 5 |
| 27 | d10071955-ollama-webui-qwen3v2-5477-ykrvqzya-11434.550c.cloud | qwen3:32b | 0.88 s (best: 0.53, worst: 2.24) | 37.26 t/s | 5 |
| 28 | d10071955-ollama-webui-qwen3v2-5477-ykrvqzya-11434.550c.cloud | qwen3:32b | 0.88 s (best: 0.53, worst: 2.24) | 37.26 t/s | 5 |
| 29 | AI Tools (platform.aitools.cfd) | zhipu/glm-4-9b | 0.88 s (best: 0.61, worst: 1.40) | 58.83 t/s | 5 |
| 30 | RinkoAI (rinkoai.com) | Qwen/Qwen3-Coder-480B | 0.89 s (best: 0.69, worst: 1.43) | 552.48 t/s | 5 |
| 31 | RinkoAI (rinkoai.com) | moonshotai/kimi-k2-instruct-0905 | 0.89 s (best: 0.52, worst: 2.21) | 216.36 t/s | 30 |
| 32 | SiliconFlow (api.siliconflow.cn) | inclusionAI/Ling-1T | 0.93 s (best: 0.79, worst: 1.01) | 6.44 t/s | 5 |
| 33 | 一叶知秋API (88996.cloud) | gemini-flash-lite-latest | 0.95 s (best: 0.85, worst: 1.10) | 326.55 t/s | 5 |
| 34 | ModelScope (api-inference.modelscope.cn) | Qwen/Qwen3-Next-80B-A3B-Instruct | 0.95 s (best: 0.69, worst: 1.91) | 139.29 t/s | 5 |
| 35 | RinkoAI (rinkoai.com) | moonshotai/kimi-k2-instruct-0905 | 0.95 s (best: 0.77, worst: 1.16) | 217.81 t/s | 5 |
| 36 | SiliconFlow (api.siliconflow.cn) | Pro/deepseek-ai/DeepSeek-V3.2-Exp | 0.98 s (best: 0.77, worst: 1.11) | 24.18 t/s | 5 |
| 37 | Tencent Cloud (api.hunyuan.cloud.tencent.com) | hunyuan-lite | 0.99 s (best: 0.86, worst: 1.11) | 133.29 t/s | 5 |
| 38 | DeepSeek (api.deepseek.com) | deepseek-chat | 0.99 s (best: 0.70, worst: 1.49) | 25.69 t/s | 15 |
| 39 | DashScope (dashscope.aliyuncs.com) | qwen3-vl-flash | 1.00 s (best: 0.61, worst: 2.50) | 147.58 t/s | 5 |
| 40 | DashScope (dashscope.aliyuncs.com) | qwen-turbo-latest | 1.01 s (best: 0.57, worst: 2.49) | 84.44 t/s | 5 |
| 41 | ChatGTP (www.chatgtp.cn) | deepseek-v3-1-250821 | 1.02 s (best: 0.88, worst: 1.23) | 66.69 t/s | 5 |
| 42 | Mistral AI (api.mistral.ai) | mistral-large-latest | 1.08 s (best: 0.42, worst: 1.54) | 65.93 t/s | 5 |
| 43 | Mistral AI (api.mistral.ai) | mistral-large-latest | 1.08 s (best: 0.42, worst: 1.54) | 65.93 t/s | 5 |
| 44 | SiliconFlow (api.siliconflow.cn) | deepseek-ai/DeepSeek-V3.2-Exp | 1.11 s (best: 0.87, worst: 1.75) | 23.69 t/s | 5 |
| 45 | SiliconFlow (api.siliconflow.cn) | deepseek-ai/DeepSeek-V3 | 1.14 s (best: 0.94, worst: 1.31) | 23.60 t/s | 5 |
| 46 | DashScope (dashscope.aliyuncs.com) | qwen3-max-2025-09-23 | 1.14 s (best: 0.59, worst: 2.53) | 24.49 t/s | 5 |
| 47 | DashScope (dashscope.aliyuncs.com) | qwen3-235b-a22b-instruct-2507 | 1.15 s (best: 0.57, worst: 2.92) | 24.80 t/s | 5 |
| 48 | RinkoAI (rinkoai.com) | Qwen/Qwen3-14B | 1.19 s (best: 0.97, worst: 1.68) | 38.30 t/s | 5 |
| 49 | AI Tools (platform.aitools.cfd) | openai/gpt-oss-20b | 1.19 s (best: -, worst: 6.10) | 54.46 t/s | 50 |
| 50 | DeepSeek (api.deepseek.com) | deepseek-chat | 1.19 s (best: 0.89, worst: 1.61) | 25.22 t/s | 10 |
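
The page does not say how the columns are measured, so the following is only a rough sketch of how first-token latency (average, best, worst), average tokens per second, and the latency-based ranking could be derived from streamed responses. The `SpeedTest` record, the function names, and the throughput definition (tokens received divided by total elapsed time) are all assumptions made for illustration, not the leaderboard's actual method.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SpeedTest:
    """One streamed completion (hypothetical record layout).

    token_times holds the arrival time of each generated token,
    in seconds measured from the moment the request was sent.
    """
    token_times: list[float]


def first_token_latency(test: SpeedTest) -> float:
    """Seconds from sending the request to receiving the first token."""
    return test.token_times[0]


def tokens_per_second(test: SpeedTest) -> float:
    """Tokens received divided by total elapsed time (assumed definition)."""
    return len(test.token_times) / test.token_times[-1]


def summarize(tests: list[SpeedTest]) -> dict[str, float]:
    """Aggregate repeated tests of one endpoint into one table row."""
    latencies = [first_token_latency(t) for t in tests]
    throughputs = [tokens_per_second(t) for t in tests]
    return {
        "avg_latency_s": mean(latencies),
        "best_latency_s": min(latencies),
        "worst_latency_s": max(latencies),
        "avg_tokens_per_s": mean(throughputs),
        "total_tests": len(tests),
    }


def rank(summaries: dict[str, dict[str, float]]) -> list[str]:
    """Endpoint names ordered by average first-token latency, lowest first."""
    return sorted(summaries, key=lambda name: summaries[name]["avg_latency_s"])


# Made-up timings for two hypothetical endpoints.
fast = summarize([SpeedTest([0.25, 0.5, 0.75, 1.0]), SpeedTest([0.5, 1.0, 1.5, 2.0])])
slow = summarize([SpeedTest([1.0, 2.0, 3.0, 4.0])])
print(rank({"endpoint-b": slow, "endpoint-a": fast}))
```

Sorting on average latency alone mirrors the table's ordering, where an endpoint with very high throughput can still rank below a slower one that responds sooner.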