Leaderboard
Multi-dimensional rankings based on model speed tests and provider health checks. Compare providers, endpoints, and reliability at a glance.
Ranked by average time to first token (first-token latency). Lower is better for responsiveness.
| Rank | Provider | Model | Avg first-token latency (s) | Best (s) | Worst (s) | Avg tokens/s | Total tests |
|---|---|---|---|---|---|---|---|
| 1 | | glm-4-flash-250414 | 0.29 | 0.21 | 0.56 | 39.29 | 5 |
| 2 | | openrouter/quasar-alpha | 0.35 | - | 1.50 | 0.00 | 30 |
| 3 | Mistral AI (mistral.ai) | codestral-latest | 0.38 | 0.34 | 0.46 | 151.00 | 5 |
| 4 | 112.6.115.158:52345 | qwen | 0.42 | 0.36 | 0.63 | 29.76 | 5 |
| 5 | Zhipu AI Open Platform (open.bigmodel.cn) | glm-z1-flash | 0.43 | 0.35 | 0.50 | 116.46 | 5 |
| 6 | Zhipu AI Open Platform (open.bigmodel.cn) | glm-4-flash-250414 | 0.46 | 0.28 | 0.66 | 48.96 | 5 |
| 7 | 211.90.240.240:30001 | Qwen2.5-32B-Instruct | 0.49 | 0.37 | 0.95 | 40.80 | 35 |
| 8 | integrate.api.nvidia.com | deepseek-ai/deepseek-r1-distill-qwen-14b | 0.49 | 0.36 | 0.64 | 40.96 | 5 |
| 9 | Joyue (ai.joyue.joyuerpa.com:3001) | qwen2.5-instruct | 0.56 | 0.46 | 0.86 | 48.40 | 5 |
| 10 | ai.joyue.joyuerpa.com:3001 | qwen2.5-instruct | 0.56 | 0.46 | 0.86 | 48.40 | 5 |
| 11 | ai.joyue.joyuerpa.com:3001 | Qwen2.5-32B-Instruct | 0.57 | 0.49 | 0.87 | 37.02 | 5 |
| 12 | Joyue (ai.joyue.joyuerpa.com:3001) | Qwen2.5-32B-Instruct | 0.57 | 0.49 | 0.87 | 37.02 | 5 |
| 13 | ai.xinference.joyuerpa.com:3001 | qwen2.5-instruct | 0.59 | 0.45 | 1.31 | 48.56 | 10 |
| 14 | Xinference (ai.xinference.joyuerpa.com:3001) | qwen2.5-instruct | 0.59 | 0.45 | 1.31 | 48.56 | 10 |
| 15 | IdeaLab (idealab.alibaba-inc.com) | claude37_sonnet | 0.59 | 0.38 | 1.40 | 2766.82 | 5 |
| 16 | New API (api.lianwusuoai.top) | Pro/Qwen/Qwen2-1.5B-Instruct | 0.60 | 0.53 | 0.87 | 204.04 | 5 |
| 17 | ai.xinference.joyuerpa.com:3001 | qwen2-instruct | 0.61 | 0.47 | 0.98 | 47.93 | 10 |
| 18 | Xinference (ai.xinference.joyuerpa.com:3001) | qwen2-instruct | 0.61 | 0.47 | 0.98 | 47.93 | 10 |
| 19 | 182.92.169.157:13090 | qwen3:30b-a3b-q8_0 | 0.61 | 0.58 | 0.65 | 43.61 | 5 |
| 20 | New API (api.lianwusuoai.top) | internlm/internlm2_5-7b-chat | 0.61 | 0.54 | 0.84 | 73.88 | 5 |
| 21 | Joyue (ai.joyue.joyuerpa.com:3001) | deepseek-r1:32b | 0.62 | 0.45 | 1.59 | 39.64 | 15 |
| 22 | ai.joyue.joyuerpa.com:3001 | deepseek-r1:32b | 0.62 | 0.45 | 1.59 | 39.64 | 15 |
| 23 | 1596492088489700.cn-shanghai.pai-eas.aliyuncs.com | UI-TARS-72B-DPO | 0.62 | 0.49 | 1.25 | 21.52 | 10 |
| 24 | SiliconFlow (api.siliconflow.cn) | Pro/Qwen/Qwen2.5-Coder-7B-Instruct | 0.63 | 0.58 | 0.66 | 28.32 | 5 |
| 25 | New API (api.lianwusuoai.top) | Pro/Qwen/Qwen2-7B-Instruct | 0.63 | 0.55 | 0.84 | 96.25 | 5 |
| 26 | SiliconFlow (api.siliconflow.cn) | Qwen/Qwen2-7B-Instruct | 0.64 | 0.55 | 0.71 | 93.55 | 5 |
| 27 | ChatAnywhere (api.chatanywhere.tech) | gpt-4.1-nano | 0.65 | 0.57 | 0.80 | 208.55 | 5 |
| 28 | New API (api.lianwusuoai.top) | THUDM/glm-4-9b-chat | 0.65 | 0.58 | 0.85 | 71.17 | 5 |
| 29 | New API (api.lianwusuoai.top) | Qwen/Qwen2-1.5B-Instruct | 0.67 | 0.52 | 0.82 | 213.84 | 5 |
| 30 | api.centml.com | deepseek-ai/DeepSeek-V3-0324 | 0.67 | 0.58 | 0.79 | 59.16 | 5 |
| 31 | New API (api.lianwusuoai.top) | Qwen/Qwen2-7B-Instruct | 0.68 | 0.54 | 0.88 | 93.64 | 5 |
| 32 | New API (api.lianwusuoai.top) | 免费Qwen2-1.5B | 0.68 | 0.52 | 1.00 | 203.36 | 5 |
| 33 | New API (api.lianwusuoai.top) | Qwen/Qwen2.5-14B-Instruct | 0.69 | 0.59 | 1.06 | 77.92 | 5 |
| 34 | New API (api.lianwusuoai.top) | 免费Qwen2.5-14B | 0.69 | 0.61 | 0.93 | 77.95 | 5 |
| 35 | SiliconFlow (api.siliconflow.cn) | Qwen/Qwen2.5-Coder-7B-Instruct | 0.71 | 0.61 | 1.05 | 41.42 | 10 |
| 36 | New API (api.lianwusuoai.top) | Pro/Qwen/Qwen2-VL-7B-Instruct | 0.71 | 0.57 | 0.99 | 93.23 | 5 |
| 37 | 47.99.172.64:23014 | Qwen2.5-32B-Instruct | 0.71 | 0.69 | 0.73 | 13.99 | 5 |
| 38 | New API (api.lianwusuoai.top) | THUDM/chatglm3-6b | 0.74 | 0.59 | 0.92 | 32.18 | 5 |
| 39 | New API (api.lianwusuoai.top) | Qwen/Qwen2.5-72B-Instruct | 0.74 | 0.67 | 0.93 | 34.81 | 5 |
| 40 | New API (api.lianwusuoai.top) | deepseek-ai/deepseek-vl2 | 0.75 | 0.53 | 0.97 | 125.77 | 5 |
| 41 | New API (api.lianwusuoai.top) | Pro/Qwen/Qwen2.5-VL-7B-Instruct | 0.76 | 0.55 | 1.02 | 85.51 | 5 |
| 42 | New API (api.lianwusuoai.top) | 免费GLM-4-9B-128K | 0.77 | 0.57 | 0.91 | 73.70 | 5 |
| 43 | New API (api.lianwusuoai.top) | internlm/internlm2_5-20b-chat | 0.77 | 0.59 | 0.94 | 61.81 | 5 |
| 44 | New API (api.lianwusuoai.top) | Qwen/Qwen2-VL-72B-Instruct | 0.78 | 0.70 | 0.99 | 28.50 | 5 |
| 45 | New API (api.lianwusuoai.top) | Qwen/Qwen2.5-7B-Instruct | 0.79 | 0.69 | 0.98 | 20.51 | 5 |
| 46 | SiliconFlow (api.siliconflow.cn) | Qwen/Qwen2.5-32B-Instruct | 0.81 | 0.67 | 1.03 | 55.30 | 5 |
| 47 | Zhongzhuan Chat (api.zhongzhuan.chat) | gemini-2.0-flash | 0.81 | 0.68 | 0.86 | 190.98 | 5 |
| 48 | New API (api.lianwusuoai.top) | Qwen/QwQ-32B-Preview | 0.81 | 0.63 | 0.96 | 72.57 | 5 |
| 49 | Hugging Face (akemidia-mua.hf.space) | Grok 3 | 0.82 | 0.64 | 1.12 | 51.18 | 5 |
| 50 | AI Tools (platform.aitools.cfd) | zhipu/glm-4-flash | 0.82 | 0.39 | 3.98 | 39.92 | 465 |
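
The page does not say how these figures are measured, so the following is only a minimal sketch of one plausible way to derive the table's metrics (average, best, and worst first-token latency, average tokens per second, and total tests) from per-token arrival timestamps of streamed responses. The `RunTiming` type, the function names, and the choice to measure throughput from the first token onward are all assumptions made for illustration, not the leaderboard's actual method.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class RunTiming:
    """Timestamps in seconds for one streamed completion (hypothetical record)."""
    request_sent: float        # moment the request was sent
    token_times: List[float]   # arrival time of each generated token, in order


def first_token_latency(run: RunTiming) -> Optional[float]:
    """Seconds from sending the request to the first token; None if nothing arrived."""
    if not run.token_times:
        return None
    return run.token_times[0] - run.request_sent


def tokens_per_second(run: RunTiming) -> float:
    """Tokens received after the first one, divided by the time they took.

    Assumption: throughput excludes the wait for the first token.
    Returns 0.0 when fewer than two tokens arrived.
    """
    if len(run.token_times) < 2:
        return 0.0
    duration = run.token_times[-1] - run.token_times[0]
    if duration <= 0:
        return 0.0
    return (len(run.token_times) - 1) / duration


def summarize(runs: List[RunTiming]) -> Dict[str, Optional[float]]:
    """Aggregate a list of runs into one leaderboard-style row."""
    latencies = [
        latency
        for latency in (first_token_latency(r) for r in runs)
        if latency is not None
    ]
    return {
        "avg_latency": mean(latencies) if latencies else None,
        "best_latency": min(latencies) if latencies else None,
        "worst_latency": max(latencies) if latencies else None,
        "avg_tps": mean(tokens_per_second(r) for r in runs) if runs else 0.0,
        "total_tests": len(runs),
    }


if __name__ == "__main__":
    # Two made-up runs, purely for demonstration.
    demo_runs = [
        RunTiming(request_sent=0.0, token_times=[0.25, 0.5, 0.75, 1.0, 1.25]),
        RunTiming(request_sent=0.0, token_times=[0.5, 0.75, 1.0]),
    ]
    print(summarize(demo_runs))
```

Under this sketch, a run that never produces a token contributes nothing to the latency statistics and counts as 0.0 tokens per second, while still counting toward the total number of tests.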