Leaderboard
Model performance rankings based on speed test results. Compare models across different providers and endpoints.
Entries are ranked by average time to first token, in seconds. Lower is better for responsiveness.
| Rank | Provider | Endpoint | Model | Avg first-token latency (s) | Best (s) | Worst (s) | Avg tokens/s | Total tests |
|---|---|---|---|---|---|---|---|---|
| 1 | - | - | google/gemini-2.0-flash-exp | 0.09 | - | 0.90 | 11.80 | 20 |
| 2 | - | - | meta-llama/Llama-3.3-70B-Instruct | 0.26 | 0.24 | 0.28 | 416.19 | 5 |
| 3 | Hugging Face | router.huggingface.co | meta-llama/Llama-3.3-70B-Instruct | 0.26 | 0.24 | 0.28 | 416.19 | 5 |
| 4 | Fireworks AI | api.fireworks.ai | accounts/fireworks/models/deepseek-v3p2 | 0.49 | 0.45 | 0.62 | 114.61 | 5 |
| 5 | DashScope | dashscope.aliyuncs.com | qwen-flash | 0.51 | 0.36 | 0.91 | 142.89 | 5 |
| 6 | gpt.cosmoplat.com | gpt.cosmoplat.com | qwen3-235b-fp8 | 0.56 | 0.27 | 1.24 | 94.11 | 5 |
| 7 | DashScope | dashscope.aliyuncs.com | tongyi-intent-detect-v3 | 0.57 | 0.43 | 1.05 | 83.35 | 5 |
| 8 | DashScope | dashscope.aliyuncs.com | tongyi-xiaomi-analysis-pro | 0.64 | 0.46 | 1.12 | 65.76 | 5 |
| 9 | DashScope | dashscope.aliyuncs.com | qwen-plus | 0.66 | 0.46 | 1.07 | 57.34 | 5 |
| 10 | DashScope | dashscope.aliyuncs.com | qwen-plus-2025-12-01 | 0.67 | 0.49 | 1.17 | 56.15 | 5 |
| 11 | DashScope | dashscope.aliyuncs.com | qwen-plus-latest | 0.69 | 0.52 | 1.16 | 57.95 | 5 |
| 12 | DashScope | dashscope.aliyuncs.com | deepseek-v3.2 | 0.69 | 0.52 | 1.36 | 34.40 | 10 |
| 13 | DashScope | dashscope.aliyuncs.com | qwen-max-latest | 0.79 | 0.48 | 1.47 | 37.07 | 5 |
| 14 | Fireworks AI | api.fireworks.ai | accounts/fireworks/models/gpt-oss-120b | 0.79 | 0.60 | 1.00 | 144.71 | 10 |
| 15 | AI Tools | platform.aitools.cfd | qwen/qwen2.5-7b | 0.80 | 0.63 | 1.26 | 96.94 | 5 |
| 16 | 心流 | apis.iflow.cn | deepseek-v3.2 | 1.14 | 0.89 | 1.47 | 44.90 | 10 |
| 17 | ark | ark.cn-beijing.volces.com | deepseek-v3-2-251201 | 1.23 | 0.81 | 1.44 | 32.39 | 5 |
| 18 | Fireworks AI | api.fireworks.ai | accounts/fireworks/models/qwen3-vl-235b-a22b-thinking | 1.25 | 0.44 | 3.94 | 51.54 | 5 |
| 19 | DashScope | dashscope.aliyuncs.com | qwen3-max-2025-09-23 | 1.36 | 0.67 | 2.10 | 28.32 | 5 |
| 20 | sd.rnglg2.top:30000 | sd.rnglg2.top:30000 | gpt-5.2 | 1.43 | 1.01 | 2.59 | 55.45 | 5 |
| 21 | AI Tools | platform.aitools.cfd | zhipu/glm-4-flash | 1.53 | 1.34 | 1.87 | 33.72 | 5 |
| 22 | AI Tools | platform.aitools.cfd | zhipu/glm-4-flash | 1.63 | 0.91 | 5.67 | 29.71 | 40 |
| 23 | 酒馆无限制免费API | api2.aoyou.shop | 酒馆-Flash-Long | 1.75 | 1.47 | 2.51 | 194.62 | 5 |
| 24 | sd.rnglg2.top:30000 | sd.rnglg2.top:30000 | gpt-5.1-codex-mini | 1.91 | 1.37 | 3.16 | 248.33 | 5 |
| 25 | Fireworks AI | api.fireworks.ai | accounts/fireworks/models/minimax-m2p1 | 1.91 | 1.02 | 4.32 | 185.81 | 10 |
| 26 | sd.rnglg2.top:30000 | sd.rnglg2.top:30000 | gpt-oss-120b-medium | 2.58 | 1.69 | 2.98 | 329.97 | 5 |
| 27 | AI Tools | platform.aitools.cfd | qwen/qwen2.5-7b | 2.66 | 0.71 | 12.37 | 86.57 | 10 |
| 28 | AI Tools | platform.aitools.cfd | zhipu/glm-4v-flash | 2.95 | 0.86 | 8.95 | 55.76 | 5 |
| 29 | sd.rnglg2.top:30000 | sd.rnglg2.top:30000 | gpt-5-codex-mini | 3.35 | 1.80 | 4.23 | 327.07 | 5 |
| 30 | Fireworks AI | api.fireworks.ai | accounts/fireworks/models/kimi-k2-thinking | 3.47 | 2.18 | 7.16 | 127.75 | 5 |
| 31 | Cerebras | api.cerebras.ai | zai-glm-4.7 | 3.57 | 1.90 | 8.17 | 454.25 | 5 |
| 32 | SiliconFlow | api.siliconflow.cn | Pro/deepseek-ai/DeepSeek-V3.2 | 6.05 | 4.18 | 8.06 | 44.24 | 5 |
| 33 | Gitee AI | ai.gitee.com | DeepSeek-V3.2 | 6.18 | 4.73 | 7.81 | 35.11 | 5 |
| 34 | zenmux.ai | zenmux.ai | deepseek/deepseek-v3.2 | 6.43 | 4.82 | 7.91 | 51.09 | 5 |
| 35 | DeepSeek | api.deepseek.com | deepseek-reasoner | 6.77 | 5.82 | 8.05 | 34.30 | 5 |
| 36 | zenmux.ai | zenmux.ai | deepseek/deepseek-reasoner | 7.02 | 6.03 | 7.89 | 30.30 | 5 |
| 37 | DashScope | dashscope.aliyuncs.com | qwen3-235b-a22b-thinking-2507 | 8.90 | 4.83 | 14.04 | 68.48 | 5 |
| 38 | Fireworks AI | api.fireworks.ai | accounts/fireworks/models/glm-4p6 | 10.03 | 3.70 | 15.34 | 69.26 | 5 |
| 39 | Fireworks AI | api.fireworks.ai | accounts/fireworks/models/glm-4p7 | 10.19 | 6.14 | 20.15 | 107.96 | 15 |
| 40 | Fireworks AI | api.fireworks.ai | accounts/fireworks/models/glm-4p7 | 11.45 | 8.45 | 15.19 | 98.48 | 5 |
| 41 | AI Tools | platform.aitools.cfd | zhipu/glm-4.5-flash | 12.30 | 6.36 | 28.22 | 51.13 | 5 |
| 42 | AI Tools | platform.aitools.cfd | zhipu/glm-4.6v-flash | 12.98 | 5.79 | 22.31 | 134.33 | 5 |
| 43 | DashScope | dashscope.aliyuncs.com | deepseek-r1 | 13.81 | 8.59 | 21.62 | 37.34 | 5 |
| 44 | 算了么 API | api.suanli.cn | Qwen/Qwen3-VL-32B-Thinking | 14.29 | 7.51 | 24.50 | 44.46 | 5 |
| 45 | DashScope | dashscope.aliyuncs.com | kimi-k2-thinking | 14.94 | 6.95 | 26.88 | 43.87 | 5 |
| 46 | AI Tools | platform.aitools.cfd | deepseek/deepseek-r1-0528 | 15.65 | 6.83 | 26.81 | 28.38 | 5 |
| 47 | api.siliconflow.com | api.siliconflow.com | zai-org/GLM-4.7 | 16.10 | 13.11 | 20.02 | 74.72 | 5 |
| 48 | AI Tools | platform.aitools.cfd | deepseek/deepseek-r1-0528 | 16.54 | 10.83 | 28.30 | 29.19 | 5 |
| 49 | AI Tools | platform.aitools.cfd | qwen/qwen3-coder | 16.95 | - | 93.72 | 20.89 | 10 |
| 50 | SiliconFlow | api.siliconflow.cn | Pro/zai-org/GLM-4.7 | 17.14 | 14.50 | 21.95 | 73.01 | 5 |
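
The page does not say how the rankings are computed. As a rough illustration only, the sketch below shows one way per-test measurements could be rolled up into the columns shown in the table (average, best, and worst first-token latency, average tokens per second, total tests) and ordered by average first-token latency. The `SpeedTest` record, the `summarize` function, and the sample values are all hypothetical and are not taken from the leaderboard's implementation or data.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SpeedTest:
    """One hypothetical speed-test run against a provider/model pair."""
    provider: str
    model: str
    ttft_s: float        # seconds from sending the request to the first token
    tokens_per_s: float  # generation throughput measured for this run


def summarize(tests):
    """Group runs by (provider, model) and rank by average first-token latency."""
    groups = {}
    for t in tests:
        groups.setdefault((t.provider, t.model), []).append(t)

    rows = []
    for (provider, model), runs in groups.items():
        ttfts = [r.ttft_s for r in runs]
        rows.append({
            "provider": provider,
            "model": model,
            "avg_ttft_s": mean(ttfts),
            "best_ttft_s": min(ttfts),
            "worst_ttft_s": max(ttfts),
            "avg_tokens_per_s": mean(r.tokens_per_s for r in runs),
            "total_tests": len(runs),
        })

    # Lower average first-token latency ranks higher (assumed ordering rule).
    rows.sort(key=lambda row: row["avg_ttft_s"])
    for rank, row in enumerate(rows, start=1):
        row["rank"] = rank
    return rows


if __name__ == "__main__":
    # Made-up sample runs, for illustration only.
    sample = [
        SpeedTest("provider-a", "model-x", 0.5, 100.0),
        SpeedTest("provider-a", "model-x", 1.5, 200.0),
        SpeedTest("provider-b", "model-y", 0.25, 50.0),
    ]
    for row in summarize(sample):
        print(row["rank"], row["provider"], row["model"], row["avg_ttft_s"])
```

Grouping by both provider and model would keep the same model served from different endpoints on separate rows, as the table does for `deepseek-v3.2`. Note that the table also lists some provider and model pairs more than once, so the real grouping key is probably finer than this.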