Leaderboard
Model performance rankings based on speed test results. Compare models across different providers and endpoints.
Ranked by average time to first token. Lower is better for responsiveness.
| Rank | Provider | Endpoint | Model | Avg first token latency (s) | Best (s) | Worst (s) | Avg tokens per second | Total tests |
|---|---|---|---|---|---|---|---|---|
| 1 | - | - | google/gemini-2.0-flash-exp | 0.17 | - | 1.67 | 18.95 | 40 |
| 2 | - | - | llama3.1-8b | 0.19 | 0.15 | 0.21 | 2142.09 | 5 |
| 3 | Cerebras | api.cerebras.ai | llama-3.3-70b | 0.25 | 0.15 | 0.32 | 1532.55 | 5 |
| 4 | Hugging Face | router.huggingface.co | meta-llama/Llama-3.3-70B-Instruct | 0.26 | 0.24 | 0.28 | 416.19 | 5 |
| 5 | Hugging Face | router.huggingface.co | meta-llama/Llama-3.3-70B-Instruct | 0.26 | 0.24 | 0.28 | 416.19 | 5 |
| 6 | Mistral AI | api.mistral.ai | mistral-small-latest | 0.33 | 0.30 | 0.37 | 74.79 | 5 |
| 7 | Mistral AI | api.mistral.ai | mistral-small-latest | 0.33 | 0.30 | 0.37 | 74.79 | 5 |
| 8 | Mistral AI | api.mistral.ai | mistral-medium | 0.35 | 0.32 | 0.37 | 99.70 | 5 |
| 9 | Mistral AI | api.mistral.ai | mistral-medium | 0.35 | 0.32 | 0.37 | 99.70 | 5 |
| 10 | Mistral AI | api.mistral.ai | mistral-medium-latest | 0.37 | 0.34 | 0.43 | 101.94 | 5 |
| 11 | Mistral AI | api.mistral.ai | mistral-medium-latest | 0.37 | 0.34 | 0.43 | 101.94 | 5 |
| 12 | Mistral AI | api.mistral.ai | mistral-tiny-latest | 0.38 | 0.36 | 0.42 | 155.43 | 5 |
| 13 | Mistral AI | api.mistral.ai | mistral-tiny-latest | 0.38 | 0.36 | 0.42 | 155.43 | 5 |
| 14 | Mistral AI | api.mistral.ai | magistral-small-latest | 0.39 | 0.33 | 0.56 | 189.91 | 5 |
| 15 | Mistral AI | api.mistral.ai | magistral-small-latest | 0.39 | 0.33 | 0.56 | 189.91 | 5 |
| 16 | Mistral AI | api.mistral.ai | open-mistral-nemo | 0.40 | 0.34 | 0.46 | 200.44 | 5 |
| 17 | Mistral AI | api.mistral.ai | open-mistral-nemo | 0.40 | 0.34 | 0.46 | 200.44 | 5 |
| 18 | Realpics | realpics.cn:2234 | qwen3-30b-a3b | 0.45 | 0.39 | 0.62 | 114.09 | 5 |
| 19 | realpics.cn:2234 | realpics.cn:2234 | qwen3-30b-a3b | 0.45 | 0.39 | 0.62 | 114.09 | 5 |
| 20 | realpics.cn:2234 | realpics.cn:2234 | qwen3-30b-a3b | 0.45 | 0.39 | 0.62 | 114.09 | 5 |
| 21 | Mistral AI | api.mistral.ai | ministral-14b-latest | 0.46 | 0.31 | 0.66 | 129.12 | 5 |
| 22 | Mistral AI | api.mistral.ai | ministral-14b-latest | 0.46 | 0.31 | 0.66 | 129.12 | 5 |
| 23 | Fireworks AI | api.fireworks.ai | accounts/fireworks/models/deepseek-v3p2 | 0.49 | 0.45 | 0.62 | 114.61 | 5 |
| 24 | Mistral AI | api.mistral.ai | ministral-3b-2410 | 0.50 | 0.30 | 0.70 | 332.18 | 5 |
| 25 | Mistral AI | api.mistral.ai | ministral-3b-2410 | 0.50 | 0.30 | 0.70 | 332.18 | 5 |
| 26 | DashScope | dashscope.aliyuncs.com | qwen-flash | 0.50 | 0.33 | 1.10 | 137.25 | 15 |
| 27 | Cerebras | api.cerebras.ai | gpt-oss-120b | 0.54 | 0.25 | 1.05 | 1920.13 | 5 |
| 28 | api.vectorengine.ai | api.vectorengine.ai | gemini-2.0-flash | 0.56 | 0.50 | 0.63 | 175.01 | 5 |
| 29 | DashScope | dashscope.aliyuncs.com | qwen-vl-plus-2025-07-10 | 0.56 | 0.37 | 1.20 | 112.44 | 65 |
| 30 | gpt.cosmoplat.com | gpt.cosmoplat.com | qwen3-235b-fp8 | 0.56 | 0.27 | 1.24 | 94.11 | 5 |
| 31 | DashScope | dashscope.aliyuncs.com | qwen-plus-2025-04-28 | 0.57 | 0.41 | 1.06 | 43.49 | 10 |
| 32 | ai-hub.square-llm.com | ai-hub.square-llm.com | anthropic/claude-haiku-4.5 | 0.57 | 0.44 | 0.72 | 98.63 | 5 |
| 33 | DashScope | dashscope.aliyuncs.com | tongyi-intent-detect-v3 | 0.57 | 0.43 | 1.05 | 83.35 | 5 |
| 34 | Mistral AI | api.mistral.ai | mistral-large-2411 | 0.58 | 0.38 | 1.31 | 45.26 | 5 |
| 35 | Mistral AI | api.mistral.ai | mistral-large-2411 | 0.58 | 0.38 | 1.31 | 45.26 | 5 |
| 36 | api.vectorengine.ai | api.vectorengine.ai | gemini-2.0-flash-lite | 0.64 | 0.33 | 2.46 | 135.07 | 10 |
| 37 | DashScope | dashscope.aliyuncs.com | tongyi-xiaomi-analysis-pro | 0.64 | 0.46 | 1.12 | 65.76 | 5 |
| 38 | DashScope | dashscope.aliyuncs.com | qwen-plus | 0.66 | 0.46 | 1.07 | 57.34 | 5 |
| 39 | New API | new.123nhh.xyz | gemini-flash-lite-latest | 0.67 | 0.38 | 0.97 | 369.22 | 5 |
| 40 | DashScope | dashscope.aliyuncs.com | qwen-plus-latest | 0.69 | 0.52 | 1.16 | 57.95 | 5 |
| 41 | New API | api.seosycy.com | deepseek-v3.2 | 0.71 | 0.51 | 1.29 | 27.18 | 10 |
| 42 | free_chatgpt_api | free.v36.cm | gpt-4o-mini | 0.73 | 0.55 | 0.96 | 116.21 | 10 |
| 43 | DashScope | dashscope.aliyuncs.com | qwen3-max-preview | 0.77 | 0.55 | 1.56 | 44.58 | 5 |
| 44 | SiliconFlow | api.siliconflow.cn | Qwen/Qwen3-Coder-30B-A3B-Instruct | 0.78 | 0.57 | 1.27 | 38.23 | 5 |
| 45 | DashScope | dashscope.aliyuncs.com | qwen-max-latest | 0.79 | 0.48 | 1.47 | 37.07 | 5 |
| 46 | Fireworks AI | api.fireworks.ai | accounts/fireworks/models/gpt-oss-120b | 0.79 | 0.60 | 1.00 | 144.71 | 10 |
| 47 | DashScope | dashscope.aliyuncs.com | qwen-plus-2025-12-01 | 0.80 | 0.49 | 1.23 | 54.28 | 10 |
| 48 | 123.54.215.139:8008 | 123.54.215.139:8008 | qwen3-32b | 0.81 | 0.29 | 2.11 | 22.93 | 5 |
| 49 | ETOS API | api.ericterminal.com | moonshotai/kimi-k2-instruct-0905 | 0.83 | 0.72 | 1.06 | 148.30 | 5 |
| 50 | 简小智API中转站 | newapi.jianxiaozhi.chat:56897 | deepseek-v3.2-exp | 0.85 | 0.64 | 1.05 | 27.14 | 5 |
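
For readers who want to reproduce figures like these from their own measurements, here is a minimal Python sketch of how the average, best, and worst first token latency and the average tokens per second might be derived from a set of test runs. The `SpeedTestRun` record, its field names, and the `summarize` helper are hypothetical and not taken from the leaderboard. The sketch also assumes throughput is measured over the generation window only (first token to last token), which may differ from how this leaderboard computes it.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SpeedTestRun:
    """One hypothetical speed test; timestamps are in seconds."""
    request_sent: float
    first_token_at: float
    last_token_at: float
    output_tokens: int


def first_token_latency(run: SpeedTestRun) -> float:
    # Time between sending the request and receiving the first token.
    return run.first_token_at - run.request_sent


def tokens_per_second(run: SpeedTestRun) -> float:
    # Assumption: throughput excludes the wait for the first token.
    generation_time = run.last_token_at - run.first_token_at
    return run.output_tokens / generation_time if generation_time > 0 else 0.0


def summarize(runs: list[SpeedTestRun]) -> dict[str, float]:
    # Aggregate per-run measurements into one leaderboard-style row.
    latencies = [first_token_latency(r) for r in runs]
    throughputs = [tokens_per_second(r) for r in runs]
    return {
        "avg_latency_s": round(mean(latencies), 2),
        "best_latency_s": round(min(latencies), 2),
        "worst_latency_s": round(max(latencies), 2),
        "avg_tokens_per_s": round(mean(throughputs), 2),
        "total_tests": len(runs),
    }
```

Calling `summarize` on the recorded runs for one provider and model pair yields the values for a single row. With only 5 tests behind most rows, small differences in average latency between neighboring ranks should be read with caution.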