Leaderboard
Model performance rankings based on speed test results. Compare models across different providers and endpoints.
Ranked by average time to first token. Lower is better for responsiveness.
| Rank | Provider | Model | First Token Latency | Avg tokens per second | Total Tests |
|---|---|---|---|---|---|
| 1 |  | qwen/qwen3-coder | 0.03 s (best -, worst 1.49) | 0.81 t/s | 90 |
| 2 |  | echo-flash | 0.08 s (best 0.07, worst 0.12) | 685.41 t/s | 5 |
| 3 | A3 (a3.awsl.app) | kimi-k2-instruct-0905 | 0.29 s (best 0.26, worst 0.34) | 226.90 t/s | 10 |
| 4 | AI Tools (platform.aitools.cfd) | zhipu/glm-4-9b | 0.34 s (best -, worst 0.86) | 42.65 t/s | 30 |
| 5 | 钠 API (us.naapi.cc) | llama3.1-8B | 0.39 s (best 0.31, worst 1.06) | 37444.09 t/s | 20 |
| 6 | 钠 API (us.naapi.cc) | llama3.1-8B | 0.39 s (best 0.31, worst 1.06) | 37444.09 t/s | 20 |
| 7 | XJY API (api.xinjianya.top) | meta/llama-3.1-8b-instruct | 0.42 s (best 0.39, worst 0.45) | 200.27 t/s | 15 |
| 8 | AI Tools (platform.aitools.cfd) | deepseek/deepseek-v3-0324 | 0.42 s (best -, worst 3.28) | 0.00 t/s | 210 |
| 9 | api.xiaomimimo.com | mimo-v2-flash | 0.45 s (best 0.34, worst 0.75) | 129.75 t/s | 5 |
| 10 | Seamee API (napi.seaya.link) | GLM-4-FlashX | 0.47 s (best 0.43, worst 0.55) | 64.39 t/s | 5 |
| 11 | 玄黄 (apis.soys.site) | gpt-oss-120b | 0.49 s (best 0.40, worst 0.73) | 1796.31 t/s | 5 |
| 12 | XJY API (api.xinjianya.top) | google/gemma-2-9b-it | 0.52 s (best 0.48, worst 0.68) | 36.05 t/s | 10 |
| 13 | XJY API (api.xinjianya.top) | google/gemma-3-1b-it | 0.52 s (best 0.43, worst 0.99) | 213.18 t/s | 10 |
| 14 | 钠 API (naapi.cc) | llama3.1-8B | 0.52 s (best 0.34, worst 1.09) | 38588.02 t/s | 25 |
| 15 | 钠 API (naapi.cc) | llama3.1-8B | 0.52 s (best 0.34, worst 1.09) | 38588.02 t/s | 25 |
| 16 | XJY API (api.xinjianya.top) | google/gemma-2-27b-it | 0.55 s (best 0.47, worst 0.97) | 44.61 t/s | 10 |
| 17 | newapi.medu.chat | gpt-oss-120b | 0.56 s (best 0.40, worst 1.38) | 1677.82 t/s | 10 |
| 18 | api.amethyst.ltd | jimmy | 0.58 s (best 0.42, worst 1.18) | 86213.91 t/s | 10 |
| 19 | imsnake.dart.us.ci | jimmy | 0.59 s (best 0.46, worst 1.07) | 101506.95 t/s | 5 |
| 20 | AI Tools (platform.aitools.cfd) | zhipu/glm-4v-flash | 0.59 s (best 0.32, worst 2.76) | 57.28 t/s | 25 |
| 21 | XJY API (api.xinjianya.top) | google/gemma-7b | 0.60 s (best 0.47, worst 1.15) | 45.48 t/s | 10 |
| 22 | XJY API (api.xinjianya.top) | google/gemma-3n-e4b-it | 0.60 s (best 0.54, worst 0.78) | 19.36 t/s | 5 |
| 23 | XJY API (api.xinjianya.top) | ibm/granite-guardian-3.0-8b | 0.61 s (best 0.54, worst 1.15) | 93.66 t/s | 10 |
| 24 | XJY API (api.xinjianya.top) | igenius/italia_10b_instruct_16k | 0.62 s (best 0.48, worst 1.15) | 136.00 t/s | 10 |
| 25 | api.amethyst.ltd | kimi-k2-instruct-0905 | 0.63 s (best 0.47, worst 0.94) | 74.54 t/s | 5 |
| 26 | XJY API (api.xinjianya.top) | meta/llama-3.2-3b-instruct | 0.65 s (best 0.44, worst 0.97) | 101.40 t/s | 5 |
| 27 | Seamee API (napi.seaya.link) | auto-translator | 0.67 s (best 0.44, worst 1.19) | 99.75 t/s | 25 |
| 28 | integrate.api.nvidia.com | qwen/qwen3.5-397b-a17b | 0.71 s (best 0.35, worst 1.36) | 23.36 t/s | 5 |
| 29 | 钠 API (us.naapi.cc) | gpt-oss-120b | 0.72 s (best 0.45, worst 1.93) | 1671.10 t/s | 20 |
| 30 | 钠 API (us.naapi.cc) | gpt-oss-120b | 0.72 s (best 0.45, worst 1.93) | 1671.10 t/s | 20 |
| 31 | 素墨API (apifree.rensumo.top) | echo | 0.75 s (best 0.55, worst 1.49) | 6152.15 t/s | 15 |
| 32 | 钠 API (naapi.cc) | zai-org/GLM-4.5-Air | 0.78 s (best 0.54, worst 1.45) | 68.81 t/s | 5 |
| 33 | 钠 API (naapi.cc) | zai-org/GLM-4.5-Air | 0.78 s (best 0.54, worst 1.45) | 68.81 t/s | 5 |
| 34 | SiliconFlow (api.siliconflow.cn) | deepseek-ai/DeepSeek-V3.2 | 0.80 s (best 0.41, worst 1.40) | 21.76 t/s | 10 |
| 35 | AI Tools (platform.aitools.cfd) | qwen/qwen3-8b | 0.80 s (best 0.55, worst 1.50) | 28.62 t/s | 20 |
| 36 | AI Tools (platform.aitools.cfd) | qwen/qwen2.5-7b | 0.81 s (best 0.34, worst 2.43) | 103.58 t/s | 40 |
| 37 | AI Tools (platform.aitools.cfd) | zhipu/glm-4-flash | 0.82 s (best 0.58, worst 1.23) | 27.94 t/s | 5 |
| 38 | 素墨API (apifree.rensumo.top) | google/gemma-3-1b-it | 0.87 s (best 0.60, worst 1.68) | 176.46 t/s | 5 |
| 39 | DeepSeek (api.deepseek.com) | deepseek-chat | 0.88 s (best 0.73, worst 1.11) | 34.84 t/s | 15 |
| 40 | A3 (a3.awsl.app) | groq/compound-mini | 0.89 s (best 0.53, worst 1.40) | 477.57 t/s | 5 |
| 41 | 玄黄 (apis.soys.site) | 翻译/标题/OCR模型 | 0.91 s (best 0.73, worst 1.21) | 260.68 t/s | 5 |
| 42 | DeepSeek (api.deepseek.com) | deepseek-chat | 0.91 s (best 0.72, worst 1.24) | 40.56 t/s | 10 |
| 43 | 钠 API (naapi.cc) | gpt-oss-120b | 0.91 s (best 0.54, worst 2.00) | 1637.28 t/s | 10 |
| 44 | 钠 API (naapi.cc) | gpt-oss-120b | 0.91 s (best 0.54, worst 2.00) | 1637.28 t/s | 10 |
| 45 | ModelScope (api-inference.modelscope.cn) | Qwen/Qwen2.5-7B-Instruct | 0.91 s (best 0.61, worst 1.95) | 43.96 t/s | 5 |
| 46 | 素墨API (apifree.rensumo.top) | gpt-oss-120b | 0.94 s (best 0.59, worst 1.26) | 970.77 t/s | 5 |
| 47 | 云智API (yunzhiapi.cn) | DeepSeek-V3.2 | 0.94 s (best 0.59, worst 2.00) | 0.00 t/s | 5 |
| 48 | 素墨API (apifree.rensumo.top) | llama3.1-8B | 0.94 s (best 0.73, worst 1.32) | 1421.44 t/s | 10 |
| 49 | New API (api.lianwusuoai.top) | 翻译 | 0.97 s (best 0.72, worst 1.26) | 60.49 t/s | 5 |
| 50 | AI Tools (platform.aitools.cfd) | zhipu/glm-4-flash | 0.97 s (best 0.42, worst 28.79) | 30.14 t/s | 1680 |
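
The page does not say how these columns are computed. As a rough sketch only, the Python below shows one plausible way to turn per-test timings into a row like those above. The `SpeedTest` record, the `summarize` helper, and the definition of tokens per second (output tokens divided by the time between the first and last token) are all assumptions made for illustration, not the leaderboard's actual method.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SpeedTest:
    """One hypothetical test run; times are seconds since the request was sent."""
    first_token_at: float  # when the first streamed token arrived
    finished_at: float     # when the last token arrived
    output_tokens: int     # number of tokens generated


def summarize(tests: list[SpeedTest]) -> dict:
    """Aggregate runs into leaderboard-style columns (assumed definitions)."""
    latencies = [t.first_token_at for t in tests]
    rates = []
    for t in tests:
        gen_time = t.finished_at - t.first_token_at
        # Skip runs with no measurable generation window to avoid dividing by zero.
        if gen_time > 0:
            rates.append(t.output_tokens / gen_time)
    return {
        "avg_first_token_s": round(mean(latencies), 2),
        "best_s": round(min(latencies), 2),
        "worst_s": round(max(latencies), 2),
        "avg_tokens_per_s": round(mean(rates), 2) if rates else 0.0,
        "total_tests": len(tests),
    }


# Made-up sample runs, not taken from the leaderboard.
runs = [
    SpeedTest(first_token_at=0.25, finished_at=1.25, output_tokens=200),
    SpeedTest(first_token_at=0.5, finished_at=2.5, output_tokens=300),
    SpeedTest(first_token_at=0.75, finished_at=1.75, output_tokens=250),
]
summary = summarize(runs)
print(summary)
```

Under these assumed definitions, sorting the summaries by `avg_first_token_s` in ascending order would give a ranking of the same shape as the table.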