Leaderboard
Model performance rankings based on speed test results. Compare models across different providers and endpoints.
Ranked by average time to first token. Lower is better for responsiveness.
| Rank | Provider (endpoint) | Model | Avg first-token latency (s) | Best (s) | Worst (s) | Avg throughput (tokens/s) | Total tests |
|---|---|---|---|---|---|---|---|
| 1 | | meta/llama-3.1-8b-instruct | 0.42 | 0.39 | 0.45 | 200.27 | 15 |
| 2 | | zhipu/glm-4v-flash | 0.53 | 0.33 | 0.99 | 54.32 | 5 |
| 3 | api.amethyst.ltd | jimmy | 0.58 | 0.42 | 1.18 | 86213.91 | 10 |
| 4 | XJY API (api.xinjianya.top) | google/gemma-7b | 0.59 | 0.47 | 0.99 | 45.13 | 5 |
| 5 | imsnake.dart.us.ci | jimmy | 0.59 | 0.46 | 1.07 | 101506.95 | 5 |
| 6 | XJY API (api.xinjianya.top) | ibm/granite-guardian-3.0-8b | 0.61 | 0.54 | 1.15 | 93.66 | 10 |
| 7 | AI Tools (platform.aitools.cfd) | zhipu/glm-4-9b | 0.64 | 0.46 | 0.80 | 52.73 | 5 |
| 8 | Seamee API (napi.seaya.link) | auto-translator | 0.70 | 0.49 | 1.19 | 103.35 | 10 |
| 9 | integrate.api.nvidia.com | qwen/qwen3.5-397b-a17b | 0.71 | 0.35 | 1.36 | 23.36 | 5 |
| 10 | 素墨API (apifree.rensumo.top) | echo | 0.75 | 0.55 | 1.49 | 6152.15 | 15 |
| 11 | SiliconFlow (api.siliconflow.cn) | deepseek-ai/DeepSeek-V3.2 | 0.80 | 0.41 | 1.40 | 21.76 | 10 |
| 12 | AI Tools (platform.aitools.cfd) | zhipu/glm-4-flash | 0.82 | 0.58 | 1.23 | 27.94 | 5 |
| 13 | AI Tools (platform.aitools.cfd) | qwen/qwen2.5-7b | 0.92 | 0.38 | 2.17 | 90.28 | 5 |
| 14 | 云智API (yunzhiapi.cn) | DeepSeek-V3.2 | 0.94 | 0.59 | 2.00 | 0.00 | 5 |
| 15 | 素墨API (apifree.rensumo.top) | llama3.1-8B | 0.94 | 0.73 | 1.32 | 1421.44 | 10 |
| 16 | AI Tools (platform.aitools.cfd) | zhipu/glm-4-flash | 1.04 | 0.45 | 10.56 | 30.09 | 405 |
| 17 | XJY API (api.xinjianya.top) | grok-4.1-expert | 1.05 | 0.74 | 1.53 | 33.09 | 5 |
| 18 | 20230621.pp.ua | translate-model | 1.06 | 0.97 | 1.17 | 31767.17 | 5 |
| 19 | apifs.shubiaobiao.cn | claude-sonnet-4-6 | 1.07 | 0.86 | 1.74 | 46.71 | 5 |
| 20 | XJY API (api.xinjianya.top) | nvidia/nemotron-3-nano-30b-a3b | 1.15 | 0.66 | 2.81 | 246.87 | 5 |
| 21 | 20230621.pp.ua | translate-model | 1.17 | 0.91 | 1.39 | 169.06 | 5 |
| 22 | 素墨API (apifree.rensumo.top) | llama3.1-8b | 1.17 | 0.56 | 3.95 | 731.95 | 15 |
| 23 | 云智API (yunzhiapi.cn) | Mimo-v2-Flash | 1.21 | 0.32 | 18.38 | 46.11 | 75 |
| 24 | AI Tools (platform.aitools.cfd) | google/gemma-3-27b | 1.23 | 0.85 | 1.78 | 53.59 | 5 |
| 25 | 素墨API (apifree.rensumo.top) | 快速/llama3.1-8B | 1.24 | 0.73 | 2.23 | 1258.69 | 15 |
| 26 | api.123nhh.me | GPT-5.3 Codex | 1.25 | 0.85 | 1.60 | 43.66 | 5 |
| 27 | api.rnglg2.top:30000 | inception/mercury | 1.26 | 0.69 | 1.88 | 386.10 | 15 |
| 28 | 包子铺 (api.5202030.xyz) | claude-sonnet-4-5-20250929 | 1.27 | 1.15 | 1.68 | 125.92 | 5 |
| 29 | XJY API (api.xinjianya.top) | grok-4.1-fast | 1.37 | 1.13 | 1.59 | 99.38 | 5 |
| 30 | XJY API (api.xinjianya.top) | qwen/qwen3.5-397b-a17b | 1.56 | 0.59 | 4.76 | 43.19 | 5 |
| 31 | 云智API (yunzhiapi.cn) | Step-3.5-Flash | 1.57 | 0.26 | 10.42 | 45.74 | 20 |
| 32 | api.123nhh.me | GPT-5.3 Codex Spark | 1.67 | 1.47 | 2.11 | 45.83 | 5 |
| 33 | DashScope (coding.dashscope.aliyuncs.com) | qwen3-max-2026-01-23 | 1.69 | 0.92 | 2.67 | 33.19 | 5 |
| 34 | coding.dashscope.aliyuncs.com | qwen3-max-2026-01-23 | 1.69 | 0.92 | 2.67 | 33.19 | 5 |
| 35 | XJY API (api.xinjianya.top) | Kimi-K2.5 | 1.88 | 1.02 | 4.81 | 19.85 | 5 |
| 36 | newapi.kzwbelieve.top | claude-sonnet-4-6 | 1.94 | 1.51 | 2.42 | 48.66 | 5 |
| 37 | api.123nhh.me | GPT-5.2 | 1.95 | 1.12 | 5.05 | 45.37 | 5 |
| 38 | 并行科技 (llmapi.paratera.com) | MiniMax-M2.5 | 2.19 | 1.47 | 2.97 | 42.92 | 5 |
| 39 | coding.dashscope.aliyuncs.com | kimi-k2.5 | 2.32 | 0.91 | 7.17 | 24.44 | 5 |
| 40 | DashScope (coding.dashscope.aliyuncs.com) | kimi-k2.5 | 2.32 | 0.91 | 7.17 | 24.44 | 5 |
| 41 | 钠 API (us.naapi.cc) | mercury-2 | 2.50 | 1.07 | 4.62 | 1653.71 | 5 |
| 42 | 钠 API (us.naapi.cc) | mercury-2 | 2.50 | 1.07 | 4.62 | 1653.71 | 5 |
| 43 | optai.cap.1ktower.com | claude-opus-4-6 | 2.54 | 1.26 | 4.06 | 43.21 | 25 |
| 44 | ck67.top | claude-sonnet-4-6 | 3.07 | 2.47 | 4.35 | 37.66 | 5 |
| 45 | api.amethyst.ltd | qwen-3.5-plus | 3.10 | 2.54 | 3.91 | 55.05 | 5 |
| 46 | SWT-API (api.lhyb.dpdns.org) | gemini-3-pro | 3.36 | 2.06 | 5.04 | 14970.09 | 5 |
| 47 | 包子铺 (api.5202030.xyz) | grok-4 | 3.54 | 1.73 | 7.71 | 125.14 | 5 |
| 48 | api.modelarts-maas.com | deepseek-v3.2 | 3.56 | 1.60 | 6.78 | 25.48 | 5 |
| 49 | 云智API (yunzhiapi.cn) | Claude-Opus-4-6 | 3.65 | 3.18 | 4.31 | 39.01 | 5 |
| 50 | AI Tools (platform.aitools.cfd) | google/gemma-3-27b | 4.29 | 0.81 | 31.01 | 52.12 | 10 |
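
For readers who want to reproduce figures of this kind, the following is a minimal sketch of how per-test timings could be aggregated into the average, best, and worst first-token latency, the average tokens per second, and the test count shown in the table. The page does not document its measurement method, so everything here is an assumption: the `SpeedTest` record, its field names, and in particular the choice to measure throughput over the generation window only (from first token to last).

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SpeedTest:
    """One hypothetical speed-test sample; all field names are assumptions."""
    request_sent: float    # seconds, when the request was sent
    first_token_at: float  # seconds, when the first token arrived
    finished_at: float     # seconds, when the last token arrived
    output_tokens: int     # number of tokens in the response


def first_token_latency(test: SpeedTest) -> float:
    """Time from sending the request to receiving the first token."""
    return test.first_token_at - test.request_sent


def tokens_per_second(test: SpeedTest) -> float:
    """Throughput over the generation window (an assumed definition).

    A very short window yields a very large value, and a zero-length
    window is reported as 0.0 to avoid dividing by zero.
    """
    window = test.finished_at - test.first_token_at
    return test.output_tokens / window if window > 0 else 0.0


def summarize(tests: list[SpeedTest]) -> dict[str, float]:
    """Aggregate one endpoint/model pair into leaderboard-style columns."""
    latencies = [first_token_latency(t) for t in tests]
    return {
        "avg_latency": mean(latencies),
        "best": min(latencies),
        "worst": max(latencies),
        "avg_tps": mean(tokens_per_second(t) for t in tests),
        "total_tests": len(tests),
    }


def rank(results: dict[str, list[SpeedTest]]) -> list[tuple[str, dict[str, float]]]:
    """Sort entries by average first-token latency, lowest first."""
    rows = [(name, summarize(tests)) for name, tests in results.items() if tests]
    return sorted(rows, key=lambda row: row[1]["avg_latency"])


if __name__ == "__main__":
    # Made-up samples for illustration only; not taken from the leaderboard.
    samples = {
        "endpoint-a/model-x": [SpeedTest(0.0, 0.5, 2.5, 100), SpeedTest(0.0, 0.3, 1.3, 100)],
        "endpoint-b/model-y": [SpeedTest(0.0, 0.2, 1.2, 50)],
    }
    for position, (name, stats) in enumerate(rank(samples), start=1):
        print(position, name, stats)
```

Sorting on the average rather than the best case matches the ordering of the table above, where entries with a low best latency but a high worst latency still rank lower.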