Leaderboard
Model performance rankings based on speed test results. Compare models across different providers and endpoints.
Models are ranked by throughput: the average number of tokens generated per second (t/s). Higher is better for fast responses.
| Rank | Model | Avg throughput (t/s) | Best (t/s) | Worst (t/s) | Avg first token latency (s) | Total tests |
|---|---|---|---|---|---|---|
| 1 | gemini-2.5-flash | 104600.84 | 115213.03 | 81275.31 | 19.21 | 5 |
| 2 | jimmy | 101506.95 | 145658.50 | 13204.57 | 0.59 | 5 |
| 3 | jimmy | 86213.91 | 138352.88 | 42053.25 | 0.58 | 10 |
| 4 | llama3.1-8B | 38588.02 | 148734.34 | 4954.13 | 0.52 | 25 |
| 5 | llama3.1-8B | 38588.02 | 148734.34 | 4954.13 | 0.52 | 25 |
| 6 | llama3.1-8B | 37444.09 | 73468.61 | 4399.22 | 0.39 | 20 |
| 7 | llama3.1-8B | 37444.09 | 73468.61 | 4399.22 | 0.39 | 20 |
| 8 | translate-model | 31767.17 | 48227.39 | 13109.67 | 1.06 | 5 |
| 9 | gemini-3-pro | 14970.09 | 17228.53 | 9361.83 | 3.36 | 5 |
| 10 | [hy-z专线][价格:0.03]假流式/gemini-3-flash-preview | 13458.91 | 20185.37 | 10206.45 | 11.56 | 5 |
| 11 | echo | 6152.15 | 22934.12 | 506.54 | 0.75 | 15 |
| 12 | DeepSeek-V3.2 | 5418.46 | 7619.40 | 18.85 | 1.39 | 15 |
| 13 | grok-imagine-1.0-fast | 4998.02 | 7933.91 | 1462.69 | 4.80 | 15 |
| 14 | gemini-2.5-pro | 3661.62 | 42899.94 | 59.79 | 25.24 | 25 |
| 15 | gpt-oss-120b | 1796.31 | 2033.04 | 1592.31 | 0.49 | 5 |
| 16 | gpt-oss-120b | 1677.82 | 2010.50 | 1370.01 | 0.56 | 10 |
| 17 | gpt-oss-120b | 1671.10 | 2122.88 | 1222.85 | 0.72 | 20 |
| 18 | gpt-oss-120b | 1671.10 | 2122.88 | 1222.85 | 0.72 | 20 |
| 19 | mercury-2 | 1653.71 | 6228.40 | 371.89 | 2.50 | 5 |
| 20 | mercury-2 | 1653.71 | 6228.40 | 371.89 | 2.50 | 5 |
| 21 | gpt-oss-120b | 1637.28 | 1932.26 | 1351.73 | 0.91 | 10 |
| 22 | gpt-oss-120b | 1637.28 | 1932.26 | 1351.73 | 0.91 | 10 |
| 23 | gpt-oss-120b | 1467.36 | 1785.19 | 1053.27 | 0.82 | 5 |
| 24 | llama3.1-8B | 1421.44 | 2829.11 | 100.62 | 0.94 | 10 |
| 25 | 快速/llama3.1-8B | 1258.69 | 2172.93 | 595.62 | 1.24 | 15 |
| 26 | gpt-oss-120b | 970.77 | 1257.59 | 429.54 | 0.94 | 5 |
| 27 | llama3.1-8b | 731.95 | 1144.52 | 67.76 | 1.17 | 15 |
| 28 | echo-flash | 685.41 | 869.81 | 400.49 | 0.08 | 5 |
| 29 | gemini-3.1-pro-preview | 664.77 | 2782.93 | 79.68 | 25.06 | 5 |
| 30 | gemini-2.5-flash | 648.86 | 4724.09 | 37.55 | 10.36 | 10 |
| 31 | gpt-5 | 519.69 | 719.34 | 161.73 | 12.99 | 5 |
| 32 | groq/compound-mini | 477.57 | 493.59 | 465.42 | 0.89 | 5 |
| 33 | gemini-3.1-pro-preview | 462.04 | 1827.55 | 82.45 | 23.27 | 5 |
| 34 | glm-4.7 | 454.62 | 735.18 | 237.51 | 2.87 | 10 |
| 35 | qwen2.5:1.5b | 435.32 | 480.86 | 273.41 | 1.63 | 5 |
| 36 | qwen2.5:1.5b | 435.32 | 480.86 | 273.41 | 1.63 | 5 |
| 37 | qwen2.5:1.5b | 435.32 | 480.86 | 273.41 | 1.63 | 5 |
| 38 | glm-4.7-特别版 | 428.83 | 542.27 | 347.54 | 3.12 | 5 |
| 39 | gemini-2.5-flash-lite | 410.62 | 477.12 | 272.30 | 2.28 | 5 |
| 40 | tab_flash_lite_preview | 396.29 | 588.50 | 268.05 | 1.41 | 5 |
| 41 | inception/mercury | 386.10 | 525.30 | 123.95 | 1.26 | 15 |
| 42 | gpt-oss-20b:free | 339.45 | 571.58 | 129.16 | 2.51 | 5 |
| 43 | gemini-3.0-flash | 336.24 | 813.64 | 4.20 | 12.65 | 5 |
| 44 | 翻译/glm-4.7 | 308.58 | 391.37 | 194.72 | 3.72 | 5 |
| 45 | [V]gemini-2.5-flash-lite | 301.65 | 359.28 | 270.39 | 1.25 | 5 |
| 46 | ant_gemini-2.5-flash-lite | 290.47 | 331.25 | 255.80 | 2.32 | 5 |
| 47 | gemini-2.5-flash-lite | 284.77 | 335.73 | 253.26 | 1.37 | 5 |
| 48 | 流式抗截断/gemini-2.5-pro | 281.87 | 480.57 | 142.19 | 15.62 | 5 |
| 49 | gemini-2.5-flash | 274.40 | 416.35 | 211.94 | 8.21 | 5 |
| 50 | 翻译/标题/OCR模型 | 260.68 | 316.11 | 238.96 | 0.91 | 5 |
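
The page does not say how each figure is computed, so the following is only a minimal sketch of how the average, best, and worst throughput and the average first token latency of one table row could be derived from individual test runs. It assumes throughput means output tokens divided by the time between the first and last token. The `TestRun` record and all field and function names are hypothetical, not the site's actual schema.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TestRun:
    """One speed test (hypothetical record layout, not the site's schema)."""
    request_sent: float    # seconds: when the request was sent
    first_token_at: float  # seconds: when the first token arrived
    last_token_at: float   # seconds: when the last token arrived
    output_tokens: int     # number of tokens in the response


def throughput(run: TestRun) -> float:
    """Tokens per second over the generation window (assumed definition)."""
    window = run.last_token_at - run.first_token_at
    if window <= 0:
        raise ValueError("generation window must be positive")
    return run.output_tokens / window


def first_token_latency(run: TestRun) -> float:
    """Seconds between sending the request and receiving the first token."""
    return run.first_token_at - run.request_sent


def summarize(runs: list[TestRun]) -> dict[str, float]:
    """Aggregate several runs into the figures shown in one table row."""
    rates = [throughput(r) for r in runs]
    return {
        "avg_tps": mean(rates),
        "best_tps": max(rates),
        "worst_tps": min(rates),
        "avg_first_token_latency_s": mean(first_token_latency(r) for r in runs),
        "total_tests": len(runs),
    }


# Two made-up runs, for illustration only.
runs = [
    TestRun(request_sent=0.0, first_token_at=0.5, last_token_at=2.5, output_tokens=400),
    TestRun(request_sent=0.0, first_token_at=0.25, last_token_at=2.25, output_tokens=300),
]
summary = summarize(runs)
print(summary)
```

Under this assumed definition, a response that arrives almost all at once after a long wait has a very short generation window, so it would show both a high first token latency and a very high throughput.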