Leaderboard
Multi-dimensional rankings based on model speed tests and provider health checks. Compare providers, endpoints, and reliability at a glance.
Throughput: average tokens generated per second (t/s). Higher is better for fast responses.
| Rank | Provider | Model | Throughput (t/s) | Best (t/s) | Worst (t/s) | Avg first token latency (s) | Total tests |
|---|---|---|---|---|---|---|---|
| 1 | | gemini-2.5-flash | 15203.32 | 18874.81 | 8423.33 | 17.11 | 5 |
| 2 | | gemini-2.0-flash | 12473.78 | 29607.98 | 3154.37 | 6.15 | 5 |
| 3 | | gemini-2.0-flash | 7751.02 | 17561.32 | 149.52 | 4.46 | 5 |
| 4 | | glm-4.6 | 2667.21 | 8180.16 | 20.04 | 6.91 | 10 |
| 5 | | qwen-3-235b-a22b-instruct-2507 | 758.45 | 1005.24 | 516.76 | 3.70 | 5 |
| 6 | | qwen-3-235b-a22b-instruct-2507 | 703.17 | 1007.43 | 443.12 | 0.60 | 15 |
| 7 | | qwen-3-coder-480b | 648.92 | 798.45 | 441.31 | 0.66 | 5 |
| 8 | | Qwen/Qwen3-235B | 583.68 | 753.41 | 372.98 | 0.71 | 5 |
| 9 | | Qwen/Qwen3-Coder-480B | 552.48 | 647.16 | 384.16 | 0.89 | 5 |
| 10 | | gemini-2.5-pro-search | 462.98 | 1859.60 | 86.39 | 17.54 | 5 |
| 11 | | qwen-3-coder-480b | 388.56 | 570.14 | 162.84 | 1.41 | 5 |
| 12 | | llama-4-scout-17b-16e-instruct | 334.77 | 503.37 | 232.80 | 0.84 | 5 |
| 13 | | gemini-flash-lite-latest | 326.55 | 398.13 | 252.01 | 0.95 | 5 |
| 14 | | gpt-oss-120b | 308.11 | 865.79 | 58.37 | 1.51 | 25 |
| 15 | | gpt-oss-20b | 262.98 | 281.90 | 246.24 | 2.49 | 10 |
| 16 | | THUDM/GLM-Z1-9B-0414 | 246.67 | 2713.72 | 0.00 | 11.06 | 15 |
| 17 | | gpt-oss:20b | 243.42 | 245.75 | 240.89 | 2.62 | 5 |
| 18 | | gpt-oss:20b | 243.42 | 245.75 | 240.89 | 2.62 | 5 |
| 19 | | gpt-oss:20b | 243.42 | 245.75 | 240.89 | 2.62 | 5 |
| 20 | | gemini-2.5-flash-lite | 236.71 | 336.23 | 186.47 | 2.61 | 5 |
| 21 | | MBZUAI-IFM/K2-Think | 227.48 | 254.54 | 209.92 | 2.43 | 5 |
| 22 | | MBZUAI-IFM/K2-Think-nothink | 222.17 | 253.25 | 203.36 | 1.80 | 5 |
| 23 | | moonshotai/kimi-k2-instruct-0905 | 217.81 | 257.62 | 186.94 | 0.95 | 5 |
| 24 | | moonshotai/kimi-k2-instruct-0905 | 216.36 | 329.19 | 49.98 | 0.89 | 30 |
| 25 | | gpt-oss-120b | 215.89 | 244.05 | 120.03 | 1.37 | 5 |
| 26 | | ministral-3b-2410 | 210.88 | 236.82 | 192.20 | 0.41 | 5 |
| 27 | | ministral-3b-2410 | 210.88 | 236.82 | 192.20 | 0.41 | 5 |
| 28 | | Qwen/Qwen3-235B | 194.31 | 625.84 | 39.48 | 1.96 | 20 |
| 29 | | qwen/qwen3-next-80b-a3b-instruct | 177.68 | 248.70 | 110.24 | 1.31 | 5 |
| 30 | | gpt-5-nano | 165.65 | 216.05 | 104.15 | 9.86 | 5 |
| 31 | | qwen/qwen3-next-80b-a3b-instruct | 158.38 | 166.10 | 145.23 | 0.50 | 5 |
| 32 | | qwen3-next-80b-a3b-instruct | 153.69 | 178.83 | 138.32 | 3.99 | 5 |
| 33 | | gemini-2.0-flash | 153.26 | 178.38 | 130.30 | 0.74 | 5 |
| 34 | | paid-gemini-2.5-flash-preview-05-20 | 147.89 | 188.26 | 120.76 | 14.02 | 5 |
| 35 | | qwen3-vl-flash | 147.58 | 149.70 | 145.32 | 1.00 | 5 |
| 36 | | gpt-4.1-nano-2025-04-14 | 144.60 | 173.91 | 112.84 | 1.37 | 5 |
| 37 | | Qwen/Qwen3-VL-8B-Instruct | 142.73 | 145.58 | 139.55 | 0.64 | 5 |
| 38 | | gpt-5-mini | 141.93 | 175.53 | 115.84 | 5.36 | 5 |
| 39 | | Qwen/Qwen3-Next-80B-A3B-Instruct | 139.29 | 167.01 | 100.29 | 0.95 | 5 |
| 40 | | LongCat-Flash-Thinking | 137.67 | 228.79 | 92.00 | 8.26 | 5 |
| 41 | | gemini-2.5-pro-preview-06-05-search | 133.61 | 164.48 | 104.05 | 10.11 | 5 |
| 42 | | hunyuan-lite | 133.29 | 141.70 | 122.55 | 0.99 | 5 |
| 43 | | deepseek-ai/DeepSeek-V3.1 | 118.15 | 160.78 | 48.82 | 1.73 | 5 |
| 44 | | gemini-2.5-flash | 118.00 | 143.13 | 84.63 | 13.83 | 5 |
| 45 | | deepseek-ai/deepseek-r1 | 117.81 | 139.55 | 95.60 | 7.08 | 5 |
| 46 | | kimi-k2-turbo-preview | 113.22 | 120.26 | 110.13 | 1.32 | 5 |
| 47 | | gemini-2.5-pro-preview-06-05-search | 109.16 | 127.68 | 84.71 | 10.38 | 5 |
| 48 | | gpt-5-mini | 102.11 | 169.01 | 53.95 | 7.39 | 5 |
| 49 | | gemini-2.5-pro-preview-06-05-search | 101.33 | 112.94 | 85.05 | 13.62 | 5 |
| 50 | | gemini-2.5-pro-search | 101.30 | 130.71 | 80.72 | 16.92 | 5 |
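
For readers who want to reproduce figures like the ones in this table from their own speed tests, the following is a minimal sketch, not the leaderboard's actual method. It assumes throughput is measured over the generation window (first token to last token), which the page does not confirm; measuring from the moment the request is sent would give lower numbers. The `SpeedTest` record, its field names, and the `summarize` helper are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SpeedTest:
    """One speed-test run (hypothetical record layout, not the site's schema)."""
    endpoint: str           # label for a provider/model pair
    request_sent: float     # seconds, e.g. from time.monotonic()
    first_token_at: float   # seconds, when the first token arrived
    finished_at: float      # seconds, when the last token arrived
    completion_tokens: int  # number of tokens generated


def throughput(t: SpeedTest) -> float:
    """Tokens per second over the generation window (an assumed definition)."""
    window = t.finished_at - t.first_token_at
    # An empty window is scored as 0.0 here; that is a choice of this sketch.
    return t.completion_tokens / window if window > 0 else 0.0


def first_token_latency(t: SpeedTest) -> float:
    """Seconds between sending the request and receiving the first token."""
    return t.first_token_at - t.request_sent


def summarize(tests: list[SpeedTest]) -> list[dict]:
    """Aggregate runs per endpoint and rank by average throughput, highest first."""
    by_endpoint: dict[str, list[SpeedTest]] = {}
    for t in tests:
        by_endpoint.setdefault(t.endpoint, []).append(t)

    rows = []
    for endpoint, runs in by_endpoint.items():
        rates = [throughput(r) for r in runs]
        rows.append({
            "endpoint": endpoint,
            "throughput": mean(rates),
            "best": max(rates),
            "worst": min(rates),
            "avg_first_token_latency": mean(first_token_latency(r) for r in runs),
            "total_tests": len(runs),
        })
    rows.sort(key=lambda row: row["throughput"], reverse=True)
    return rows
```

Under this definition a run that delivers most of its tokens in one late burst has a very short generation window, so its throughput can look high even when its first-token latency is long. Reading the two columns together gives a fairer picture than either one alone.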