Leaderboard
Model performance rankings based on speed test results. Compare models across different providers and endpoints.
Entries are ranked by average time to first token; lower is better for responsiveness.
| Rank | Provider | Model | First Token Latency | Avg tokens per second | Total Tests |
|---|---|---|---|---|---|
| 1 | jimmy | | 0.58 s (best: 0.42, worst: 1.18) | 86213.91 t/s | 10 |
| 2 | jimmy | | 0.59 s (best: 0.46, worst: 1.07) | 101506.95 t/s | 5 |
| 3 | AI Tools (platform.aitools.cfd) | zhipu/glm-4-flash | 0.86 s (best: 0.47, worst: 1.79) | 30.82 t/s | 30 |
| 4 | AI Tools (platform.aitools.cfd) | qwen/qwen2.5-7b | 0.92 s (best: 0.38, worst: 2.17) | 90.28 t/s | 5 |
| 5 | XJY API (api.xinjianya.top) | grok-4.1-expert | 1.05 s (best: 0.74, worst: 1.53) | 33.09 t/s | 5 |
| 6 | XJY API (api.xinjianya.top) | nvidia/nemotron-3-nano-30b-a3b | 1.15 s (best: 0.66, worst: 2.81) | 246.87 t/s | 5 |
| 7 | 云智API (yunzhiapi.cn) | Mimo-v2-Flash | 1.16 s (best: 0.32, worst: 18.38) | 0.00 t/s | 45 |
| 8 | XJY API (api.xinjianya.top) | grok-4.1-fast | 1.37 s (best: 1.13, worst: 1.59) | 99.38 t/s | 5 |
| 9 | api.amethyst.ltd | qwen-3.5-plus | 3.10 s (best: 2.54, worst: 3.91) | 55.05 t/s | 5 |
| 10 | XJY API (api.xinjianya.top) | grok-imagine-1.0-fast | 4.80 s (best: 3.11, worst: 8.20) | 4998.02 t/s | 15 |
| 11 | XJY API (api.xinjianya.top) | grok-4.1-mini | 7.00 s (best: 4.47, worst: 11.98) | 73.19 t/s | 5 |
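
Each row in the table reports an average first-token latency together with the best and worst values seen across its tests, and rows are ordered by that average. The page does not say how these figures are computed, so the following is only a minimal sketch of one plausible aggregation. It assumes the average is a plain arithmetic mean over per-test latencies in seconds. The names `LatencySummary`, `summarize`, and `rank`, and the sample measurements, are hypothetical and not taken from the leaderboard.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class LatencySummary:
    average: float  # mean first-token latency in seconds
    best: float     # lowest latency observed
    worst: float    # highest latency observed
    tests: int      # number of samples


def summarize(samples):
    """Collapse first-token latencies (seconds) from repeated tests into one row."""
    if not samples:
        raise ValueError("at least one sample is required")
    return LatencySummary(mean(samples), min(samples), max(samples), len(samples))


def rank(results):
    """Order (label, summary) pairs by average first-token latency, lowest first."""
    rows = [(label, summarize(samples)) for label, samples in results.items()]
    return sorted(rows, key=lambda row: row[1].average)


# Hypothetical measurements for illustration only (assumption, not leaderboard data).
example = {
    "model-a": [0.50, 0.70, 0.60],
    "model-b": [0.40, 0.45, 0.35],
}

for position, (label, summary) in enumerate(rank(example), start=1):
    print(
        f"{position} {label} {summary.average:.2f} s "
        f"(best: {summary.best:.2f}, worst: {summary.worst:.2f}, tests: {summary.tests})"
    )
```

A mean over a small number of tests is sensitive to a single slow response, which is why the best and worst values and the test count are worth reading alongside the average.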