Leaderboard
Multi-dimensional rankings based on model speed tests and provider health checks. Compare providers, endpoints, and reliability at a glance.
Ranked by average time to first token. Lower is better for responsiveness.
| Rank | Provider | Model | Avg First Token Latency (s) | Best (s) | Worst (s) | Avg tokens per second (t/s) | Total Tests |
|---|---|---|---|---|---|---|---|
| 1 | | google/gemini-2.0-flash-exp | 0.08 | - | 1.82 | 8.85 | 60 |
| 2 | | meta-llama/Llama-3.3-70B-Instruct | 0.26 | 0.24 | 0.28 | 416.19 | 5 |
| 3 | | meta-llama/Llama-3.3-70B-Instruct | 0.26 | 0.24 | 0.28 | 416.19 | 5 |
| 4 | | mistral-small-latest | 0.33 | 0.30 | 0.37 | 74.79 | 5 |
| 5 | | mistral-small-latest | 0.33 | 0.30 | 0.37 | 74.79 | 5 |
| 6 | | mistral-medium | 0.35 | 0.32 | 0.37 | 99.70 | 5 |
| 7 | | mistral-medium | 0.35 | 0.32 | 0.37 | 99.70 | 5 |
| 8 | | mistral-medium-latest | 0.37 | 0.34 | 0.43 | 101.94 | 5 |
| 9 | | mistral-medium-latest | 0.37 | 0.34 | 0.43 | 101.94 | 5 |
| 10 | | mistral-tiny-latest | 0.38 | 0.36 | 0.42 | 155.43 | 5 |
| 11 | | mistral-tiny-latest | 0.38 | 0.36 | 0.42 | 155.43 | 5 |
| 12 | | magistral-small-latest | 0.39 | 0.33 | 0.56 | 189.91 | 5 |
| 13 | | magistral-small-latest | 0.39 | 0.33 | 0.56 | 189.91 | 5 |
| 14 | | open-mistral-nemo | 0.40 | 0.34 | 0.46 | 200.44 | 5 |
| 15 | | open-mistral-nemo | 0.40 | 0.34 | 0.46 | 200.44 | 5 |
| 16 | | qwen3-30b-a3b | 0.45 | 0.39 | 0.62 | 114.09 | 5 |
| 17 | | qwen3-30b-a3b | 0.45 | 0.39 | 0.62 | 114.09 | 5 |
| 18 | | qwen3-30b-a3b | 0.45 | 0.39 | 0.62 | 114.09 | 5 |
| 19 | | test | 0.46 | 0.29 | 0.70 | 583.14 | 5 |
| 20 | | ministral-14b-latest | 0.46 | 0.31 | 0.66 | 129.12 | 5 |
| 21 | | ministral-14b-latest | 0.46 | 0.31 | 0.66 | 129.12 | 5 |
| 22 | | accounts/fireworks/models/deepseek-v3p2 | 0.49 | 0.45 | 0.62 | 114.61 | 5 |
| 23 | | ministral-3b-2410 | 0.50 | 0.30 | 0.70 | 332.18 | 5 |
| 24 | | ministral-3b-2410 | 0.50 | 0.30 | 0.70 | 332.18 | 5 |
| 25 | | qwen-flash | 0.51 | 0.36 | 0.91 | 142.89 | 5 |
| 26 | | gemini-2.5-flash-lite | 0.52 | 0.44 | 0.64 | 297.64 | 5 |
| 27 | | gemini-2.0-flash | 0.56 | 0.50 | 0.63 | 175.01 | 5 |
| 28 | | qwen-vl-plus-2025-07-10 | 0.56 | 0.37 | 1.20 | 112.44 | 65 |
| 29 | | qwen3-vl-flash | 0.56 | 0.49 | 0.66 | 85.94 | 5 |
| 30 | | qwen3-235b-fp8 | 0.56 | 0.27 | 1.24 | 94.11 | 5 |
| 31 | | qwen-plus-2025-04-28 | 0.57 | 0.41 | 1.06 | 43.49 | 10 |
| 32 | | tongyi-intent-detect-v3 | 0.57 | 0.43 | 1.05 | 83.35 | 5 |
| 33 | | mistral-large-2411 | 0.58 | 0.38 | 1.31 | 45.26 | 5 |
| 34 | | mistral-large-2411 | 0.58 | 0.38 | 1.31 | 45.26 | 5 |
| 35 | | qwen-flash | 0.63 | 0.44 | 1.10 | 128.54 | 5 |
| 36 | | gemini-2.0-flash-lite | 0.64 | 0.33 | 2.46 | 135.07 | 10 |
| 37 | | tongyi-xiaomi-analysis-pro | 0.64 | 0.46 | 1.12 | 65.76 | 5 |
| 38 | | qwen-plus | 0.66 | 0.46 | 1.07 | 57.34 | 5 |
| 39 | | qwen-plus-2025-12-01 | 0.67 | 0.49 | 1.17 | 56.15 | 5 |
| 40 | | qwen-plus-latest | 0.69 | 0.52 | 1.16 | 57.95 | 5 |
| 41 | | deepseek-v3.2 | 0.69 | 0.52 | 1.36 | 34.40 | 10 |
| 42 | | Qwen/Qwen3-235B-A22B-Instruct-2507 | 0.71 | 0.60 | 0.89 | 15.27 | 5 |
| 43 | | Qwen/Qwen3-Coder-30B-A3B-Instruct | 0.78 | 0.57 | 1.27 | 38.23 | 5 |
| 44 | | qwen-max-latest | 0.79 | 0.48 | 1.47 | 37.07 | 5 |
| 45 | | accounts/fireworks/models/gpt-oss-120b | 0.79 | 0.60 | 1.00 | 144.71 | 10 |
| 46 | | qwen/qwen2.5-7b | 0.80 | 0.63 | 1.26 | 96.94 | 5 |
| 47 | | qwen-plus-2025-12-01 | 0.81 | 0.65 | 1.05 | 53.73 | 5 |
| 48 | | gemini-2.0-flash | 0.86 | 0.61 | 1.87 | 140.17 | 15 |
| 49 | | deepseek | 0.92 | 0.50 | 2.06 | 63.03 | 35 |
| 50 | | deepseek | 0.92 | 0.50 | 2.06 | 63.03 | 35 |
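
For readers who want to reproduce figures like these from their own measurements, the following is a minimal sketch of how the average, best, and worst first-token latency, the average tokens per second, and the ranking could be computed. The `SpeedTest` record, its field names, and the choice to measure throughput over the generation phase only are assumptions made for illustration; the page does not state how the leaderboard computes its numbers.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SpeedTest:
    """One hypothetical speed-test sample (field names are assumptions)."""
    first_token_s: float  # seconds from request start to the first token
    total_s: float        # seconds from request start to the last token
    output_tokens: int    # number of tokens generated in the response


def summarize(samples):
    """Aggregate samples into leaderboard-style columns (assumed definitions)."""
    latencies = [s.first_token_s for s in samples]
    # Assumption: throughput is tokens divided by generation time only,
    # i.e. the time after the first token arrived.
    rates = [s.output_tokens / (s.total_s - s.first_token_s) for s in samples]
    return {
        "avg_ttft": round(mean(latencies), 2),
        "best_ttft": min(latencies),
        "worst_ttft": max(latencies),
        "avg_tps": round(mean(rates), 2),
        "total_tests": len(samples),
    }


def rank(summaries):
    """Order {name: summary} by average first-token latency, lowest first."""
    ordered = sorted(summaries.items(), key=lambda kv: kv[1]["avg_ttft"])
    return [(i + 1, name, summary) for i, (name, summary) in enumerate(ordered)]
```

Calling `rank({"model-a": summarize(samples_a), "model-b": summarize(samples_b)})` with two hypothetical sample lists yields `(rank, name, summary)` tuples in the same lowest-latency-first order the table uses.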