| Model | Speed | Latency | Tests |
|---|---|---|---|
| nvidia/nemotron-3-super-120b-a12b | 141869.96 t/s | 29.75s | 5 |
| openai/gpt-oss-20b | 239.61 t/s | 10.88s | 5 |
| openai/gpt-oss-120b | 176.64 t/s | 6.57s | 60 |
| qwen/qwen3-next-80b-a3b-instruct | 118.96 t/s | 0.58s | 10 |
| qwen/qwen3-next-80b-a3b-thinking | 114.52 t/s | 10.51s | 10 |
| meta/llama-4-maverick-17b-128e-instruct | 100.41 t/s | 0.21s | 15 |
| mistralai/mixtral-8x22b-instruct-v0.1 | 89.66 t/s | 0.22s | 5 |
| deepseek-ai/deepseek-r1 | 87.63 t/s | 8.96s | 15 |
| minimaxai/minimax-m2.1 | 86.36 t/s | 2.88s | 30 |
| marin/marin-8b-instruct | 84.25 t/s | 0.44s | 5 |
| stepfun-ai/step-3.5-flash | 81.82 t/s | 4.79s | 10 |
| microsoft/phi-4-mini-flash-reasoning | 74.19 t/s | 0.46s | 5 |
| moonshotai/kimi-k2.5 | 70.81 t/s | 8.11s | 10 |
| minimaxai/minimax-m2.5 | 65.62 t/s | 4.40s | 25 |
| nvidia/llama-3.3-nemotron-super-49b-v1.5 | 57.55 t/s | 11.11s | 10 |
| google/gemma-3-27b-it | 57.40 t/s | 0.20s | 10 |
| ai21labs/jamba-1.5-large-instruct | 55.60 t/s | 0.29s | 10 |
| stockmark/stockmark-2-100b-instruct | 55.31 t/s | 0.74s | 5 |

| Time | Model | Speed | Latency |
|---|---|---|---|
| Apr 11, 02:08 AM | moonshotai/kimi-k2.5 | 69.24 t/s | 8.16s |
| Apr 11, 02:01 AM | z-ai/glm5 | 28.19 t/s | 14.62s |
| Apr 10, 12:07 PM | qwen/qwen3-coder-480b-a35b-instruct | 52.06 t/s | 1.09s |
| Apr 10, 12:06 PM | moonshotai/kimi-k2.5 | 72.37 t/s | 8.06s |
| Apr 10, 11:58 AM | minimaxai/minimax-m2.5 | 62.31 t/s | 2.44s |
| Apr 8, 05:11 PM | moonshotai/kimi-k2-thinking | 30.41 t/s | 20.72s |
| Apr 8, 04:12 PM | moonshotai/kimi-k2-thinking | 25.84 t/s | 29.77s |
| Apr 7, 06:19 AM | minimaxai/minimax-m2.5 | 57.87 t/s | 5.92s |
| Apr 7, 06:14 AM | qwen/qwen3.5-397b-a17b | 27.27 t/s | 3.21s |
| Apr 7, 06:08 AM | moonshotai/kimi-k2-thinking | 71.21 t/s | 7.92s |