Fireworks AI provides a cloud platform for running and fine-tuning open-source AI models with optimized inference for production applications.

| Model | Speed | Latency | Tests |
|---|---|---|---|
| accounts/fireworks/models/gpt-oss-20b | 359.18 t/s | 1.14s | 10 |
| accounts/fireworks/models/minimax-m2p1 | 185.81 t/s | 1.91s | 10 |
| accounts/fireworks/models/gpt-oss-120b | 144.71 t/s | 0.79s | 10 |
| accounts/fireworks/models/kimi-k2-thinking | 127.75 t/s | 3.47s | 5 |
| accounts/fireworks/models/deepseek-v3p2 | 114.61 t/s | 0.49s | 5 |
| accounts/fireworks/models/deepseek-v3p1-terminus | 112.67 t/s | 0.89s | 5 |
| accounts/fireworks/models/glm-4p7 | 105.59 t/s | 10.50s | 20 |
| accounts/fireworks/models/qwen3-235b-a22b-instruct-2507 | 78.64 t/s | 0.62s | 5 |
| accounts/fireworks/models/glm-4p6 | 69.26 t/s | 10.03s | 5 |
| accounts/fireworks/models/qwen3-vl-235b-a22b-thinking | 51.54 t/s | 1.25s | 5 |

| Time | Model | Speed | Latency |
|---|---|---|---|
| Jan 13, 04:45 PM | accounts/fireworks/models/glm-4p7 | 121.26 t/s | 10.59s |
| Jan 13, 04:39 PM | accounts/fireworks/models/glm-4p7 | 120.41 t/s | 8.25s |
| Jan 13, 04:22 PM | accounts/fireworks/models/glm-4p7 | 82.21 t/s | 11.73s |
| Jan 13, 02:55 AM | accounts/fireworks/models/gpt-oss-120b | 136.18 t/s | 0.85s |
| Jan 13, 01:51 AM | accounts/fireworks/models/gpt-oss-120b | 153.24 t/s | 0.74s |
| Jan 13, 01:23 AM | accounts/fireworks/models/minimax-m2p1 | 192.15 t/s | 2.12s |
| Jan 13, 01:22 AM | accounts/fireworks/models/kimi-k2-thinking | 127.75 t/s | 3.47s |
| Jan 13, 01:21 AM | accounts/fireworks/models/glm-4p6 | 69.26 t/s | 10.03s |
| Jan 13, 01:17 AM | accounts/fireworks/models/deepseek-v3p2 | 114.61 t/s | 0.49s |
| Jan 13, 01:15 AM | accounts/fireworks/models/qwen3-vl-235b-a22b-thinking | 51.54 t/s | 1.25s |
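
The per-run rows in the log can be rolled up into per-model figures like those in the first table. The sketch below assumes the summary is a plain arithmetic mean of run-level values, which the page does not state. The `RUNS` list and the `summarize` helper are illustrative names, and only the `gpt-oss-120b` and `glm-4p7` rows are copied in.

```python
from collections import defaultdict
from statistics import mean

# Run-level rows copied from the log table: (model, speed in t/s, latency in s).
RUNS = [
    ("accounts/fireworks/models/glm-4p7", 121.26, 10.59),
    ("accounts/fireworks/models/glm-4p7", 120.41, 8.25),
    ("accounts/fireworks/models/glm-4p7", 82.21, 11.73),
    ("accounts/fireworks/models/gpt-oss-120b", 136.18, 0.85),
    ("accounts/fireworks/models/gpt-oss-120b", 153.24, 0.74),
]


def summarize(runs):
    """Group runs by model and return {model: (mean speed, mean latency, run count)}.

    Assumption: a simple unweighted mean; the source does not say how its
    summary table is actually computed.
    """
    grouped = defaultdict(list)
    for model, speed, latency in runs:
        grouped[model].append((speed, latency))
    return {
        model: (
            mean(speed for speed, _ in rows),
            mean(latency for _, latency in rows),
            len(rows),
        )
        for model, rows in grouped.items()
    }


if __name__ == "__main__":
    for model, (speed, latency, count) in summarize(RUNS).items():
        print(f"{model}: {speed:.2f} t/s, {latency:.2f}s over {count} runs")
```

Under this assumption, the two `gpt-oss-120b` runs average to 144.71 t/s, which matches that model's summary row. The three `glm-4p7` runs average to about 107.96 t/s rather than the 105.59 t/s reported over 20 tests, which is consistent with the log showing only the most recent runs.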