A unified LLM API gateway providing access to multiple AI models through standardized endpoints.

| Model | Speed | Latency | Tests |
|---|---|---|---|
| qwen-3-235b-a22b-instruct-2507 | 449.95 t/s | 1.47s | 5 |
| qwen-3-coder-480b | 388.56 t/s | 1.41s | 5 |
| llama-4-scout-17b-16e-instruct | 334.77 t/s | 0.84s | 5 |
| gpt-oss-120b | 306.96 t/s | 1.64s | 5 |
| deepseek-v3.1:671b | 72.08 t/s | 1.88s | 5 |

| Time | Model | Speed | Latency |
|---|---|---|---|
| Nov 25, 12:02 PM | deepseek-v3.1:671b | 72.08 t/s | 1.88s |
| Nov 25, 12:01 PM | gpt-oss-120b | 306.96 t/s | 1.64s |
| Nov 25, 12:00 PM | qwen-3-235b-a22b-instruct-2507 | 449.95 t/s | 1.47s |
| Oct 31, 11:01 AM | llama-4-scout-17b-16e-instruct | 334.77 t/s | 0.84s |
| Oct 31, 11:00 AM | qwen-3-coder-480b | 388.56 t/s | 1.41s |
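
The speed and latency columns can be combined into a rough end-to-end estimate for a response of a given length. The sketch below is illustrative only: it assumes that latency is the delay before the first token and that speed is sustained output throughput, neither of which the tables state explicitly. The names `Benchmark`, `estimated_response_seconds`, and `rank_by_response_time` are hypothetical helpers, not part of the gateway.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Benchmark:
    model: str
    tokens_per_second: float  # "Speed" column
    latency_seconds: float    # "Latency" column


# Figures copied from the benchmark table above.
BENCHMARKS = [
    Benchmark("qwen-3-235b-a22b-instruct-2507", 449.95, 1.47),
    Benchmark("qwen-3-coder-480b", 388.56, 1.41),
    Benchmark("llama-4-scout-17b-16e-instruct", 334.77, 0.84),
    Benchmark("gpt-oss-120b", 306.96, 1.64),
    Benchmark("deepseek-v3.1:671b", 72.08, 1.88),
]


def estimated_response_seconds(benchmark: Benchmark, output_tokens: int) -> float:
    """Estimate total time as latency plus generation time.

    Assumption: latency is time to first token and speed is constant
    during generation. Real behavior may differ.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return benchmark.latency_seconds + output_tokens / benchmark.tokens_per_second


def rank_by_response_time(benchmarks: list[Benchmark], output_tokens: int) -> list[Benchmark]:
    """Return the benchmarks sorted from fastest to slowest estimated response."""
    return sorted(
        benchmarks,
        key=lambda b: estimated_response_seconds(b, output_tokens),
    )


if __name__ == "__main__":
    for tokens in (100, 1000):
        print(f"Estimated time for {tokens} output tokens:")
        for b in rank_by_response_time(BENCHMARKS, tokens):
            print(f"  {b.model}: {estimated_response_seconds(b, tokens):.2f}s")
```

Under these assumptions, the ranking depends on response length: for short outputs the lowest-latency model comes out ahead, while for longer outputs the highest-throughput model does.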