A unified API gateway providing access to multiple large language models and AI services through standardized endpoints.

| Model | Speed | Latency | Tests |
|---|---|---|---|
| allam-2-7b | 337.15 t/s | 0.27s | 5 |
| gemini-2.5-flash-lite-ts | 214.83 t/s | 1.17s | 5 |
| lgai/exaone-3-5-32b-instruct | 100.84 t/s | 1.08s | 5 |
| Qwen/Qwen3-32B-FP8 | 75.12 t/s | 1.01s | 5 |

| Time | Model | Speed | Latency |
|---|---|---|---|
| Aug 9, 08:16 AM | gemini-2.5-flash-lite-ts | 214.83 t/s | 1.17s |
| Aug 9, 08:14 AM | lgai/exaone-3-5-32b-instruct | 100.84 t/s | 1.08s |
| Aug 9, 08:11 AM | Qwen/Qwen3-32B-FP8 | 75.12 t/s | 1.01s |
| Aug 9, 08:10 AM | allam-2-7b | 337.15 t/s | 0.27s |
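
The speed and latency columns can be combined into a rough end-to-end estimate for a response of a given length. The sketch below assumes that latency is the delay before the first token arrives and that speed is the sustained generation rate afterwards; the tables do not state either definition, so treat both as assumptions. The names `BENCHMARKS`, `estimated_seconds`, and `rank_models` are hypothetical helpers for illustration and are not part of the gateway.

```python
# Figures copied from the tables above: (tokens per second, latency in seconds).
BENCHMARKS = {
    "allam-2-7b": (337.15, 0.27),
    "gemini-2.5-flash-lite-ts": (214.83, 1.17),
    "lgai/exaone-3-5-32b-instruct": (100.84, 1.08),
    "Qwen/Qwen3-32B-FP8": (75.12, 1.01),
}


def estimated_seconds(model: str, output_tokens: int) -> float:
    """Estimate total response time under the assumed model:
    total = latency + output_tokens / speed."""
    speed, latency = BENCHMARKS[model]
    return latency + output_tokens / speed


def rank_models(output_tokens: int) -> list[tuple[str, float]]:
    """Return (model, estimated seconds) pairs, fastest first."""
    estimates = [(m, estimated_seconds(m, output_tokens)) for m in BENCHMARKS]
    return sorted(estimates, key=lambda pair: pair[1])


if __name__ == "__main__":
    for tokens in (10, 500):
        print(f"{tokens} output tokens:")
        for model, seconds in rank_models(tokens):
            print(f"  {model}: {seconds:.2f}s")
```

Under these assumptions, latency dominates short responses and throughput dominates long ones. At 10 output tokens, Qwen/Qwen3-32B-FP8 is estimated ahead of gemini-2.5-flash-lite-ts, while at 500 tokens the order reverses. Each figure comes from only 5 tests, so the estimates are indicative at best.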