A unified LLM API gateway providing access to multiple AI models through standardized endpoints.

| Model | Speed | Latency | Tests |
|---|---|---|---|
| llama3.1-8B | 14956.18 t/s | 1.60s | 15 |
| qwen3.5-9b | 200.97 t/s | 6.81s | 5 |
| qwen3.5-27b | 73.93 t/s | 13.78s | 5 |

| Time | Model | Speed | Latency |
|---|---|---|---|
| Mar 30, 06:50 PM | llama3.1-8B | 14203.74 t/s | 0.56s |
| Mar 30, 06:49 PM | llama3.1-8B | 15075.90 t/s | 3.67s |
| Mar 30, 06:46 PM | qwen3.5-27b | 73.93 t/s | 13.78s |
| Mar 30, 06:44 PM | qwen3.5-9b | 200.97 t/s | 6.81s |
| Mar 30, 06:44 PM | llama3.1-8B | 15588.89 t/s | 0.57s |
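The per-model summary figures appear to be plain arithmetic means of the individual runs listed above. The sketch below checks that reading against the listed rows. The `runs` list and the `summarize` helper are illustrative names, not part of the gateway, and the averaging rule is an assumption inferred from the numbers rather than documented behavior.

```python
from collections import defaultdict
from statistics import mean

# Rows copied from the per-run table: (model, speed in tokens/s, latency in seconds).
runs = [
    ("llama3.1-8B", 14203.74, 0.56),
    ("llama3.1-8B", 15075.90, 3.67),
    ("qwen3.5-27b", 73.93, 13.78),
    ("qwen3.5-9b", 200.97, 6.81),
    ("llama3.1-8B", 15588.89, 0.57),
]


def summarize(rows):
    """Group runs by model and return {model: (mean speed, mean latency)}.

    Assumption: the summary is an unweighted mean, rounded to 2 decimals.
    """
    grouped = defaultdict(list)
    for model, speed, latency in rows:
        grouped[model].append((speed, latency))
    return {
        model: (
            round(mean(s for s, _ in values), 2),
            round(mean(l for _, l in values), 2),
        )
        for model, values in grouped.items()
    }


summary = summarize(runs)
for model, (speed, latency) in summary.items():
    print(f"{model}: {speed:.2f} t/s, {latency:.2f}s")
```

Under this assumption the three llama3.1-8B runs average to 14956.18 t/s and 1.60s, which matches the summary row. The Tests column is not reproduced here, since how it is counted is not stated.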