A unified LLM API gateway offering access to multiple AI models with competitive pricing and stable endpoints.

| Model | Speed | Latency | Tests |
|---|---|---|---|
| llama3.1-8B | 38079.61 t/s | 0.47s | 45 |
| gpt-oss-120b | 1659.82 t/s | 0.79s | 30 |
| mercury-2 | 1653.71 t/s | 2.50s | 5 |
| gpt-oss-120b-medium | 255.84 t/s | 1.45s | 5 |
| deepseek-ai/DeepSeek-R1-0528-fast | 240.66 t/s | 1.34s | 5 |
| deepseek-ai/DeepSeek-V3-0324-fast | 210.41 t/s | 2.42s | 5 |
| gemini-2.5-flash-lite | 172.44 t/s | 1.26s | 5 |
| gemini-3-flash-preview | 171.04 t/s | 5.62s | 5 |
| gpt-5.2-codex(low) | 117.88 t/s | 5.21s | 5 |
| openai/gpt-oss-20b:free | 101.53 t/s | 9.16s | 5 |
| zai-org/GLM-4.5-Air | 68.81 t/s | 0.78s | 5 |
| gpt-5.3-codex(low) | 62.69 t/s | 3.79s | 5 |
| gpt-5.2 | 60.05 t/s | 3.33s | 5 |
| claude-opus-4-6-thinking | 41.59 t/s | 2.23s | 5 |
| claude-opus-4-6 | 40.28 t/s | 3.74s | 5 |

| Time | Model | Speed | Latency |
|---|---|---|---|
| Feb 26, 01:29 PM | mercury-2 | 1653.71 t/s | 2.50s |
| Feb 21, 02:09 PM | llama3.1-8B | 51034.53 t/s | 0.36s |
| Feb 21, 02:09 PM | llama3.1-8B | 36452.89 t/s | 0.36s |
| Feb 21, 02:09 PM | llama3.1-8B | 39265.09 t/s | 0.37s |
| Feb 21, 02:09 PM | llama3.1-8B | 23023.86 t/s | 0.48s |
| Feb 21, 02:00 PM | llama3.1-8B | 49889.56 t/s | 0.58s |
| Feb 21, 02:00 PM | llama3.1-8B | 28210.25 t/s | 0.40s |
| Feb 21, 02:00 PM | llama3.1-8B | 38943.62 t/s | 0.49s |
| Feb 21, 02:00 PM | llama3.1-8B | 38545.90 t/s | 0.53s |
| Feb 21, 01:55 PM | llama3.1-8B | 37350.79 t/s | 0.60s |
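
The per-model speed in the first table appears consistent with a plain arithmetic mean of individual runs: the nine llama3.1-8B runs listed above average to the 38079.61 t/s shown for that model. The sketch below checks that arithmetic. It rests on assumptions: that "t/s" means tokens per second, and that the gateway aggregates by simple mean, which the page does not state (the first table reports 45 tests for this model, while only nine runs are listed here). The names `llama_speeds` and `summarize` are hypothetical and used only for illustration.

```python
from statistics import mean

# Per-run speeds (t/s) for llama3.1-8B, copied from the per-run table above.
llama_speeds = [
    51034.53, 36452.89, 39265.09, 23023.86, 49889.56,
    28210.25, 38943.62, 38545.90, 37350.79,
]


def summarize(speeds):
    """Return the arithmetic mean of per-run speeds, rounded to 2 decimals.

    Assumption: the summary table uses a simple mean; this is not confirmed
    by the source.
    """
    return round(mean(speeds), 2)


if __name__ == "__main__":
    # Compare the result with the llama3.1-8B row of the summary table.
    print(summarize(llama_speeds))
```

The spread across these runs is wide (roughly 23,000 to 51,000 t/s), so a mean over a handful of runs is better read as a rough indication than as a stable throughput figure.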