Provides unified API access to more than 300 AI models from multiple providers, including OpenAI, Anthropic (Claude), and Google (Gemini).

| Model | Speed | Latency | Tests |
|---|---|---|---|
| BLOOMZ-7B | 875.69 t/s | 2.46s | 5 |
| gemini-1.5-flash-8b | 275.08 t/s | 1.28s | 10 |
| gemini-1.5-flash-latest | 252.88 t/s | 1.26s | 10 |
| gemini-1.5-flash-002 | 239.65 t/s | 1.62s | 5 |
| gemini-2.0-flash-thinking-exp-01-21 | 230.82 t/s | 7.47s | 5 |
| gemini-2.0-flash-lite-preview-02-05 | 188.38 t/s | 1.06s | 15 |
| gemini-2.0-flash | 177.30 t/s | 0.98s | 20 |
| gemini-1.5-flash | 158.26 t/s | 0.91s | 5 |
| o3-mini | 148.38 t/s | 7.42s | 15 |
| gpt-4o | 135.41 t/s | 1.43s | 5 |
| gemini-2.0-flash-exp | 125.02 t/s | 1.40s | 5 |
| qwen-72b | 102.22 t/s | 3.11s | 5 |
| qwen2.5-72b-instruct | 92.64 t/s | 2.57s | 10 |
| gpt-4o-mini | 78.22 t/s | 1.05s | 5 |
| o3-mini-all | 74.59 t/s | 5.09s | 10 |
| Phi-4 | 43.07 t/s | 1.32s | 10 |
| claude-3-5-sonnet-20241022 | 39.36 t/s | 4.75s | 5 |
| qwen2.5-7b-instruct | 34.15 t/s | 1.01s | 5 |

| Time | Model | Speed | Latency |
|---|---|---|---|
| Apr 6, 11:36 AM | gemini-2.0-flash | 193.77 t/s | 0.86s |
| Apr 6, 11:36 AM | gemini-2.0-flash | 198.39 t/s | 0.92s |
| Apr 6, 11:35 AM | gemini-2.0-flash | 191.79 t/s | 1.06s |
| Feb 20, 08:16 AM | gemini-1.5-flash-latest | 174.31 t/s | 1.21s |
| Feb 20, 07:49 AM | gemini-1.5-flash-latest | 331.45 t/s | 1.32s |
| Feb 20, 07:45 AM | Phi-4 | 41.62 t/s | 0.90s |
| Feb 20, 07:44 AM | glm-4-flash | 25.16 t/s | 1.30s |
| Feb 14, 04:31 PM | qwen-72b | 102.22 t/s | 3.11s |
| Feb 14, 04:10 PM | gemini-2.0-flash-lite-preview-02-05 | 175.81 t/s | 0.87s |
| Feb 14, 04:08 PM | qwen2.5-72b-instruct | 96.05 t/s | 2.68s |
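
The per-model table can be related to the individual runs with a small aggregation. The following is a minimal sketch, assuming the per-model figures are plain arithmetic means of individual runs; the page does not state how they are computed, and the `runs` list and `summarize` helper are hypothetical names introduced only for illustration. The numbers are copied from the run log above.

```python
from collections import defaultdict
from statistics import mean

# Individual runs copied from the run log: (model, speed in tokens/s, latency in s).
runs = [
    ("gemini-2.0-flash", 193.77, 0.86),
    ("gemini-2.0-flash", 198.39, 0.92),
    ("gemini-2.0-flash", 191.79, 1.06),
    ("gemini-1.5-flash-latest", 174.31, 1.21),
    ("gemini-1.5-flash-latest", 331.45, 1.32),
]


def summarize(runs):
    """Group runs by model; return {model: (mean speed, mean latency, run count)}.

    Assumption: a simple unweighted mean per model.
    """
    grouped = defaultdict(list)
    for model, speed, latency in runs:
        grouped[model].append((speed, latency))

    summary = {}
    for model, samples in grouped.items():
        speeds = [speed for speed, _ in samples]
        latencies = [latency for _, latency in samples]
        summary[model] = (mean(speeds), mean(latencies), len(samples))
    return summary


summary = summarize(runs)

# Print models from fastest to slowest mean speed.
for model, (speed, latency, count) in sorted(
    summary.items(), key=lambda item: item[1][0], reverse=True
):
    print(f"{model}: {speed:.2f} t/s, {latency:.2f}s, {count} runs")
```

Under this assumption, the two listed `gemini-1.5-flash-latest` runs average to 252.88 t/s, which agrees with that model's row in the per-model table. The three listed `gemini-2.0-flash` runs average higher than its per-model figure, which is expected if that figure also covers runs not shown in the log.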