A unified API gateway providing access to multiple large language models with direct connectivity in China.

| Model | Speed | Latency | Tests |
|---|---|---|---|
| mistral-small-latest | 576.27 t/s | 1.51s | 5 |
| gemini-2.5-flash-lite-preview-06-17 | 381.52 t/s | 0.84s | 5 |
| gemini-2.5-flash-lite-nothinking | 318.15 t/s | 0.99s | 10 |
| gpt-5-nano | 238.85 t/s | 7.16s | 10 |
| gemini-2.5-pro-thinking | 165.57 t/s | 4.10s | 5 |
| gemini-2.5-pro | 101.10 t/s | 17.38s | 5 |
| grok-4-fast | 84.15 t/s | 4.01s | 5 |
| qwq-plus-latest | 43.91 t/s | 19.49s | 5 |
| qwen3-coder-480b-a35b-instruct | 38.92 t/s | 0.96s | 15 |
| gpt-4o-mini | 38.25 t/s | 2.89s | 15 |
| claude-sonnet-4-5-20250929 | 37.74 t/s | 4.77s | 5 |
| gemini-2.5-pro-exp-03-25 | 30.32 t/s | 4.01s | 20 |
| deepseek-reasoner | 26.79 t/s | 23.92s | 5 |
| deepseek-v3-0324 | 22.64 t/s | 1.52s | 10 |
| qwen-max-latest | 9.20 t/s | 0.97s | 10 |

| Time | Model | Speed | Latency |
|---|---|---|---|
| Nov 1, 01:31 AM | gemini-2.5-flash-lite-nothinking | 295.30 t/s | 0.99s |
| Nov 1, 01:30 AM | gemini-2.5-flash-lite-nothinking | 341.00 t/s | 0.99s |
| Nov 1, 01:28 AM | mistral-small-latest | 576.27 t/s | 1.51s |
| Nov 1, 01:27 AM | grok-4-fast | 84.15 t/s | 4.01s |
| Nov 1, 01:26 AM | gpt-5-nano | 213.51 t/s | 6.78s |
| Nov 1, 01:24 AM | gpt-5-nano | 264.19 t/s | 7.53s |
| Oct 1, 09:06 AM | claude-sonnet-4-5-20250929 | 37.74 t/s | 4.77s |
| Aug 6, 03:41 PM | gemini-2.5-pro | 101.10 t/s | 17.38s |
| Aug 6, 03:41 PM | gemini-2.5-pro-thinking | 165.57 t/s | 4.10s |
| Aug 6, 03:39 PM | qwen3-coder-480b-a35b-instruct | 59.63 t/s | 1.62s |
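For the models that have two runs in the log, the summary figures look like plain per-model averages of those runs. The sketch below checks that reading for two models, using rows copied from the log; the `runs` list and the `summarize` helper are illustrative names, not part of the gateway. The log shown here appears to be partial (the single qwen3-coder-480b-a35b-instruct run does not match its summary row), so this is an assumption about how the summary is built, not a description of the site's actual method.

```python
from collections import defaultdict
from statistics import mean

# Rows copied from the per-run log: (model, speed in tokens/s, latency in s).
runs = [
    ("gemini-2.5-flash-lite-nothinking", 295.30, 0.99),
    ("gemini-2.5-flash-lite-nothinking", 341.00, 0.99),
    ("gpt-5-nano", 213.51, 6.78),
    ("gpt-5-nano", 264.19, 7.53),
]

def summarize(rows):
    """Group runs by model and return {model: (mean speed, mean latency)}."""
    grouped = defaultdict(list)
    for model, speed, latency in rows:
        grouped[model].append((speed, latency))
    return {
        model: (mean(s for s, _ in vals), mean(l for _, l in vals))
        for model, vals in grouped.items()
    }

summary = summarize(runs)

# Print fastest model first, mirroring the ordering of the summary table.
for model, (speed, latency) in sorted(summary.items(), key=lambda kv: -kv[1][0]):
    print(f"{model}: {speed:.2f} t/s, {latency:.2f}s")
```

The averaged speeds agree with the summary rows for both models (318.15 t/s and 238.85 t/s), and the averaged latencies agree to within rounding.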