Hugging Face Router provides intelligent model routing across Inference Providers, offering OpenAI-compatible API access to open-source models.
Hugging Face offers 8 LLM API models.
Speed benchmark average: 138 tok/s.

router.huggingface.cohuggingface.coRankings are based on community-submitted tests and periodic health probes. Advisory only, not official data.
| Model | Speed | Latency | Tests |
|---|---|---|---|
108.96 tok/s | 7.99s | 5 | |
416.19 tok/s | 0.26s | 5 | |
| Time | Model | Speed | Latency |
|---|---|---|---|
| Apr 1, 06:16 AM | Qwen/Qwen3.5-9B | 108.96 tok/s | 7.99s |
| Jan 14, 02:25 AM | meta-llama/Llama-3.3-70B-Instruct | 416.19 tok/s | 0.26s |
| Aug 13, 03:29 PM | moonshotai/Kimi-K2-Instruct:novita | 48.81 tok/s | 1.21s |
| Aug 13, 03:26 PM | Qwen/Qwen3-Coder-480B-A35B-Instruct:novita | 63.59 tok/s | 1.06s |
| Aug 13, 02:45 PM | openai/gpt-oss-20b:novita | 155.67 tok/s | 3.56s |
| Aug 13, 02:41 PM | openai/gpt-oss-120b:novita | 240.32 tok/s | 1.26s |
| Aug 13, 02:35 PM | zai-org/GLM-4.5:novita | 35.36 tok/s | 1.39s |
| Aug 13, 03:46 AM | Qwen/Qwen3-235B-A22B:novita | 34.12 tok/s | 1.12s |
48.81 tok/s |
1.21s |
| 5 |
63.59 tok/s | 1.06s | 5 |
155.67 tok/s | 3.56s | 5 |
240.32 tok/s | 1.26s | 5 |
35.36 tok/s | 1.39s | 5 |
34.12 tok/s | 1.12s | 5 |