This provider offers AI inference and training APIs that run on Cerebras hardware for large-scale model deployment.

Benchmark summary by model:

| Model | Speed | Latency | Tests |
|---|---|---|---|
| llama3.1-8b | 2142.09 t/s | 0.19s | 5 |
| gpt-oss-120b | 1920.13 t/s | 0.54s | 5 |
| llama-3.3-70b | 1532.55 t/s | 0.25s | 5 |
| qwen-3-235b-a22b-instruct-2507 | 851.89 t/s | 12.09s | 10 |
| zai-glm-4.7 | 454.25 t/s | 3.57s | 5 |

Individual test runs, most recent first:

| Time | Model | Speed | Latency |
|---|---|---|---|
| Jan 13, 04:32 PM | zai-glm-4.7 | 454.25 t/s | 3.57s |
| Dec 25, 02:21 PM | qwen-3-235b-a22b-instruct-2507 | 848.10 t/s | 12.10s |
| Dec 25, 02:12 PM | qwen-3-235b-a22b-instruct-2507 | 855.69 t/s | 12.09s |
| Dec 25, 02:06 PM | llama3.1-8b | 2142.09 t/s | 0.19s |
| Dec 25, 02:06 PM | gpt-oss-120b | 1920.13 t/s | 0.54s |
| Dec 25, 02:02 PM | llama-3.3-70b | 1532.55 t/s | 0.25s |
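
The per-model summary figures appear to be averages of the individual runs. For qwen-3-235b-a22b-instruct-2507, the mean of 848.10 and 855.69 t/s is 851.895, which agrees with the listed 851.89 t/s to within rounding. The page does not state how the summary is computed, so the following is only a minimal sketch that assumes a plain arithmetic mean. The names `RUNS` and `summarize` are illustrative, and the `Tests` column is not modeled here.

```python
from collections import defaultdict
from statistics import mean

# Run log copied from the table above: (model, speed in tokens/s, latency in seconds).
RUNS = [
    ("zai-glm-4.7", 454.25, 3.57),
    ("qwen-3-235b-a22b-instruct-2507", 848.10, 12.10),
    ("qwen-3-235b-a22b-instruct-2507", 855.69, 12.09),
    ("llama3.1-8b", 2142.09, 0.19),
    ("gpt-oss-120b", 1920.13, 0.54),
    ("llama-3.3-70b", 1532.55, 0.25),
]


def summarize(runs):
    """Return {model: (mean speed, mean latency, run count)}, fastest model first.

    Assumption: the summary is a simple arithmetic mean over runs.
    """
    grouped = defaultdict(list)
    for model, speed, latency in runs:
        grouped[model].append((speed, latency))

    summary = {
        model: (
            mean(speed for speed, _ in values),
            mean(latency for _, latency in values),
            len(values),
        )
        for model, values in grouped.items()
    }
    # Sort by mean speed, descending, to mirror the ordering of the summary table.
    return dict(sorted(summary.items(), key=lambda item: item[1][0], reverse=True))


if __name__ == "__main__":
    for model, (speed, latency, count) in summarize(RUNS).items():
        print(f"{model}: {speed:.2f} t/s, {latency:.2f}s over {count} run(s)")
```

Sorting by mean speed reproduces the model order of the summary table, which suggests that table is ranked by throughput.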