llama3-1
Llama 3.1 is available through 15 API providers on LMSpeed. API pricing ranges from $0.0000 to $75.00 per million input tokens across providers, and 1 provider offers free API access. In speed benchmarks, the fastest provider reaches 1956 tok/s.
Compare Llama 3.1 API pricing across 14 providers. Input prices range from $0.0000 to $75.00 per million input tokens. 91VIP offers the lowest rate at $0.0000/M.
| Provider | Model Variant | Input ($/M) | Output ($/M) | Speed (t/s) |
|---|---|---|---|---|
| 玄黄 | llama3.1-8b | Free | Free | — |
| 91VIP | llama3.1-8b | $0.0000 | $0.0001 | 1910.8 t/s |
| Futureppo | llama3.1-8b | $0.0000 | $0.0001 | 1956.2 t/s |
| 素墨API | llama3.1-8b | $0.010 | $0.010 | 672.0 t/s |
| uglycat | llama3.1-8b | $0.010 | $0.010 | — |
| 人人 API | llama3.1-8B | $0.010 | $0.010 | — |
| — | llama3.1-8b | $0.100 | $0.100 | — |
| SMLC666 API | llama3.1-8b | $0.100 | $0.100 | — |
| Seamee API | llama3.1-8b | $0.100 | $0.100 | — |
| Synapse | llama3.1-8b | $1.00 | $1.00 | — |
| LLM API | llama3-1-405b | $20.00 | $20.00 | — |
| — | llama3.1-8B | $40.00 | $40.00 | 988.0 t/s |
| — | llama3.1-8b | $75.00 | $75.00 | — |
| — | llama3.1-8b | $75.00 | $75.00 | — |
| Chlink API | llama3.1-8B | $75.00 | $75.00 | — |
Pricing data from provider public APIs
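To make the per-million prices in the table concrete, here is a minimal sketch of how a single request's cost could be estimated. The function name `request_cost_usd` and the token counts are illustrative assumptions, not part of LMSpeed or any provider API; actual billing may differ by provider (rounding, minimum charges, and so on).

```python
def request_cost_usd(input_tokens, output_tokens,
                     input_price_per_m, output_price_per_m):
    """Estimate the cost in USD of one request.

    Prices are in dollars per million tokens, as listed in the table.
    Hypothetical helper: assumes simple linear per-token billing.
    """
    total = (input_tokens * input_price_per_m
             + output_tokens * output_price_per_m)
    return total / 1_000_000


# Example: 2,000 input tokens and 500 output tokens at the
# $1.00 / $1.00 per-million rates listed for Synapse.
cost = request_cost_usd(2_000, 500, 1.00, 1.00)
print(f"{cost:.6f}")  # prints 0.002500
```

At the $75.00/M rates at the top of the range, the same request would cost 75 times as much, so the spread between providers matters even for small workloads.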
Llama 3.1 is free to use through 1 provider with no per-token charges. This provider offers free API credits or a free tier:
| Provider | Speed (t/s) |
|---|---|
| 玄黄 | — |
Compare speed and latency performance across all API providers.
| Provider | Model | Speed | Latency | Tests |
|---|---|---|---|---|
| Futureppo | llama3.1-8b | 1956.17 tok/s | 0.38s | 10 |
| 91VIP | llama3.1-8b | 1910.78 tok/s | 0.43s | 5 |
| — | llama3.1-8b | 1829.04 tok/s | 0.58s | 10 |
| — | llama3.1-8b | 1629.33 tok/s | 0.36s | 25 |
| 素墨API | llama3.1-8B | 1421.44 tok/s | 0.94s | 10 |
| 素墨API | 快速/llama3.1-8B | 1140.96 tok/s | 1.24s | 15 |
| — | llama3.1-8B | 987.95 tok/s | 0.45s | 106 |
| Koru API | llama3.1-8B | 714.17 tok/s | 1.60s | 15 |
| 素墨API | llama3.1-8b | 672.05 tok/s | 1.17s | 15 |
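
The speed and latency figures can be combined into a rough end-to-end estimate. The sketch below assumes that latency means time to first token and that speed is a sustained generation rate; the page does not define either precisely, so treat the result as an approximation. The helper name `estimated_seconds` is hypothetical.

```python
def estimated_seconds(output_tokens, speed_tok_per_s, latency_s):
    """Approximate wall-clock time for one response.

    Assumption: total time = first-token latency + tokens / speed.
    """
    if speed_tok_per_s <= 0:
        raise ValueError("speed must be positive")
    return latency_s + output_tokens / speed_tok_per_s


# 1,000 output tokens using two rows from the table above.
fast = estimated_seconds(1_000, 1956.17, 0.38)  # Futureppo
slow = estimated_seconds(1_000, 714.17, 1.60)   # Koru API
print(f"{fast:.2f}")  # prints 0.89
print(f"{slow:.2f}")  # prints 3.00
```

For short responses the latency term dominates, so a provider with lower throughput but faster first token may still respond sooner.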
Latest benchmark results measuring API response speed and first-token latency.
| Time | Model | Speed | Latency |
|---|---|---|---|
| 04/03/2026, 14:47 | llama3.1-8b | 2556.19 tok/s | 0.55s |
| 04/03/2026, 14:47 | llama3.1-8b | 2347.73 tok/s | 0.59s |
| 04/03/2026, 14:47 | llama3.1-8b | 2548.74 tok/s | 0.56s |
| 04/03/2026, 14:47 | llama3.1-8b | 212.86 tok/s | 0.60s |
| 04/03/2026, 14:47 | llama3.1-8b | 2384.06 tok/s | 0.54s |
| 04/03/2026, 14:47 | llama3.1-8b | 1557.57 tok/s | 0.66s |
| 04/03/2026, 14:47 | llama3.1-8b | 2465.36 tok/s | 0.62s |
| 04/03/2026, 14:47 | llama3.1-8b | 2383.83 tok/s | 0.57s |
| 04/03/2026, 14:47 | llama3.1-8b | 154.51 tok/s | 0.59s |
| 04/03/2026, 14:47 | llama3.1-8b | 1679.53 tok/s | 0.53s |
| 03/30/2026, 18:50 | llama3.1-8B | 563.20 tok/s | 0.80s |
| 03/30/2026, 18:50 | llama3.1-8B | 1137.98 tok/s | 0.50s |
| 03/30/2026, 18:50 | llama3.1-8B | 1165.71 tok/s | 0.50s |
| 03/30/2026, 18:50 | llama3.1-8B | 606.41 tok/s | 0.49s |
| 03/30/2026, 18:50 | llama3.1-8B | 867.77 tok/s | 0.49s |
| 03/30/2026, 18:49 | llama3.1-8B | 83.02 tok/s | 6.16s |
| 03/30/2026, 18:49 | llama3.1-8B | 59.68 tok/s | 10.72s |
| 03/30/2026, 18:49 | llama3.1-8B | 1201.22 tok/s | 0.50s |
| 03/30/2026, 18:49 | llama3.1-8B | 280.86 tok/s | 0.47s |
| 03/30/2026, 18:49 | llama3.1-8B | 784.80 tok/s | 0.48s |
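
Individual runs vary widely (for example, 154.51 to 2556.19 tok/s within the same minute), so a single average can hide outliers. As a minimal sketch, the ten 04/03/2026 samples above can be summarized with both mean and median using Python's standard `statistics` module. The mean of these samples comes out to the same value as the 1829.04 tok/s entry in the provider speed table, which suggests, though the page does not state it, that the per-provider figure is a simple mean.

```python
import statistics

# Speed samples (tok/s) copied from the 04/03/2026, 14:47 rows above.
samples = [2556.19, 2347.73, 2548.74, 212.86, 2384.06,
           1557.57, 2465.36, 2383.83, 154.51, 1679.53]

mean_speed = statistics.fmean(samples)
median_speed = statistics.median(samples)

print(f"mean:   {mean_speed:.2f} tok/s")    # prints mean:   1829.04 tok/s
print(f"median: {median_speed:.2f} tok/s")  # prints median: 2365.78 tok/s
```

The median is less sensitive to the two slow runs, so it may better reflect typical throughput than the mean.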
