Llama 3.1

llama-3.1

Avg speed: 54.26 t/s
First token: 1.19s
Total tests: 15
Providers: 2
Variants: 3

Variants

Variant                                  Speed       Latency   Tests
llama-3.1-nemotron-nano-4b-v1.1          59.42 t/s   3.12s     5
nvidia/llama-3.1-nemotron-70b-instruct   52.19 t/s   0.23s     5
meta/llama-3.1-70b-instruct              51.18 t/s   0.23s     5

Provider (listed for two of the three variants): integrate.api.nvidia.com

Recent test records

15 records

Time               Model                                    Speed       Latency
Feb 16, 03:23 AM   llama-3.1-nemotron-nano-4b-v1.1          61.80 t/s   3.89s
Feb 16, 03:23 AM   llama-3.1-nemotron-nano-4b-v1.1          69.04 t/s   2.49s
Feb 16, 03:23 AM   llama-3.1-nemotron-nano-4b-v1.1          35.89 t/s   3.54s
Feb 16, 03:23 AM   llama-3.1-nemotron-nano-4b-v1.1          61.56 t/s   2.93s
Feb 16, 03:23 AM   llama-3.1-nemotron-nano-4b-v1.1          68.81 t/s   2.74s
Aug 21, 12:17 AM   nvidia/llama-3.1-nemotron-70b-instruct   52.14 t/s   0.40s
Aug 21, 12:17 AM   nvidia/llama-3.1-nemotron-70b-instruct   52.30 t/s   0.17s
Aug 21, 12:17 AM   nvidia/llama-3.1-nemotron-70b-instruct   52.13 t/s   0.18s
Aug 21, 12:17 AM   nvidia/llama-3.1-nemotron-70b-instruct   52.10 t/s   0.16s
Aug 21, 12:17 AM   nvidia/llama-3.1-nemotron-70b-instruct   52.28 t/s   0.22s
Aug 1, 10:55 AM    meta/llama-3.1-70b-instruct              39.00 t/s   0.40s
Aug 1, 10:55 AM    meta/llama-3.1-70b-instruct              38.29 t/s   0.20s
Aug 1, 10:55 AM    meta/llama-3.1-70b-instruct              53.02 t/s   0.17s
Aug 1, 10:55 AM    meta/llama-3.1-70b-instruct              75.07 t/s   0.18s
Aug 1, 10:55 AM    meta/llama-3.1-70b-instruct              50.52 t/s   0.21s
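
The summary and per-variant figures on this page appear to be plain unweighted means over the 15 test records. The page does not state how they are computed, so that is an assumption. A minimal Python sketch of the arithmetic follows; the names `RECORDS`, `summarize`, and `overall` are hypothetical and are not part of any dashboard API.

```python
from collections import defaultdict
from statistics import mean

# (model, speed in tokens/s, first-token latency in s), copied from the records above.
RECORDS = [
    ("llama-3.1-nemotron-nano-4b-v1.1", 61.80, 3.89),
    ("llama-3.1-nemotron-nano-4b-v1.1", 69.04, 2.49),
    ("llama-3.1-nemotron-nano-4b-v1.1", 35.89, 3.54),
    ("llama-3.1-nemotron-nano-4b-v1.1", 61.56, 2.93),
    ("llama-3.1-nemotron-nano-4b-v1.1", 68.81, 2.74),
    ("nvidia/llama-3.1-nemotron-70b-instruct", 52.14, 0.40),
    ("nvidia/llama-3.1-nemotron-70b-instruct", 52.30, 0.17),
    ("nvidia/llama-3.1-nemotron-70b-instruct", 52.13, 0.18),
    ("nvidia/llama-3.1-nemotron-70b-instruct", 52.10, 0.16),
    ("nvidia/llama-3.1-nemotron-70b-instruct", 52.28, 0.22),
    ("meta/llama-3.1-70b-instruct", 39.00, 0.40),
    ("meta/llama-3.1-70b-instruct", 38.29, 0.20),
    ("meta/llama-3.1-70b-instruct", 53.02, 0.17),
    ("meta/llama-3.1-70b-instruct", 75.07, 0.18),
    ("meta/llama-3.1-70b-instruct", 50.52, 0.21),
]


def summarize(records):
    """Group records by model; return {model: (avg speed, avg latency, test count)}."""
    by_model = defaultdict(list)
    for model, speed, latency in records:
        by_model[model].append((speed, latency))
    return {
        model: (
            round(mean(s for s, _ in rows), 2),
            round(mean(l for _, l in rows), 2),
            len(rows),
        )
        for model, rows in by_model.items()
    }


def overall(records):
    """Unweighted mean speed and latency across all records (assumed method)."""
    return (
        round(mean(s for _, s, _ in records), 2),
        round(mean(l for _, _, l in records), 2),
    )


if __name__ == "__main__":
    for model, (speed, latency, tests) in summarize(RECORDS).items():
        print(f"{model}: {speed} t/s, {latency}s, {tests} tests")
    print("overall:", overall(RECORDS))
```

Because each variant has the same number of tests, the mean over all records equals the mean of the three per-variant averages. With uneven test counts the two would differ, and the page does not say which one it would report.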