Llama 3.3 by Meta is available through 3 API providers on LMSpeed. Compare API pricing from $1.03 to $75.00 per million input tokens across providers. In speed benchmarks, the fastest provider reaches 985 tok/s.
Meta Llama 3.3 is an updated Llama 3 open model with improved instruction following, multilingual support, and efficient inference.
Also known as
GPT Load (Shiho) llama-3.3-70b | 984.90 tok/s | 0.45s | | 20 |
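As a rough illustration of how the two figures in this row combine, the sketch below estimates wall-clock time for one response. It assumes that 0.45s is the delay before the first token and that 984.90 tok/s is a steady output rate; the listing does not label its columns, so both readings are assumptions, and the function name `estimated_response_seconds` is hypothetical.

```python
def estimated_response_seconds(output_tokens: int,
                               throughput_tok_s: float,
                               latency_s: float) -> float:
    """Rough estimate: initial latency plus time to stream the output tokens."""
    if throughput_tok_s <= 0:
        raise ValueError("throughput must be positive")
    return latency_s + output_tokens / throughput_tok_s


# Figures taken from the provider row; the 500-token response size is an
# arbitrary example value, not a number from the listing.
estimate = estimated_response_seconds(500, 984.90, 0.45)
print(f"Estimated time for 500 output tokens: {estimate:.2f} s")
```

Real latency varies with load, prompt length, and network conditions, so treat the result as an order-of-magnitude guide rather than a guarantee.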
Showing 1–1 of 1 providers
gpt-oss
GPT-OSS is an open-source language model offering advanced reasoning, code generation, and multimodal capabilities.
qwen3
Alibaba Qwen3 is the Qwen family's flagship LLM series with dense and MoE variants, seamless thinking/non-thinking modes, and leading open-source performance in math, code, and agent tasks.
gemini-2-5-pro
Google Gemini 2.5 Pro is Google's advanced multimodal model with a 1M-token context window, strong STEM reasoning, and native support for images, audio, and video understanding.
deepseek-v3
DeepSeek V3 is DeepSeek's flagship MoE language model with 671B total parameters, delivering strong performance in reasoning, coding, and multilingual tasks at competitive inference cost.
gemini-2-5-flash
Google Gemini 2.5 Flash is a fast and efficient language model in the Gemini series, optimized for quick responses and high throughput.
deepseek-r1
DeepSeek R1 is a reasoning-focused language model in the DeepSeek series, designed for complex reasoning, problem-solving, and analytical tasks.
Rankings are based on community-submitted tests and periodic health probes. They are advisory only and are not official data.