Nemotron 3 Nano is available through 19 API providers on LMSpeed. API pricing ranges from $0.0050 to $499.50 per million input tokens across providers, and one provider offers free API access. Context window: 256,000 tokens. In speed benchmarks, the fastest provider reaches 269 tok/s.
Compare speed and latency across all API providers.
| Provider | Speed | Latency | Tests |
|---|---|---|---|
| Futureppo nvidia/nemotron-3-nano-30b-a3b | 268.82 tok/s | 0.64s | 4 |
| nvidia/nemotron-3-nano-30b-a3b | 246.87 tok/s | 1.15s | 5 |
Showing 1–2 of 2 providers
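The speed and latency figures in the table above can be combined into a rough end-to-end estimate. The sketch below assumes that latency is a one-time wait before the first token and that speed is a steady output rate afterwards; the page does not state how either metric is measured, so treat this as an approximation. The names `BENCHMARKS`, `estimated_seconds`, and `rank` are hypothetical, and the second row's label is a placeholder because the table does not name that provider.

```python
# Figures copied from the benchmark table above. The second label is a
# placeholder: the table lists no provider name for that row.
BENCHMARKS = [
    {"provider": "Futureppo", "speed_tok_s": 268.82, "latency_s": 0.64},
    {"provider": "(unnamed in table)", "speed_tok_s": 246.87, "latency_s": 1.15},
]


def estimated_seconds(latency_s: float, speed_tok_s: float, output_tokens: int) -> float:
    """Assumed model: wait latency_s once, then stream tokens at speed_tok_s."""
    return latency_s + output_tokens / speed_tok_s


def rank(output_tokens: int) -> list[tuple[str, float]]:
    """Return (provider, estimated seconds) pairs, fastest first."""
    rows = [
        (b["provider"], estimated_seconds(b["latency_s"], b["speed_tok_s"], output_tokens))
        for b in BENCHMARKS
    ]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    for name, seconds in rank(1000):
        print(f"{name}: {seconds:.2f}s for 1000 output tokens")
```

Under this assumption, latency dominates short responses while throughput dominates long ones, so the ordering of providers can depend on the expected output length.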
gpt-oss
GPT-OSS is an open-weight language model family designed for self-hosted inference, research, and cost-efficient alternatives to proprietary GPT-class models.
minimax-m2-5
MiniMax M2.5 is MiniMax's flagship text model for coding and agents, with SOTA-level programming and agentic performance, improved token efficiency, and fast high-TPS API deployment.
deepseek-v3-2
DeepSeek V3.2 is an upgraded V3-series MoE model with stronger reasoning, coding, and math performance, widely available through OpenAI-compatible API relays.
minimax-m2-7
MiniMax M2.7 is a high-tier M2-series model tuned for complex reasoning, long-context dialogue, and production-grade API workloads.
qwen3-5
Alibaba Qwen3.5 is a Qwen3 generation model with improved reasoning, multilingual support, and efficient inference for chat, coding, and agent applications.
step-3-5-flash
Step 3.5 Flash is a fast and efficient language model, optimized for quick responses and high throughput.
Rankings are based on community-submitted tests and periodic health probes. They are advisory only and do not constitute official data.