LMSpeed
© 2026 LMSpeed. All Rights Reserved. Made by Nexmoe with ❤️
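Community QQ group: 1034193296. Relay-station operators are welcome to join and discuss trending AI topics, newapi, openclaw, and more, and to get the latest speed-test updates and feedback support.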

NVIDIA NIM

NVIDIA NIM provides optimized AI model inference APIs for LLMs, vision, and embedding models through NVIDIA cloud infrastructure.

Categories

Relay station

  • GPT-OSS
  • Qwen3 Next Instruct
  • Qwen3 Next Thinking
  • Llama 4 Maverick 128e Instruct
  • DeepSeek R1
  • MiniMax-M2.1
  • Step 3.5 Flash
  • Kimi K2.5
  • MiniMax-M2.5
  • Llama 3.3 Nemotron Super V1.5
  • Gemma 3 It
  • Glm4 7
  • Qwen3 Coder Instruct
  • Llama 3.1 Nemotron Instruct
  • Llama 3.1 Instruct
  • Kimi K2 Instruct
  • Llama 3.1 Nemotron Ultra v1
  • Kimi K2 Thinking
  • Gemma 2 It
  • Qwen3 5
  • DeepSeek R1 Distill Qwen
  • DeepSeek V3.1
  • Llama3 Instruct
  • Mistral Small Instruct
  • Qwen3
  • Glm5
  • Llama 3.1 Swallow Instruct V0.1
  • Dracarys Llama 3 1 Instruct
  • DeepSeek V3.2
  • Llama 3.2 Vision Instruct

NVIDIA NIM offers 42 LLM API models.

Speed benchmark average: 63 tok/s.

NVIDIA NIM is an API aggregator, offering models from multiple vendors.

  • Avg Speed: 63.48 tok/s
  • Latency: 6.53 s
  • Total Tests: 650
  • Models: 42
  • Updated: 4/16/2026
  • Created At: 8/13/2025

API Endpoints

  • Historical / Unverified
    https://www.nvidia.com
  • Historical / Unverified
    https://integrate.api.nvidia.com
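The speed (tok/s) and latency (s) figures on this page come from timing requests against endpoints like those listed above. The page does not say exactly how LMSpeed computes them, so the following is only a minimal sketch under stated assumptions: latency is taken as the time from request start to the first streamed chunk, and speed as the tokens received after the first chunk divided by the time between the first and last chunk. The names `measure_stream` and `StreamStats` are hypothetical, and the function takes an iterable of per-chunk token counts instead of making a real network call.

```python
import time
from typing import Callable, Iterable, NamedTuple


class StreamStats(NamedTuple):
    latency_s: float     # assumed definition: request start -> first chunk
    speed_tok_s: float   # assumed definition: tokens after first chunk / decode time
    total_tokens: int


def measure_stream(
    chunk_token_counts: Iterable[int],
    clock: Callable[[], float] = time.perf_counter,
) -> StreamStats:
    """Time a streamed response; each item is the token count of one chunk."""
    start = clock()
    first_at = None
    last_at = None
    first_tokens = 0
    total = 0
    for n in chunk_token_counts:
        now = clock()  # timestamp taken when the chunk arrives
        if first_at is None:
            first_at = now
            first_tokens = n
        last_at = now
        total += n
    if first_at is None:
        raise ValueError("stream produced no chunks")
    decode_time = last_at - first_at
    decode_tokens = total - first_tokens
    # A single-chunk stream has no measurable decode interval.
    speed = decode_tokens / decode_time if decode_time > 0 else 0.0
    return StreamStats(first_at - start, speed, total)


if __name__ == "__main__":
    # Deterministic demo with a fake clock: start at 0.0 s, chunks at 0.5, 1.0, 1.5 s.
    ticks = iter([0.0, 0.5, 1.0, 1.5])
    print(measure_stream([1, 10, 10], clock=lambda: next(ticks)))
```

In real use, the iterable would be fed by a streaming client and the default `time.perf_counter` clock would be kept. Other definitions, such as dividing total tokens by total elapsed time, are equally plausible and would give different numbers.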

Recent Test Records

Time              Model                                Speed        Latency
Apr 15, 06:51 PM  qwen/qwen3.5-122b-a10b               86.95 tok/s  0.29 s
Apr 15, 05:58 PM  minimaxai/minimax-m2.5               73.00 tok/s  0.97 s
Apr 13, 11:25 AM  qwen/qwen3-coder-480b-a35b-instruct  73.43 tok/s  3.16 s
Apr 11, 02:08 AM  moonshotai/kimi-k2.5                 69.24 tok/s  8.16 s
Apr 11, 02:01 AM  z-ai/glm5                            28.19 tok/s  14.62 s
Apr 10, 12:07 PM  qwen/qwen3-coder-480b-a35b-instruct  52.06 tok/s  1.09 s
Apr 10, 12:06 PM  moonshotai/kimi-k2.5                 72.37 tok/s  8.06 s
Apr 10, 11:58 AM  minimaxai/minimax-m2.5               62.31 tok/s  2.44 s
Apr 8, 05:11 PM   moonshotai/kimi-k2-thinking          30.41 tok/s  20.72 s
Apr 8, 04:12 PM   moonshotai/kimi-k2-thinking          25.84 tok/s  29.77 s
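The ten records above can be aggregated to see how a per-provider average and per-model averages might be derived. This is only an illustrative sketch: the `RECORDS` list and the `summarize` helper are hypothetical names, and a plain arithmetic mean is an assumption about the method. Because it covers only these ten rows, the result will not match the page-wide average over all 650 tests.

```python
from collections import defaultdict
from statistics import mean

# (model, speed in tok/s, latency in s), copied from the records listed above.
RECORDS = [
    ("qwen/qwen3.5-122b-a10b", 86.95, 0.29),
    ("minimaxai/minimax-m2.5", 73.00, 0.97),
    ("qwen/qwen3-coder-480b-a35b-instruct", 73.43, 3.16),
    ("moonshotai/kimi-k2.5", 69.24, 8.16),
    ("z-ai/glm5", 28.19, 14.62),
    ("qwen/qwen3-coder-480b-a35b-instruct", 52.06, 1.09),
    ("moonshotai/kimi-k2.5", 72.37, 8.06),
    ("minimaxai/minimax-m2.5", 62.31, 2.44),
    ("moonshotai/kimi-k2-thinking", 30.41, 20.72),
    ("moonshotai/kimi-k2-thinking", 25.84, 29.77),
]


def summarize(records):
    """Return (overall, per_model), each as (mean speed, mean latency)."""
    speeds = [speed for _, speed, _ in records]
    latencies = [latency for _, _, latency in records]
    overall = (mean(speeds), mean(latencies))

    grouped = defaultdict(list)
    for model, speed, latency in records:
        grouped[model].append((speed, latency))
    per_model = {
        model: (mean([s for s, _ in rows]), mean([l for _, l in rows]))
        for model, rows in grouped.items()
    }
    return overall, per_model


if __name__ == "__main__":
    overall, per_model = summarize(RECORDS)
    print(f"overall: {overall[0]:.2f} tok/s, {overall[1]:.2f} s")
    # Fastest models first.
    for model, (speed, latency) in sorted(per_model.items(), key=lambda kv: -kv[1][0]):
        print(f"{model}: {speed:.2f} tok/s, {latency:.2f} s")
```

Over these ten rows the mean speed works out to about 57.38 tok/s and the mean latency to about 8.93 s.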

Leaderboard Rankings

  • Speed: 201.8 tokens/s (#9 of 100)
  • Latency: 0.19 s (#1 of 100)