LMSpeed

The best API speed test tool

© 2026 LMSpeed. All Rights Reserved. Made by Nexmoe with ❤️

ModelScope

ModelScope provides model inference API access to a wide range of open-source AI models via OpenAI-compatible endpoints.
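Because the endpoints are OpenAI-compatible, a request can be assembled with nothing but the Python standard library. A minimal sketch, assuming the conventional `/v1/chat/completions` path and Bearer-token auth (the API key is a placeholder; the model name is taken from the table on this page):

```python
# Sketch of an OpenAI-compatible chat-completions request to ModelScope's
# inference endpoint. The /v1/chat/completions path and Bearer auth follow
# the de-facto OpenAI convention and are assumptions, not confirmed here.
import json
import urllib.request

BASE_URL = "https://api-inference.modelscope.cn/v1"  # endpoint listed on this page

def build_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Assemble a standard OpenAI-style chat-completions request (not sent)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,  # streaming is what tok/s benchmarks measure
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("Qwen/Qwen3-Next-80B-A3B-Instruct", "Hello!", "YOUR_KEY")
# urllib.request.urlopen(req) would perform the actual network call.
```

The request is only constructed, not sent, so the sketch runs without credentials; swapping in the `openai` SDK with `base_url=BASE_URL` would be the equivalent higher-level route.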

Categories

API relay station (中转站)

  • Qwen: Qwen3 Next Instruct
  • Qwen: Qwen3
  • Qwen: DeepSeek R1 Distill Qwen
  • Qwen: Qwen3 Coder Instruct
  • Minimax: MiniMax-M2.5
  • Qwen: Qwen2.5 Instruct
  • MoonshotAI: Kimi K2.5
  • DeepSeek: DeepSeek V3
  • DeepSeek: DeepSeek R1
  • Qwen: Qwen3 Instruct
  • DeepSeek: DeepSeek V3.2
  • MetaAI: DeepSeek R1 Distill Llama
  • Mistral: Mistral Small Instruct

ModelScope offers 18 LLM API models.

Speed benchmark average: 58 tok/s.

ModelScope is an API aggregator, offering models from multiple vendors.

ModelScope interface preview
  • Avg Speed: 58.21 tok/s
  • Latency: 5.76 s
  • Total Tests: 108
  • Models: 18
  • Updated: 4/16/2026
  • Created At: 12/7/2025

API Endpoints

  • api-inference.modelscope.cn
  • ms-ens-1f4a9445-d0e7.api-inference.modelscope.cn
  • ms-ens-327b7543-e27c.api-inference.modelscope.cn
  • ms-ens-51815792-b22c.api-inference.modelscope.cn
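The two per-model numbers reported below (latency and speed) can be derived from a streamed response: latency is time to first token, and speed is completion tokens divided by generation time. A sketch with illustrative timings (the `StreamStats` helper is hypothetical, not part of any LMSpeed tooling):

```python
# How the two benchmark metrics on this page can be computed from a
# streaming chat-completions response. Timestamps are illustrative.
from dataclasses import dataclass

@dataclass
class StreamStats:
    request_sent_at: float   # epoch seconds when the request went out
    first_token_at: float    # epoch seconds when the first token arrived
    last_token_at: float     # epoch seconds when the stream finished
    completion_tokens: int   # number of tokens generated

    @property
    def latency_s(self) -> float:
        """Time to first token — the 'Latency' column."""
        return self.first_token_at - self.request_sent_at

    @property
    def speed_tok_s(self) -> float:
        """Tokens per second of generation — the 'Speed' column."""
        return self.completion_tokens / (self.last_token_at - self.first_token_at)

# Illustrative run: 0.96 s to first token, then 256 tokens over ~1.6 s.
stats = StreamStats(0.0, 0.96, 2.56, 256)
print(f"{stats.latency_s:.2f} s, {stats.speed_tok_s:.1f} tok/s")
```

Measuring from first token rather than request time keeps the speed figure independent of queueing delay, which is reported separately as latency.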

Supported Models

Vendor | Model | Speed | Latency | Tests
Qwen | Qwen/Qwen3-Next-80B-A3B-Instruct | 158.96 tok/s | 0.96 s | 10
Qwen | Qwen/Qwen3-4B | 126.44 tok/s | 4.27 s | 5
Qwen | Qwen/Qwen3-30B-A3B | 123.38 tok/s | 6.13 s | 5
Qwen | Qwen/Qwen3-8B | 76.12 tok/s | 9.33 s | 5
Qwen | deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | 70.69 tok/s | 13.55 s | 5
Qwen | Qwen/Qwen3-Coder-480B-A35B-Instruct | 61.17 tok/s | 0.94 s | 5
Minimax | MiniMax/MiniMax-M2.5 | 52.75 tok/s | 7.49 s | 5
Qwen | Qwen/QVQ-72B-Preview | 44.92 tok/s | 1.16 s | 8
Qwen | Qwen/Qwen2.5-7B-Instruct | 43.96 tok/s | 0.91 s | 5
MoonshotAI | moonshotai/Kimi-K2.5 | 43.12 tok/s | 1.11 s | 5
DeepSeek | deepseek-ai/DeepSeek-V3 | 35.26 tok/s | 1.77 s | 5
DeepSeek | deepseek-ai/DeepSeek-R1-0528 | 32.22 tok/s | 18.61 s | 10
Qwen | Qwen/Qwen3-235B-A22B-Instruct-2507 | 31.78 tok/s | 3.02 s | 5
DeepSeek | deepseek-ai/DeepSeek-V3.2 | 29.48 tok/s | 1.88 s | 5
OpenAI | kgiser/gpu_gpt_5 | 28.32 tok/s | 1.14 s | 10
MetaAI | deepseek-ai/DeepSeek-R1-Distill-Llama-70B | 27.45 tok/s | 27.33 s | 5
Qwen | TeichAI/Qwen3-30B-A3B-Thinking-2507-Claude-4.5-Sonnet-High-Reasoning-Distill-GGUF | 13.07 tok/s | 2.17 s | 5
Mistral | mistralai/Mistral-Small-Instruct-2409 | 11.68 tok/s | 1.15 s | 5
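The headline 58.21 tok/s average is consistent with a test-count-weighted mean of the per-model speeds listed above. A quick check (the site presumably averages raw per-test samples, so the last digit can differ by rounding):

```python
# Recompute the provider-level average speed as a weighted mean of
# per-model speeds, weighting each model by its number of tests.
rows = [  # (speed tok/s, tests) taken from the Supported Models table
    (158.96, 10), (126.44, 5), (123.38, 5), (76.12, 5), (70.69, 5),
    (61.17, 5), (52.75, 5), (44.92, 8), (43.96, 5), (43.12, 5),
    (35.26, 5), (32.22, 10), (31.78, 5), (29.48, 5), (28.32, 10),
    (27.45, 5), (13.07, 5), (11.68, 5),
]
total_tests = sum(n for _, n in rows)                     # 108, as reported
weighted_avg = sum(s * n for s, n in rows) / total_tests  # ~58.20 tok/s
print(total_tests, round(weighted_avg, 2))
```

An unweighted mean of the 18 model speeds comes out near 56 tok/s, so the weighting by test count is what reproduces the published figure.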

Leaderboard Rankings

Latency: 0.94 s (ranked #29 of 100)