LMSpeed

The best API speed test tool

© 2026 LMSpeed. All Rights Reserved. Made by Nexmoe with ❤️

Model Library

Browse canonical models across providers with performance and coverage highlights.

Visible models: 301
Active models: 301
Providers covered: 555
Model variants: 29913
Showing 1-24 of 301 models

Qwen-72B

Qwen-72B is a 72-billion-parameter language model in Alibaba Cloud's Qwen (Tongyi Qianwen) family, designed for multilingual instruction following, reasoning, and general-purpose text generation.

Input price: From $2.64/M
Avg speed: —
First token: —
Providers: 15 (+5 more)

Qwen3 Instruct

Alibaba Qwen3 Instruct is an instruction-tuned variant in the Qwen series, optimized for following instructions and conversational tasks.

Input price: From $0.010/M
Avg speed: —
First token: —
Providers: 7

DeepSeek V4 Pro

DeepSeek V4 Pro is a large language model in the DeepSeek series, offering advanced reasoning, code generation, and multimodal capabilities.

Input price (+9 free): From $0.0007/M
Avg speed: 43 t/s
First token: 8.51s
Providers: 176 (+129 more)
Jun 1

DeepSeek V4 Flash

DeepSeek V4 Flash is a fast, cost-efficient language model in the DeepSeek V4 family, optimized for low-latency chat, coding assistance, and high-throughput API workloads while retaining strong reasoning quality.

Input price (+9 free): From $0.0007/M
Avg speed: 71 t/s
First token: 5.89s
Providers: 179 (+131 more)
Jun 1

MiMo-V2.5-TTS-VoiceDesign

Xiaomi MiMo-V2.5-TTS-VoiceDesign is the voice-design variant of MiMo-V2.5-TTS on the Xiaomi MiMo API platform, enabling custom voice creation through stylistic prompts. Pricing: free during the limited-time launch period (0x token consumption).

Input price (+1 free): From $0.0071/M
Avg speed: —
First token: —
Providers: 38 (+19 more)

MiMo-V2.5-TTS-VoiceClone

Xiaomi MiMo-V2.5-TTS-VoiceClone is the voice-cloning variant of MiMo-V2.5-TTS on the Xiaomi MiMo API platform, enabling speech synthesis with cloned target voices. Pricing: free during the limited-time launch period (0x token consumption).

Input price (+1 free): From $0.0071/M
Avg speed: —
First token: —
Providers: 41 (+22 more)

MiMo-V2.5-TTS

Xiaomi MiMo-V2.5-TTS is the text-to-speech model in the V2.5 series on the Xiaomi MiMo API platform, providing high-quality speech synthesis. Pricing: free during the limited-time launch period (0x token consumption).

Input price (+1 free): From $0.0071/M
Avg speed: —
First token: —
Providers: 41 (+22 more)

MiMo-V2.5

Xiaomi MiMo-V2.5 is a native omnimodal sparse MoE model (310B total, 15B active) with unified text, image, video, and audio understanding, built on the MiMo-V2-Flash backbone with dedicated vision and audio encoders. It supports up to 1M tokens of context, strong agentic workflows, and open weights on Hugging Face.

Input price (+1 free): From $0.0096/M
Avg speed: 85 t/s
First token: 3.14s
Providers: 88 (+64 more)
Jun 2

Qwen3 Instant

Alibaba Qwen3 Instant is a fast and efficient language model in the Qwen series, optimized for quick responses and high throughput.

Input price: From $0.010/M
Avg speed: —
First token: —
Providers: 3

MiMo-V2.5-Pro

Xiaomi MiMo-V2.5-Pro is a large open-source language model in the MiMo series, offering advanced reasoning and general-purpose capabilities.

Input price (+1 free): From $0.0000/M
Avg speed: 48 t/s
First token: 5.35s
Providers: 91 (+68 more)
Jun 2

MiniMax M2 Her

MiniMax M2 Her is a language model in the MiniMax series, offering general-purpose reasoning, dialogue, and text generation capabilities.

Input price: From $1.03/M
Avg speed: —
First token: —
Providers: 4

MiMo-V2-TTS

Xiaomi MiMo-V2-TTS is a text-to-speech model in the MiMo series, optimized for natural speech synthesis and voice generation tasks.

Input price (+4 free): From $0.0071/M
Avg speed: —
First token: —
Providers: 38 (+20 more)

Claude Opus 4.7 Max

Anthropic Claude Opus 4.7 Max is a high-capability language model in the Claude series, offering enhanced reasoning, code generation, and multimodal capabilities.

Input price: From $0.0014/M
Avg speed: —
First token: —
Providers: 24 (+12 more)

Claude Opus 4.7

Anthropic Claude Opus 4.7 targets frontier-level analysis, complex coding, and autonomous workflows that require deep multi-step reasoning.

Input price (+1 free): From $0.0014/M
Avg speed: 43 t/s
First token: 3.86s
Providers: 173 (+126 more)
May 27

DeepSeek V4

DeepSeek V4 is a large language model in the DeepSeek series, offering advanced reasoning, code generation, and multimodal capabilities.

Input price: From $0.050/M
Avg speed: 47 t/s
First token: 2.49s
Providers: 12 (+5 more)
Apr 20

Qwen1.8B Long Context

Alibaba Qwen1.8B Long Context is a compact language model in the Qwen series, optimized for low-latency responses and efficient inference.

Input price: From $0.050/M
Avg speed: —
First token: —
Providers: 16 (+9 more)

Qwen1.8B

Alibaba Qwen1.8B is a compact language model in the Qwen series, optimized for low-latency responses and efficient inference.

Input price: From $0.050/M
Avg speed: —
First token: —
Providers: 16 (+9 more)

Aion RP Llama 3.1

Aion RP Llama 3.1 is a roleplay-tuned variant in the Aion series, optimized for character-driven dialogue and creative writing.

Input price: From $0.401/M
Avg speed: —
First token: —
Providers: 6

Hermes 3 Llama 3.1

Nous Research Hermes 3 is a generalist instruct model fine-tuned on Meta Llama 3.1, with strong reasoning, roleplay, multi-turn chat, tool calling, and structured JSON output.

Input price (+1 free): From $0.0050/M
Avg speed: —
First token: —
Providers: 24 (+12 more)

LFM 2.5 1.2B Instruct

LFM 2.5 1.2B Instruct is a compact language model in the LFM series, optimized for low-latency responses and efficient inference.

Input price (+2 free): From $0.0010/M
Avg speed: —
First token: —
Providers: 20 (+7 more)

Llama 3.1

Meta Llama 3.1 extends the Llama 3 family with stronger reasoning, tool use, and long-context support across 8B to 405B scales.

Input price: From $0.0007/M
Avg speed: —
First token: —
Providers: 27 (+14 more)

Llama 3.3

Meta Llama 3.3 is an updated Llama 3 open model with improved instruction following, multilingual support, and efficient inference.

Input price: From $0.100/M
Avg speed: 985 t/s
First token: 0.45s
Providers: 14 (+1 more)
Dec 25

Meta Llama 3.3 Instruct

Meta Llama 3.3 Instruct is an instruction-tuned variant in the Llama series, optimized for following instructions and conversational tasks.

Input price: From $1.00/M
Avg speed: 48 t/s
First token: 0.93s
Providers: 5

Feb 24

Dracarys Llama 3.1 Instruct

Dracarys Llama 3.1 Instruct is an instruction-tuned variant, optimized for following instructions and conversational tasks.

Input price (+3 free): From $0.010/M
Avg speed: 18 t/s
First token: 0.59s
Providers: 25 (+13 more)
Mar 31