LMSpeed

The best API speed test tool

© 2026 LMSpeed. All rights reserved. Made by Nexmoe with ❤️

Model Library

Browse canonical models across providers with performance and coverage highlights.

Visible models: 301
Active models: 301
Providers covered: 554
Model variants: 29885
Showing 73-96 of 301 models

MiMo-V2-Omni

Xiaomi MiMo-V2-Omni is the omnimodal model in the V2 series on the Xiaomi MiMo API platform, supporting text, image, video, and audio understanding within a unified architecture. Pricing: 1x token consumption (baseline).

Input price (+2 free): from $0.014/M
Avg speed: 83 t/s
First token: 3.43s
Providers: 68

GPT-3.5 Net

OpenAI GPT-3.5 Net is a language model in the GPT-3.5 series, offering general-purpose reasoning, code generation, and multimodal capabilities.

Input price: from $10.27/M
Avg speed: n/a
First token: n/a
Providers: 5

MiniMax M2.5 HighSpeed

MiniMax M2.5 HighSpeed is a fast and efficient language model in the MiniMax series, optimized for quick responses and high throughput.

Input price: from $0.0001/M
Avg speed: 59 t/s
First token: 6.38s
Providers: 34

DeepSeek Prover v2

DeepSeek Prover v2 is a reasoning model in the DeepSeek series, designed for complex reasoning, problem-solving, and analytical tasks.

Input price (+1 free): from $1.00/M
Avg speed: n/a
First token: n/a
Providers: 9

Gemini Pro Vision

Google Gemini Pro Vision is a multimodal vision-language model in the Gemini series, supporting both text and image understanding.

Input price: from $0.274/M
Avg speed: n/a
First token: n/a
Providers: 5

GLM-4V Flash

Zhipu AI GLM-4V Flash is a multimodal vision-language model in the GLM series, supporting both text and image understanding.

Input price (+1 free): from $0.010/M
Avg speed: 56 t/s
First token: 0.62s
Providers: 19

Gemini Live 2.5 Flash

Google Gemini Live 2.5 Flash is a realtime audio model in the Gemini series, supporting low-latency speech and conversational interactions.

Input price: from $1.47/M
Avg speed: n/a
First token: n/a
Providers: 4

Colosseum Instruct

Colosseum Instruct is an instruction-tuned language model, optimized for following instructions and conversational tasks.

Input price: from $0.010/M
Avg speed: n/a
First token: n/a
Providers: 7

Arctic Embed L

Arctic Embed L is an embedding model, designed for generating vector representations of text for retrieval and semantic search.

Input price (+1 free): from $0.010/M
Avg speed: n/a
First token: n/a
Providers: 21

Nova Premier v1

A multimodal vision-language model by Amazon in the Nova series.

Input price: from $5.00/M
Avg speed: n/a
First token: n/a
Providers: 7

GLM-4.1v Thinking FlashX

Zhipu AI GLM-4.1v Thinking FlashX is a reasoning model in the GLM series, designed for complex reasoning, problem-solving, and analytical tasks.

Input price: from $0.016/M
Avg speed: n/a
First token: n/a
Providers: 16

Qwen3.5 Max

Alibaba Qwen3.5 Max is a high-capability language model in the Qwen series, offering enhanced reasoning, code generation, and multimodal capabilities.

Input price (+1 free): from $0.086/M
Avg speed: 31 t/s
First token: 29.33s
Providers: 14

Qwen3.5 Plus Thinking

Alibaba Qwen3.5 Plus Thinking is a reasoning-focused variant in the Qwen series, designed for complex reasoning and problem-solving tasks.

Input price (+1 free): from $0.010/M
Avg speed: n/a
First token: n/a
Providers: 16

Phi 3.5 MoE Instruct

Microsoft Phi 3.5 MoE Instruct is a mixture-of-experts instruction-tuned variant in the Phi series, optimized for following instructions and conversational tasks.

Input price (+2 free): from $0.010/M
Avg speed: n/a
First token: n/a
Providers: 19

GLM-4.5 X

Zhipu AI GLM-4.5 X is a language model in the GLM series, offering general-purpose reasoning, code generation, and multimodal capabilities.

Input price: from $0.163/M
Avg speed: 69 t/s
First token: 12.39s
Providers: 29

Qwen3.5 Flash

A fast and efficient language model by Alibaba in the Qwen 3.5 series.

Input price (+2 free): from $0.010/M
Avg speed: 115 t/s
First token: 7.92s
Providers: 65

Claude 3 Haiku

Anthropic Claude 3 Haiku is a language model in the Claude series, offering general-purpose reasoning, code generation, and multimodal capabilities.

Input price: from $0.021/M
Avg speed: n/a
First token: n/a
Providers: 21

Granite 4.0 H Micro

IBM Granite 4.0 H Micro is a compact language model in the Granite series, optimized for quick responses and high throughput.

Input price: from $0.034/M
Avg speed: n/a
First token: n/a
Providers: 10

Claude Opus 4.6 Max

Anthropic Claude Opus 4.6 Max is a high-capability language model in the Claude series, offering enhanced reasoning, code generation, and multimodal capabilities.

Input price: from $0.0014/M
Avg speed: n/a
First token: n/a
Providers: 21

Italia Instruct

Italia Instruct is an instruction-tuned language model, optimized for following instructions and conversational tasks.

Input price: from $0.010/M
Avg speed: 36 t/s
First token: 0.49s
Providers: 10

MiniMax M2.1 HighSpeed

MiniMax M2.1 HighSpeed is a fast and efficient language model in the MiniMax series, optimized for quick responses and high throughput.

Input price: from $0.575/M
Avg speed: n/a
First token: n/a
Providers: 21

Kimi K2 Turbo

Moonshot AI Kimi K2 Turbo is a fast and efficient language model in the Kimi series, optimized for quick responses and high throughput.

Input price: from $1.10/M
Avg speed: 83 t/s
First token: 3.73s
Providers: 14

GLM-4 Flash

Zhipu AI GLM-4 Flash is a fast and efficient language model in the GLM series, optimized for quick responses and high throughput.

Input price (+7 free): from $0.0000/M
Avg speed: 31 t/s
First token: 0.92s
Providers: 59

DeepSeek Coder Instruct

DeepSeek Coder Instruct is a code-specialized variant in the DeepSeek series, optimized for code generation, debugging, and software development tasks.

Input price (+1 free): from $0.010/M
Avg speed: n/a
First token: n/a
Providers: 10
