Browse canonical models across providers with performance and coverage highlights.
Xiaomi MiMo-V2-Omni is the omnimodal model in the V2 series on the Xiaomi MiMo API platform, supporting text, image, video, and audio understanding within a unified architecture. Pricing: 1x token consumption (baseline).
Input price (+2 free)
From $0.014/M
Avg speed
83 t/s
First token
3.43s
Providers
68
OpenAI GPT-3.5 Net is a language model in the GPT-3.5 series, offering general-purpose reasoning, code generation, and multimodal capabilities.
Input price
From $10.27/M
Avg speed
—
First token
—
Providers
5
MiniMax M2.5 HighSpeed is a fast and efficient language model in the MiniMax series, optimized for quick responses and high throughput.
Input price
From $0.0001/M
Avg speed
59 t/s
First token
6.38s
Providers
34
DeepSeek Prover v2 is a reasoning model in the DeepSeek series, designed for complex reasoning, problem-solving, and analytical tasks.
Input price (+1 free)
From $1.00/M
Avg speed
—
First token
—
Providers
9
Google Gemini Pro Vision is a multimodal vision-language model in the Gemini series, supporting both text and image understanding.
Input price
From $0.274/M
Avg speed
—
First token
—
Providers
5
Zhipu AI GLM-4V Flash is a multimodal vision-language model in the GLM series, supporting both text and image understanding.
Input price (+1 free)
From $0.010/M
Avg speed
56 t/s
First token
0.62s
Providers
19
Google Gemini Live 2.5 Flash is a realtime audio model in the Gemini series, supporting low-latency speech and conversational interactions.
Input price
From $1.47/M
Avg speed
—
First token
—
Providers
4
Colosseum Instruct is an instruction-tuned language model, optimized for following instructions and conversational tasks.
Input price
From $0.010/M
Avg speed
—
First token
—
Providers
7
Arctic Embed L is an embedding model, designed for generating vector representations of text for retrieval and semantic search.
Input price (+1 free)
From $0.010/M
Avg speed
—
First token
—
Providers
21
A multimodal vision-language model by Amazon in the Nova series.
Input price
From $5.00/M
Avg speed
—
First token
—
Providers
7
Zhipu AI GLM-4.1v Thinking FlashX is a reasoning model in the GLM series, designed for complex reasoning, problem-solving, and analytical tasks.
Input price
From $0.016/M
Avg speed
—
First token
—
Providers
16
Alibaba Qwen3.5 Max is a high-capability language model in the Qwen series, offering enhanced reasoning, code generation, and multimodal capabilities.
Input price (+1 free)
From $0.086/M
Avg speed
31 t/s
First token
29.33s
Providers
14
Alibaba Qwen3.5 Plus Thinking is a reasoning-focused variant in the Qwen series, designed for complex reasoning and problem-solving tasks.
Input price (+1 free)
From $0.010/M
Avg speed
—
First token
—
Providers
16
Microsoft Phi 3.5 MoE Instruct is a mixture-of-experts instruction-tuned variant in the Phi series, optimized for following instructions and conversational tasks.
Input price (+2 free)
From $0.010/M
Avg speed
—
First token
—
Providers
19
Zhipu AI GLM-4.5 X is a language model in the GLM series, offering general-purpose reasoning, code generation, and multimodal capabilities.
Input price
From $0.163/M
Avg speed
69 t/s
First token
12.39s
Providers
29
A fast and efficient language model by Alibaba in the Qwen 3.5 series.
Input price (+2 free)
From $0.010/M
Avg speed
115 t/s
First token
7.92s
Providers
65
Anthropic Claude 3 Haiku is a language model in the Claude series, offering general-purpose reasoning, code generation, and multimodal capabilities.
Input price
From $0.021/M
Avg speed
—
First token
—
Providers
21
IBM Granite 4.0 H Micro is a compact language model in the Granite series, optimized for quick responses and high throughput.
Input price
From $0.034/M
Avg speed
—
First token
—
Providers
10
Anthropic Claude Opus 4.6 Max is a high-capability language model in the Claude series, offering enhanced reasoning, code generation, and multimodal capabilities.
Input price
From $0.0014/M
Avg speed
—
First token
—
Providers
21
Italia Instruct is an instruction-tuned language model, optimized for following instructions and conversational tasks.
Input price
From $0.010/M
Avg speed
36 t/s
First token
0.49s
Providers
10
MiniMax M2.1 HighSpeed is a fast and efficient language model in the MiniMax series, optimized for quick responses and high throughput.
Input price
From $0.575/M
Avg speed
—
First token
—
Providers
21
Moonshot AI Kimi K2 Turbo is a fast and efficient language model in the Kimi series, optimized for quick responses and high throughput.
Input price
From $1.10/M
Avg speed
83 t/s
First token
3.73s
Providers
14
Zhipu AI GLM-4 Flash is a fast and efficient language model in the GLM series, optimized for quick responses and high throughput.
Input price (+7 free)
From $0.0000/M
Avg speed
31 t/s
First token
0.92s
Providers
59
DeepSeek Coder Instruct is a code-specialized variant in the DeepSeek series, optimized for code generation, debugging, and software development tasks.
Input price (+1 free)
From $0.010/M
Avg speed
—
First token
—
Providers
10