GLM-4.1v Thinking Flash by Zhipu AI is available through 13 API providers on LMSpeed. Compare API pricing from $0.0002 to $75.00 per million input tokens across providers. In speed benchmarks, the fastest provider reaches 96 tok/s.
Zhipu AI GLM-4.1v Thinking Flash is a reasoning model in the GLM series, designed for complex reasoning, problem-solving, and analytical tasks.
Compare speed and latency across all API providers.
| Provider | Speed | Latency | Tests |
|---|---|---|---|
| 智谱 AI glm-4.1v-thinking-flash | 95.69 tok/s | 7.10s | 5 |
| zhipu/glm-4.1v-thinking-flash | 84.80 tok/s | 7.46s | 250 |
| zhipu/glm-4.1v-thinking-flash | 69.74 tok/s | 9.29s | 5 |
Showing 1–3 of 3 providers
deepseek-v3
DeepSeek V3 is DeepSeek's flagship MoE language model with 671B total parameters, delivering strong performance in reasoning, coding, and multilingual tasks at competitive inference cost.
deepseek-r1
DeepSeek R1 is a reasoning-focused language model in the DeepSeek series, designed for complex reasoning, problem-solving, and analytical tasks.
glm-4-7
Zhipu GLM-4.7 is a flagship GLM release from Zhipu AI with advanced Chinese-English reasoning, coding, and agent features.
gpt-oss
GPT-OSS is an open-weight language model family designed for self-hosted inference, research, and cost-efficient alternatives to proprietary GPT-class models.
qwen3
Alibaba Qwen3 is the Qwen family's flagship LLM series with dense and MoE variants, seamless thinking/non-thinking modes, and leading open-source performance in math, code, and agent tasks.
glm-4-6
Zhipu AI GLM-4.6 builds on GLM-4.5 with a 200K context window, stronger real-world coding, advanced reasoning with tool use, and improved agentic performance for complex multi-step tasks.
Rankings are based on community-submitted tests and periodic health probes. They are advisory only and are not official data.